E-Reverance 5 hours ago | on: MiMo-v2.5-Pro-UltraSpeed: 1T model with 1000 tokens per second

No

> It uses 384 routed experts (top-8) with hybrid attention (full-attention + sliding-window 128 at 6:1 ratio) over 70 layers (1 dense + 69 MoE)

https://recipes.vllm.ai/XiaomiMiMo/MiMo-V2.5-Pro
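
For anyone who wants the quoted numbers laid out, here is a rough Python sketch of what that layer stack could look like. Only the counts (70 layers, 1 dense + 69 MoE, 384 routed experts, top-8, window 128, 6:1) come from the quote. Everything else is my assumption: that 6:1 means six sliding-window layers per full-attention layer, that the full-attention layer comes first in each group of seven, and that the single dense layer is layer 0. `LayerSpec` and `build_layout` are made-up names, not anything from vLLM or the model's config.

```python
from dataclasses import dataclass

# Numbers taken from the quoted description.
NUM_LAYERS = 70
NUM_DENSE = 1
ROUTED_EXPERTS = 384
TOP_K = 8
SLIDING_WINDOW = 128

# Assumption: "6:1" is read as 6 sliding-window layers per full-attention layer.
SWA_PER_FULL = 6


@dataclass(frozen=True)
class LayerSpec:
    """Hypothetical per-layer description, for illustration only."""
    index: int
    ffn: str        # "dense" or "moe"
    attention: str  # "full" or "sliding"


def build_layout() -> list[LayerSpec]:
    """Build one possible layer layout consistent with the quoted counts."""
    period = SWA_PER_FULL + 1  # 7 layers per repeating group
    layers = []
    for i in range(NUM_LAYERS):
        # Assumption: the dense layer is the first one.
        ffn = "dense" if i < NUM_DENSE else "moe"
        # Assumption: each group starts with the full-attention layer.
        attention = "full" if i % period == 0 else "sliding"
        layers.append(LayerSpec(i, ffn, attention))
    return layers


layout = build_layout()
moe_layers = sum(1 for layer in layout if layer.ffn == "moe")
full_layers = sum(1 for layer in layout if layer.attention == "full")
sliding_layers = sum(1 for layer in layout if layer.attention == "sliding")

# Share of routed experts that run for each token in an MoE layer.
active_fraction = TOP_K / ROUTED_EXPERTS

print(f"MoE layers: {moe_layers}, dense layers: {NUM_LAYERS - moe_layers}")
print(f"full-attention layers: {full_layers}, sliding-window layers: {sliding_layers}")
print(f"routed experts active per token: {TOP_K}/{ROUTED_EXPERTS} = {active_fraction:.2%}")
```

Under that reading, 70 layers split evenly into 10 groups of 7, and each token only touches 8 of the 384 routed experts per MoE layer, a bit over 2%.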