
Xiaomi said that by releasing the models under the MIT License, it is allowing commercial deployment, continued training, and fine-tuning without additional authorization. Tulika Sheel, senior vice president at Kadence International, said the MIT License can make the models attractive to enterprises. “It allows enterprises to freely modify, deploy, and commercialize the model without restrictions, which is rare in today’s AI landscape,” Sheel said.
“On ClawEval, V2.5-Pro lands at 64% Pass^3 using only ~70K tokens per trajectory — roughly 40–60% fewer tokens than Claude Opus 4.6, Gemini 3.1 Pro, and GPT-5.4 at comparable capability levels,” Xiaomi said in a blog post.
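To put the quoted token figures in perspective, the short Python sketch below works backward from them. It assumes that "40–60% fewer" is measured against each competitor's own per-trajectory token count; the function name `implied_baseline` is illustrative, and the resulting range is an inference from the quote, not a number Xiaomi reported.

```python
def implied_baseline(tokens_used: float, fraction_fewer: float) -> float:
    """Return the baseline token count implied by a 'fraction fewer' claim.

    If tokens_used is `fraction_fewer` below the baseline, then
    tokens_used = baseline * (1 - fraction_fewer).
    """
    return tokens_used / (1.0 - fraction_fewer)


# Figures taken from the quoted blog post: ~70K tokens, 40-60% fewer.
low = implied_baseline(70_000, 0.40)
high = implied_baseline(70_000, 0.60)

print(f"Implied competitor usage: about {low:,.0f} to {high:,.0f} tokens per trajectory")
```

Under that reading, the comparison models would be using very roughly 117,000 to 175,000 tokens per trajectory on the same benchmark.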
The models use a sparse mixture-of-experts (MoE) design to manage compute costs. The 310-billion-parameter MiMo-V2.5 activates only 15 billion parameters per request, while the 1.02-trillion-parameter Pro version activates 42 billion. Xiaomi said the Pro model’s hybrid attention design can reduce KV-cache storage by a factor of nearly seven during long-context tasks.
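The sparsity these figures imply can be checked with simple arithmetic. The Python sketch below uses only the parameter counts reported above; the dictionary keys and the function name `active_fraction` are illustrative labels, not Xiaomi's terminology.

```python
# Parameter counts in billions, as reported in the article.
MODELS = {
    "MiMo-V2.5": {"total_b": 310, "active_b": 15},
    "MiMo-V2.5 Pro": {"total_b": 1020, "active_b": 42},
}


def active_fraction(total_b: float, active_b: float) -> float:
    """Share of a model's parameters that are activated per request."""
    return active_b / total_b


for name, params in MODELS.items():
    share = active_fraction(**params)
    print(f"{name}: {share:.1%} of parameters active")
```

By this measure, each model activates roughly 4% to 5% of its total parameters at a time, which is how a sparse MoE design keeps per-request compute well below what the headline parameter count suggests.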
