
MiniMax Releases M2 Paper On Agent-Native MoE Training With Self-Debugging


Original post

The MiniMax-M2 paper just dropped. The key focus of M2 is on being more agent-native: it trains on runnable workspaces with artifact-grounded rewards, then uses Forge to scale RL over long coding, app, search, and office-task trajectories.

What's interesting is that M2.7 is starting to debug failed training runs and edit its own agent scaffold.

Other key details: 29.9B MoE with 9.8B active, 256 fine-grained experts, full attention, 192K context, and MTP for speculative decoding.

"we found no variant that reliably matches full attention quality in production settings"
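For anyone who wants the numbers made concrete, here is a rough sketch. It does two things: it works out the active-parameter share from the quoted figures, and it shows the generic greedy draft-and-verify step that speculative decoding relies on. The post doesn't say how M2 wires MTP (multi-token prediction) into decoding, so the function name `accept_draft` and the greedy acceptance rule are my assumptions, not MiniMax's implementation.

```python
# Quoted figures from the post (in billions of parameters).
TOTAL_PARAMS_B = 29.9
ACTIVE_PARAMS_B = 9.8

# Share of the model's weights used per token.
active_fraction = ACTIVE_PARAMS_B / TOTAL_PARAMS_B


def accept_draft(draft_tokens, target_tokens):
    """Generic greedy draft-and-verify step (hypothetical helper, assumed rule).

    draft_tokens:  tokens proposed cheaply (e.g. by MTP heads).
    target_tokens: the full model's greedy choice at each of those positions,
                   plus one extra choice for the position after the draft,
                   so len(target_tokens) == len(draft_tokens) + 1.

    Returns the tokens actually emitted this step: the longest prefix of the
    draft that the full model agrees with, followed by one token from the
    full model (either its correction or its bonus token).
    """
    if len(target_tokens) != len(draft_tokens) + 1:
        raise ValueError("target_tokens must have one more entry than draft_tokens")

    emitted = []
    for drafted, verified in zip(draft_tokens, target_tokens):
        if drafted != verified:
            # First disagreement: take the full model's token and stop.
            emitted.append(verified)
            return emitted
        emitted.append(drafted)

    # Every drafted token was accepted, so the extra verified token is free.
    emitted.append(target_tokens[-1])
    return emitted


if __name__ == "__main__":
    print(f"active share: {active_fraction:.1%}")
    print(accept_draft([5, 7, 9], [5, 7, 2, 4]))
```

So roughly a third of the weights are active per token, and each verification pass emits at least one token and at most one more than the draft length. The output matches what plain greedy decoding with the full model would produce, which is why this kind of scheme speeds things up without changing results.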

8:49 PM · May 26, 2026