9h ago

MiniMax-M2 research paper details a 229.9B-parameter architecture activating 9.8B parameters during inference

The paper also covers post-training systems, including Forge RL and CISPO.


Original post

#263@ANDREW_N_CARROP

RyanLee @RYANLEEMINIMAX

Recently, we took time to consolidate all of the work behind M2 and published it here: our M2 paper on arXiv.

It’s been just over six months since we first open-sourced M2 on December 23 last year. During that time, a number of our ideas and systems have been broadly adopted by the open-source community, including CISPO, Forge RL System, Self-Evolution.

Over the past six months, we’ve felt incredible enthusiasm from the open-source community. Nearly every model release reached the #1 spot on the Hugging Face leaderboard.

Now it’s time for a new chapter. We’re getting ready for M3. MSA paper is on the road.

https://arxiv.org/abs/2605.26494

8:05 PM · May 26, 2026

Reposted by

#1988@DAN_JEFFRIES1

QUOTE POST

#420 Teortaxes▶️ (DeepSeek 推特🐋铁粉 2023 – ∞) @TEORTAXESTEX

Very good practice to release a finished paper for a model *series*, not just for the foundation. Post-training is where most engineering effort happens today.

RyanLee @RyanLeeMiniMax

3:05 AM · May 27, 2026 · 79.4K Views

4:13 AM · May 27, 2026 · 4.2K Views

