MiniMax-M2.7 Open-Weight Model Matches GPT-5 Performance At 440 Tokens Per Second

For the first time, I feel open-weight models are impossible to ignore.

We are at a point where these models are competitive with the best models out there.

MiniMax-M2.7 is the latest beast to come out, and I'm running it at 440+ tokens/s.

230B parameters. It's a beast.

Just for comparison, I've found that Gemma4 31B is good enough for many of the things I do, so imagine what an additional 200B parameters brings here.

Of course, I can't run MiniMax-M2.7 locally, so I use SambaNova.

• Extremely fast inference (probably one of the fastest on the market)
• Extremely cheap (around 5% of what you'd pay for proprietary models)
• MiniMax-M2.7 scores 56.22% on SWE-Pro, 57.0% on Terminal Bench 2, and 76.5% on SWE Multilingual

That puts it in the Opus 4.6/GPT-5.4 league, the difference being that MiniMax is open-weight.

I recorded a real-time video of MiniMax-M2.7 running on SambaNova, so you can get a sense of how fast it is. I'm not using streaming here. The model is consistently running at ~440 tokens/second.
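If you want to reproduce that throughput number yourself, here's a minimal sketch of how it can be measured from a non-streaming response. The base URL, model id, and client usage in the commented section are assumptions for illustration, not confirmed details of this setup:

```python
import time  # used for wall-clock timing in the hypothetical call below

def tokens_per_second(completion_tokens: int, elapsed_s: float) -> float:
    """Throughput = tokens generated / wall-clock time for the request."""
    return completion_tokens / elapsed_s

# Hypothetical non-streaming call against an OpenAI-compatible endpoint.
# The URL and model name here are placeholders, not verified values:
#
# from openai import OpenAI
# client = OpenAI(base_url="https://api.sambanova.ai/v1", api_key="...")
# t0 = time.time()
# resp = client.chat.completions.create(
#     model="MiniMax-M2.7",
#     messages=[{"role": "user", "content": "Explain KV caching."}],
# )
# print(tokens_per_second(resp.usage.completion_tokens, time.time() - t0))

# Example: 2,200 generated tokens in 5 seconds works out to 440 tokens/second.
print(tokens_per_second(2200, 5.0))  # → 440.0
```

Note that with a non-streaming call, time-to-first-token and generation time are lumped together, so this slightly understates the raw decode speed.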

Use this playground to test MiniMax M2.7: https://fandf.co/41uJXzw

Thanks to the team for partnering with me on this post.

6:30 PM · May 15, 2026 · 29.6K Views