Testing 8x B200s with GLM5.2 in FP8 and just hit 150 tok/sec out of the box. This thing is cooking. Open weights will eat the world. 🔥
4:00 PM · Jun 23, 2026 · 10K Views
NVFP4 and Dynamo optimizations pushed speeds past 280 tokens/sec
They got GLM-5.2 running at 280+ tok/s. That is more than 4x faster than what I get on 6000s.
- NVIDIA Dynamo
- custom NVFP4
- improved implementation of MTP
Giving this to Codex, will see if I get a boost.
http://x.com/i/article/2069201301456707584