
Dave Morin, Offline Ventures co-founder, benchmarked open-weights GLM5.2 on eight Nvidia B200 GPUs, hitting 150 tokens per second in FP8

NVFP4 and Dynamo optimizations pushed speeds past 280 tokens/sec


Original post

Testing 8x B200s with GLM5.2 in FP8 and just hit 150 tok/sec out of the box. This thing is cooking. Open weights will eat the world. 🔥

4:00 PM · Jun 23, 2026 · 10K Views


Posts from X


0xSero (@0xSero)

They got GLM-5.2 running at 280+ tok/s this is more than 4x faster than what I get on 6000s

- Nvidia dynamo
- custom NVFP4
- improved implementation of MTP

Giving this to codex, will see if I get a boost

http://x.com/i/article/2069201301456707584
