It's one thing to see peak inference tok/s on @ArtificialAnlys.
It's a completely different thing to see that speed sustained across real-world usage.
@OpenRouter is the best place to see this rn, and... at least for now, @CoreWeave / @wandb is the fastest GLM you can get 👀⚡