Cerebras is running Kimi K2.6, a trillion-parameter model, in enterprise trials at roughly 1,000 output tokens per second, the highest speed Artificial Analysis has recorded for any frontier model.
Benchmarks show 981 output tokens per second at 10,000 input tokens.
I remember when people were saying, "It's useless to open-source big models because nobody will be able to run them fast"...
Cerebras is now running Kimi K2.6 – a trillion parameter model – in enterprise trials. At ~1,000 tokens/s, this is the fastest frontier model performance ever measured by Artificial Analysis @ArtificialAnlys.
Holy guacamole!
The speed of intelligence is accelerating.
TPUs are insane
Gemini 3.5 Flash is running at ~867 tokens/s, almost as fast as Kimi-K2.6 on Cerebras custom chips.

