
DeepSeek V4-Flash API reportedly reaches speeds of 100 tokens per second, up from a 22 t/s baseline

Engineers are debating the underlying optimizations driving the speed gains


Original post

Teortaxes▶️ (DeepSeek 推特🐋铁粉 2023 – ∞) @teortaxesTex

Btw I think V4-Pro has modestly accelerated. V4-Flash is at 100 t/s, and indeed seems more token-effective. This is really nice after what feels like years of DS API at 22 t/s.

4:13 PM · Jun 16, 2026 · 2.3K Views


Posts from X

Most Activity

Views 4.5K · Bookmarks 5 · Likes 54 · Replies 6

Teortaxes▶️ (DeepSeek 推特🐋铁粉 2023 – ∞) @teortaxesTex

Why would DeepSeek get 22-40% faster? Saw up to 110 t/s on Flash, up to 90+ on Pro. Inference optimizations, like at other labs? I would think they've already optimized the hell out of it for RL alone; they had built this architecture for speed. Smaller bs? Or new hardware at last?


kalomaze @kalomaze

@teortaxesTex better speculation?



Teortaxes▶️ (DeepSeek 推特🐋铁粉 2023 – ∞) @teortaxesTex

…on second thought I guess it might be optimization actually

