DeepSeek Underrated as Few Labs Attempt Large-Scale Pretraining

Original post

Teortaxes▶️ (DeepSeek 推特🐋铁粉 2023 – ∞) @teortaxesTex

DeepSeek is underrated still. There are very few labs that have even tried pretraining anything substantially larger than V4. Their struggles with getting it to work, on top of all the inane architecture tricks, make sense. In their situation, OpenAI would still be doing 671B.

4:18 PM · Jun 12, 2026 · 3.7K Views

Posts from X

Teortaxes▶️ (DeepSeek 推特🐋铁粉 2023 – ∞) @teortaxesTex

Kimi: 149% of a derisked DeepSeek architecture
Zhipu: 110% of a derisked DeepSeek architecture
Minimax: 63% of a simpler architecture than DS-V3.2
DeepSeek: 238% of an insane alien murder clown architecture
This takes… conviction

Teortaxes▶️ (DeepSeek 推特🐋铁粉 2023 – ∞) @teortaxesTex

I don't know about Qwen-Max; it shows Alibaba moves on a very aggressive schedule now, and they get credit for doing their own thing and succeeding.
