Skepticism Grows Over Chinese Ability To Train Multi-Trillion Parameter LLMs

Original post

Teortaxes▶️ (DeepSeek 推特🐋铁粉 2023 – ∞)@teortaxesTex

Meta, OpenAI, xAI and Baidu all are known to have trained a >2T model (Behemoth, GPT 4.5, Grok 3/4, ERNIE 5). All have been flawed and eventually got replaced by smaller AND stronger ones. It's not clear to me anyone in China (or outside GDM/Ant) currently knows how to do this.

Teortaxes▶️ (DeepSeek 推特🐋铁粉 2023 – ∞)@teortaxesTex

Heh, I did well baiting Elon to give this prediction. Anyway, Hamish is well-calibrated on the estimate but 1) I doubt any Chinese player will commit to its first Mythos-scale job outside Mainland. The risk of meddling is high. 2) we don't know if they *can* train a multi-T LLM

3:24 PM · Jul 3, 2026 · 2.4K Views

Sentiment

One commenter praises the knowledge of GDM's model, said to exceed 2T parameters, countering doubt that GDM has trained a 2T model well.

Positive: 100.0%
Negative: 0.0%

1 comment with sentiment.

Posts from X

Most Activity

Views: 1.3K · Bookmarks: 2 · Likes: 7

Teortaxes▶️ (DeepSeek 推特🐋铁粉 2023 – ∞)@teortaxesTex

Devastating hit from Dario: Fable is not scale-pilled

Replies: 1

Underground Thinker@Donogzs

@teortaxesTex Are we sure GDM has trained a 2T model well? Leaks seem to imply Gemini 3 pro had the same base as 2.5 pro, which also explains the knowledge cutoff

Teortaxes▶️ (DeepSeek 推特🐋铁粉 2023 – ∞)@teortaxesTex

@Donogzs its knowledge is awesome and yes it's >2T
