DeepSeek-V4 adopts deterministic hash routing for early MoE layers, replacing traditional learned routing

The architecture otherwise inherits its design directly from DeepSeek-V3
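
To make the headline concrete, here is a minimal sketch of what deterministic hash routing can look like in general. It is not DeepSeek's implementation: the function name `hash_route`, the `layer_seed` parameter, and the choice of BLAKE2b as the hash are all assumptions made for illustration. The idea shown is only that the expert assignment depends on the token ID alone, with no learned gate and no trainable parameters.

```python
import hashlib


def hash_route(token_id: int, num_experts: int, top_k: int = 1, layer_seed: int = 0) -> list[int]:
    """Pick top_k distinct experts for a token from its ID alone.

    Hypothetical helper: a fixed hash stands in for the learned router,
    so the mapping never changes during training or inference.
    """
    if not 0 < top_k <= num_experts:
        raise ValueError("top_k must be between 1 and num_experts")

    experts: list[int] = []
    counter = 0
    while len(experts) < top_k:
        # Hash (layer seed, token ID, retry counter) into 8 bytes.
        key = f"{layer_seed}:{token_id}:{counter}".encode()
        digest = hashlib.blake2b(key, digest_size=8).digest()
        expert = int.from_bytes(digest, "big") % num_experts
        # Skip collisions so the chosen experts are distinct.
        if expert not in experts:
            experts.append(expert)
        counter += 1
    return experts
```

Calling `hash_route(token_id=1234, num_experts=8, top_k=2)` returns the same two expert indices on every call, whereas a learned router would compute gate scores from the hidden state and could change its choice as training progresses.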

Original post

Teortaxes▶️ (DeepSeek 推特🐋铁粉 2023 – ∞) @teortaxesTex

that this works okay vs learned routing is indictment enough

Teortaxes▶️ (DeepSeek 推特🐋铁粉 2023 – ∞) @teortaxesTex

I still wish we had something more globally aware than these routers. MoEs are frustrating. What do you mean loss and knowledge scale with total params and "intelligence" with active? Wtf is non-knowledge-based intelligence in an LLM? That's not true humanlike sparsity.

1:52 PM · Jul 4, 2026 · 1.1K Views