2d ago

Sebastian Raschka ranks 28 LLMs by active parameters


Sebastian Raschka posted a table from his LLM Architecture Gallery that ranks 28 large language models by the share of their parameters that is active per token. DeepSeek V4-Pro has the lowest ratio at 3.1 percent, followed by the Kimi K2 variants at 3.2 percent and Qwen3 80B-A3B at 3.8 percent. The table lists active and total parameter counts, model type, attention mechanism, and release date. It gives one comparative view of sparse models but omits KV cache size, routing overhead, context length, and hardware efficiency.
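The ratio behind the ranking is simply active parameters divided by total parameters. A minimal sketch of that arithmetic follows, assuming the Qwen3 80B-A3B sizes can be read off the model name (80B total, 3B active); the other two entries and the names `ModelSize` and `rank_by_sparsity` are hypothetical and are not taken from Raschka's table.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ModelSize:
    name: str
    total_b: float   # total parameters, in billions
    active_b: float  # parameters active per token, in billions

    @property
    def active_ratio(self) -> float:
        """Fraction of all parameters used for each token."""
        return self.active_b / self.total_b


def rank_by_sparsity(models):
    """Sort from lowest to highest active-parameter ratio (sparsest first)."""
    return sorted(models, key=lambda m: m.active_ratio)


models = [
    # Hypothetical dense model: every parameter is active, so the ratio is 1.
    ModelSize("hypothetical dense 8B", total_b=8.0, active_b=8.0),
    # Hypothetical 1T-parameter MoE with 42B active (a 4.2 percent ratio).
    ModelSize("hypothetical 1T MoE", total_b=1000.0, active_b=42.0),
    # Assumption: sizes inferred from the model name, 80B total and 3B active.
    ModelSize("Qwen3 80B-A3B", total_b=80.0, active_b=3.0),
]

for m in rank_by_sparsity(models):
    print(f"{m.name}: {m.active_ratio * 100:.2f}% active")
```

A lower ratio means fewer parameters do work per token, but as the summary notes, the number alone says nothing about KV cache size, routing overhead, or hardware efficiency.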

Original post

Meta observation: DeepSeek is still king of the active-parameter ratio

7:46 AM · May 14, 2026 View on X


The table in HTML format for easier (and non-truncated) viewing: https://sebastianraschka.com/llm-architecture-gallery/active-parameter-ratio/

Sebastian Raschka @rasbt

Meta observation: DeepSeek is still king of the active-parameter ratio

2:46 PM · May 14, 2026 · 43.6K Views
3:37 PM · May 14, 2026 · 5.7K Views

@teortaxesTex good catch, will add

Teortaxes▶️ (DeepSeek 推特🐋铁粉 2023 – ∞) @teortaxesTex

@rasbt Xiaomi 2.5 pro missing (4.2%, 1T)

8:55 PM · May 14, 2026 · 920 Views
9:08 PM · May 14, 2026 · 656 Views

@scaling01 This is a truncated one... but Google was on the bottom of the list

Lisan al Gaib @scaling01

I wouldn't be surprised if Google was at 1-2% active

8:50 PM · May 14, 2026 · 20.2K Views
9:09 PM · May 14, 2026 · 3.3K Views

@scaling01 Likely all the incoming generation of frontier models. Extreme sparsity is economic viability.

Lisan al Gaib @scaling01

I wouldn't be surprised if Google was at 1-2% active

8:50 PM · May 14, 2026 · 20.2K Views
8:57 PM · May 14, 2026 · 533 Views

I wouldn't be surprised if Google was at 1-2% active

Sebastian Raschka @rasbt

Meta observation: DeepSeek is still king of the active-parameter ratio

2:46 PM · May 14, 2026 · 43.6K Views
8:50 PM · May 14, 2026 · 20.2K Views

@rasbt I mean the closed Gemini 3 models

Sebastian Raschka @rasbt

@scaling01 This is a truncated one... but Google was on the bottom of the list

9:09 PM · May 14, 2026 · 3.3K Views
9:24 PM · May 14, 2026 · 739 Views