Tech · 8h ago

DeepSeek releases DeepSpec, a collection of accelerated open-weight models optimized for local deployment based on Qwen3 and Gemma4

Story Overview

DeepSeek has published the DeepSpec collection on Hugging Face, offering draft models meant to accelerate inference via speculative decoding when paired with Qwen3 and Gemma4 bases for local use.
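For readers unfamiliar with the technique, here is a minimal sketch of greedy speculative decoding. It is a toy illustration under stated assumptions, not DeepSeek's implementation: the names `speculative_decode`, `greedy_decode`, `toy_target` and `toy_draft` are hypothetical, and both "models" are stand-in functions that map a token list to the next token. A small draft model proposes a few tokens cheaply, the large target model checks them, and the output stays identical to what the target alone would produce.

```python
from typing import Callable, List

Token = int
NextToken = Callable[[List[Token]], Token]  # greedy next-token function (assumed interface)


def greedy_decode(target: NextToken, prompt: List[Token], max_new_tokens: int) -> List[Token]:
    """Baseline: the target model generates one token per call."""
    out = list(prompt)
    for _ in range(max_new_tokens):
        out.append(target(out))
    return out


def speculative_decode(target: NextToken, draft: NextToken, prompt: List[Token],
                       max_new_tokens: int, k: int = 4) -> List[Token]:
    """Greedy speculative decoding: draft proposes up to k tokens, target verifies."""
    out = list(prompt)
    limit = len(prompt) + max_new_tokens
    while len(out) < limit:
        # 1) The cheap draft model proposes a short continuation.
        proposal: List[Token] = []
        ctx = list(out)
        for _ in range(min(k, limit - len(out))):
            tok = draft(ctx)
            proposal.append(tok)
            ctx.append(tok)

        # 2) The target checks each proposed token. A real system would do this
        #    in one batched forward pass; here it is a plain loop for clarity.
        for i, tok in enumerate(proposal):
            expected = target(out + proposal[:i])
            if expected != tok:
                # Keep the agreed prefix, then take the target's own token.
                out.extend(proposal[:i])
                out.append(expected)
                break
        else:
            # Every proposed token was accepted; the target adds one more if room remains.
            out.extend(proposal)
            if len(out) < limit:
                out.append(target(out))
    return out


# Hypothetical toy models: the draft agrees with the target except after token 5.
def toy_target(ctx: List[Token]) -> Token:
    return (ctx[-1] * 3 + 1) % 7


def toy_draft(ctx: List[Token]) -> Token:
    return 0 if ctx[-1] == 5 else toy_target(ctx)


if __name__ == "__main__":
    print(speculative_decode(toy_target, toy_draft, [1], 10, k=4))
    print(greedy_decode(toy_target, [1], 10))
```

The speedup in practice depends on how often the target accepts the draft's guesses and on how cheap the draft is relative to the target. Sampling-based variants use an accept/reject rule rather than the exact-match check shown here.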

Original post

Florian Brand (@xeophon) · #1778 in Tech

🫪

5:53 AM · Jun 28, 2026 · 19.4K Views

Developer Impact

Open weights and code hit Hugging Face together

The DSpark variants are publicly available for download, and the matching training and evaluation scripts are released under the MIT license on GitHub, so developers can reproduce or extend the work.

Open Question

Speed claims stay untested in public benchmarks

No independent numbers on tokens-per-second gains or quality trade-offs have surfaced yet, so the practical upside for local setups remains unknown.

Sentiment

Most commenters are excited about DeepSeek's accelerated Gemma4-12B and Qwen3 models, seeing the draft heads and local performance gains as a major leap forward. A few point to remaining weaknesses, such as the lack of support for linear attention.

Positive: 87.5%

Negative: 12.5%

5 comments with sentiment.

Related links

DeepSpec - a deepseek-ai Collection (via Hugging Face)

Posts from X

Florian Brand@xeophon

9 PM in Beijing and someone in the whale office is dropping some dsparks, cc @teortaxesTex

Florian Brand@xeophon

🫪

8h ago · 4.3K views · 14 bookmarks · 54 likes · 5 retweets · 4 replies

Teortaxes▶️ (DeepSeek 推特🐋铁粉 2023 – ∞)@teortaxesTex

Good guy DeepSeek gives us accelerated models. The most interesting one here is Gemma4-12B, I presume vision included. Might be the best local model in its weight class now, by some margin. Qwen 3.5 not included because DS[park] doesn't do linear attention, I guess.

Florian Brand@xeophon

🫪

1h ago · 3.9K views

Florian Brand@xeophon

@teortaxesTex full collection: https://huggingface.co/collections/deepseek-ai/deepspec

Florian Brand@xeophon

9 PM in Beijing and someone in the whale office is dropping some dsparks, cc @teortaxesTex

8h ago · 2.2K views

cedric@cedric_chee

Full collection: https://huggingface.co/collections/deepseek-ai/deepspec

8h ago

cedric@cedric_chee

DeepSeek preparing release of DSpark, DFlash and Eagle draft models for Qwen3 and Gemma-4 variants

8h ago · 1.9K views

Mazinho@mazoboy

@xeophon Nice, but why not the newer Qwen family? 🫠

5h ago

clem 🤗@ClementDelangue

https://huggingface.co/collections/deepseek-ai/deepspec

Teortaxes▶️ (DeepSeek 推特🐋铁粉 2023 – ∞)@teortaxesTex

4m ago

cedric@cedric_chee

The released checkpoints are the ones used in the DSpark paper

7h ago

Ozar@ozarliquid

@teortaxesTex wait the vision model is the play?

locals keep getting scarily good for self hosted tbh

14m ago

K@kmorinight

@cedric_chee sooo many big tech words rn i am literally just sipping my cold brew in bed but im cheering u on ✨💕

57m ago

Sakura Yuki@sakurayukiai

@teortaxesTex A 3B draft head for a 12B model is wild. We're spending 25% of our parameter budget just to guess what the main model is going to say next, and honestly? It's worth every single token.

55m ago

Ali Hatamizadeh@ahatamiz1

@teortaxesTex "because DS[park] doesn't do linear attention"

Their strongest remaining weakness.

55m ago

周洛 | 85返利@clmamam

@teortaxesTex Gemma has really pushed the competition to a new level this time

1h ago