1d ago

Antirez releases quantized DeepSeek V4 Flash model on Hugging Face



Antirez published a quantized DeepSeek V4 Flash model on Hugging Face in the repository antirez/deepseek-v4-gguf. The 80.8 GiB file quantizes the routed experts with IQ2_XXS and Q2_K and keeps the other layers in Q8_0, F16, and F32. The model runs inference on a single RTX Pro 6000 GPU, and its on-disk size is comparable to that of gpt-oss-120B. Community observers say the quantized variant already appears to preserve much of its general capability, and they see the release as a chance to test how much knowledge it retains compared with gpt-oss-120B.

Original post

Teortaxes▶️ (DeepSeek 推特🐋铁粉 2023 – ∞) #420 @TEORTAXESTEX

Already reasonably established that it preserves a lot of general capability, interesting to test this on *knowledge* against gpt-oss-120B, as they're actually close in on-disk size.

7:07 AM · May 16, 2026

Cluster engagement

73 snapshots

Reposted by

#713@PMINERVINI

#28@_AKHALIQ
