🎉 SGLang has Day-0 support for DiffusionGemma, a text-diffusion variant of @googlegemma's Gemma 4 (26B A4B MoE), built for blazing low-batch generation speed!
Instead of token-by-token decoding, it denoises blocks of tokens in parallel for much faster generation.
1️⃣ Discrete text diffusion: block-parallel multi-canvas sampling
2️⃣ Multimodal: text, image & video in, text out
3️⃣ Sparse MoE (8 of 128 experts): strong reasoning, low memory
4️⃣ Configurable thinking mode
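For intuition, here is a toy sketch of what "denoising a block in parallel" can look like. It is not DiffusionGemma's or SGLang's actual sampler: `toy_denoiser`, `denoise_block`, the `MASK` symbol, and the confidence-based commit rule are all hypothetical, and the "model" simply reads the answer from a known target. The point is only that several positions get filled per step, not one token at a time.

```python
import random

MASK = "_"  # hypothetical placeholder for a not-yet-denoised position


def toy_denoiser(block, target, rng):
    """Stand-in for a real model: for every masked slot, propose a token
    and a made-up confidence score. Here the proposal is just the target."""
    return {
        i: (target[i], rng.random())
        for i, tok in enumerate(block)
        if tok == MASK
    }


def denoise_block(target, steps=4, seed=0):
    """Fill a fully masked block in a few parallel steps.

    Each step scores all masked positions at once and commits the most
    confident ones together, instead of emitting one token per step.
    """
    rng = random.Random(seed)
    block = [MASK] * len(target)
    per_step = -(-len(target) // steps)  # ceiling division
    history = []
    while MASK in block:
        proposals = toy_denoiser(block, target, rng)
        # Pick the highest-confidence masked positions for this step.
        best = sorted(proposals, key=lambda i: proposals[i][1], reverse=True)
        for i in best[:per_step]:
            block[i] = proposals[i][0]
        history.append("".join(block))
    return block, history


if __name__ == "__main__":
    final, history = denoise_block(list("parallel"), steps=4)
    for step, snapshot in enumerate(history, start=1):
        print(step, snapshot)
```

An 8-token block with `steps=4` finishes in 4 passes rather than 8, which is the low-batch latency win the post is pointing at.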
Run it now with SGLang!
Meet DiffusionGemma!
An experimental open model that explores a fast approach to text generation, released under an Apache 2.0 license.
It moves beyond sequential, token-by-token generation to produce entire blocks of text simultaneously. Here’s what’s new with DiffusionGemma: 👇

