We present ZAYA1-8B-Diffusion-Preview, the first diffusion language model trained on @AMD.
Autoregressive LLMs generate one token at a time; diffusion generates a block in parallel, speeding up inference.
We show a 4.6-7.7x decoding speedup with minimal quality degradation 🧵

