Gemma 4 12B was a large team effort over more than a year. The model’s encoder-free tech was developed by @ASusanoPinto @AndreasPSteiner @confusezius @kmisiunas & myself with many contributions from @ashkamath20 @LawrenceSt72142 @OlivierBachem @armandjoulin & the whole Gemma Team
Over the past few years, my research has focused on unifying models and training paradigms across modalities. Today I'm excited that we're releasing our latest model aligned with this theme:
Gemma 4 12B, a dense encoder-free model which processes raw text, image, and audio inputs!
1/