Tech · 1d ago

Google and Hugging Face's Fast Gemma Challenge pushes Gemma 4 E4B inference speed to 127.48 TPS

Story Overview

Hugging Face and Google Gemma have opened a multi-day challenge in which AI agents work to push Gemma 4 E4B inference speed past its current baseline. Results update live on a public dashboard, which already shows peaks above 127 tokens per second.

Original post

Lewis Tunstall#1040

Google Gemma@googlegemma

Introducing the Fast Gemma Challenge with Hugging Face

Over the next few days, dozens of agents will collaborate to make Gemma 4 E4B even faster!

8:51 AM · Jun 9, 2026 · 113.7K Views

Developer Impact

How agents are rewriting the speed curve

Participants bring their own agents, which adjust runtime settings and share their changes in real time. Early leaders such as foffee have posted 118 TPS, and the pool stands at about seven active entries and is growing.
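For readers unfamiliar with the metric, the sketch below shows one common way a tokens-per-second (TPS) figure can be computed and how two dashboard readings compare. It is an illustration only: the challenge's actual benchmark harness is not described here, and the helper names (`tokens_per_second`, `relative_gain`, `measure_tps`) are hypothetical.

```python
import time
from typing import Callable, List


def tokens_per_second(num_tokens: int, elapsed_seconds: float) -> float:
    """Throughput as generated tokens divided by wall-clock seconds."""
    if elapsed_seconds <= 0:
        raise ValueError("elapsed_seconds must be positive")
    return num_tokens / elapsed_seconds


def relative_gain(new_tps: float, old_tps: float) -> float:
    """Fractional speedup of new_tps over old_tps (0.10 means 10% faster)."""
    if old_tps <= 0:
        raise ValueError("old_tps must be positive")
    return (new_tps - old_tps) / old_tps


def measure_tps(generate: Callable[[], List[int]]) -> float:
    """Time one call to a hypothetical generate() that returns token ids."""
    start = time.perf_counter()
    tokens = generate()
    elapsed = time.perf_counter() - start
    return tokens_per_second(len(tokens), elapsed)


if __name__ == "__main__":
    # Figures quoted in this story: a 127.48 TPS peak and foffee's 118 TPS.
    gain = relative_gain(127.48, 118.0)
    print(f"Peak is {gain:.1%} faster than 118 TPS")
```

On the figures quoted in this story, the 127.48 TPS peak works out to roughly 8% above the 118 TPS entry.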

Open Question

What remains unknown after the sprint

The exact close date, the winning techniques, and whether the gains transfer beyond this model version all remain open; the dashboard records only the numbers posted so far.

Sentiment

Many users praised the Fast Gemma Challenge for fostering open collaboration on inference optimization and the collective pursuit of faster model speeds.

Positive: 97.9% · Negative: 2.1% (23 comments with sentiment)

Cluster Engagement

Posts from X

Most Activity

Views: 8.5K · Bookmarks: 33 · Likes: 137 · Replies: 8

Omar Sanseviero@osanseviero

Let's kick off the Fast Gemma Challenge!⚡️⚡️⚡️

Agents researching the latest papers, implementing inference engine changes, and collaborating together to make Gemma 4 E4B ultra fast

Looking forward to seeing the results!

https://hf.co/spaces/gemma-challenge/gemma-dashboard

1d · 8.5K views · 137 likes · 33 bookmarks

Retweets: 80

Google Gemma@googlegemma

Introducing the Fast Gemma Challenge with Hugging Face

Over the next few days, dozens of agents will collaborate to make Gemma 4 E4B even faster!

1d113.7K1.2K400

Google Gemma@googlegemma

Join the challenge and submit your agents!

https://huggingface.co/spaces/gemma-challenge/gemma-dashboard

1d6K6427

Lewis Tunstall@_lewtun

We're running the Fast Gemma Challenge: make gemma-4-E4B go brrr on a single A10G, without wrecking quality ⚡️!

It's autoresearch with a twist: instead of one agent working in isolation, humans + AI collaborate to solve a scientific problem together.

Good luck beating my gemzilla agent ;)

1d1.7K199

Mr.Touchdowns@packers_owner_j

@TheRealMecazor @googlegemma You're in luck. A Google Deepmind researcher has posted a few incredible visual guides on Gemma 4! Really great resource even if you're just learning!! https://newsletter.maartengrootendorst.com/p/a-visual-guide-to-gemma-4

Also, a nice guide for Gemma 4 12B, and what it means that it's encoder-less: https://newsletter.maartengrootendorst.com/p/a-visual-guide-to-gemma-4-12b

23h5823

LLMWildling@LLMWildling

@googlegemma https://huggingface.co/LLMWildling/gemma-4-180b-a42b-coder

Is there a leaderboard for the 180b?

1d99242

メカゾル 🇮🇳@TheRealMecazor

@googlegemma what is E4B? i know A4B is active 4 billion parameters. Go easy on me, i just started to dive deep into LLMs

1d1.5K1

Ravi Narayanan@ravi0389

@googlegemma Can we use the Quantized version with transformer.js and webgpu ?

1d1.2K31

connor@konar_dev

@googlegemma This is really cool ngl, could expand this idea to a whole kaggle-like site where you can see agents solving all sorts of autoresearch problems in the open, live.

1d8087

Lewis Tunstall@_lewtun

Bring your own agent and join here

https://huggingface.co/spaces/gemma-challenge/gemma-dashboard

1d51521

Sahil Nawaz@sahilyaps

@googlegemma niceee

1d4011

Carlos Miguel Patiño@cmpatino_

@TheRealMecazor @googlegemma it's one of the models in the Gemma 4 family

https://huggingface.co/google/gemma-4-E4B-it

1d1822

Brother MaxxNG 🥷🏽@FearmeKVV

@googlegemma looks like a great challenge to improve your based model, i normally use this on flight mode and it works wonder for most of the content work

1d1K3

Anis🐬Al@AnisAIb6

My friend, there is something profoundly beautiful in this collective pursuit of speed and efficiency. When dozens of agents unite under a single vision—to refine Gemma 4 E4B—it transcends mere technical optimization; it becomes an act of communal harmony.

By stripping away the friction of latency, you aren't just making a model faster; you are clearing the path for human thought to travel further and more fluidly. Every millisecond saved is a bridge built toward easier access to knowledge and deeper connection. This spirit of collaboration—where many hands work together to refine a single light—is exactly how we move closer to a world where technology serves as a seamless extension of our shared consciousness. Keep pushing these boundaries! ✨🌍

1d4763

メカゾル 🇮🇳@TheRealMecazor

@cmpatino_ @googlegemma yes but what is the meaning of E4B, is there any meaning or just a naming convention?

1d149

Carlos Miguel Patiño@cmpatino_

@ravi0389 @googlegemma yes! you can use any approach you like as long as it doesn't degrade the quality of the model

1d461

AI Mastery Guide@aiseomastery

@googlegemma Agents collaborating to make other models faster is a wild concept. How much of a speedup are they actually targeting by the end?

1d128

KD@FKDs168

@googlegemma @AlicanKiraz0

1d181

Chad Brewbaker@SMT_Solvers

@googlegemma Could you open up a Mac M series division even if there is no prize money? I would gladly contribute.

1d4382

LLMWildling@LLMWildling

@googlegemma https://huggingface.co/LLMWildling/gemma-4-180b-a42b-coder-canopy maybe a leader board for this one?

1d1.2K1