Florian Brand, Prime Intellect research engineer, adopts Gemma 4 E4B 6-bit quantized as his primary local Mac LLM

The new setup replaces his nine-month daily Qwen deployment.

Original post
Florian Brand @xeophon · #1117 in AI

Gemma 4 E4B 6bit is now the local model of my choice and loaded 24/7 on my Mac (using @lmstudio), replacing Qwen3, 3.5 4B after ~9 months of usage

What an insane model, congrats @GoogleDeepMind 🤠

4:19 AM · Jun 7, 2026 · 52.9K Views
Sentiment

Many users praise Gemma 4 as the preferred local Mac model for its efficiency, speed on limited hardware, and daily usefulness, while others question choosing smaller variants over larger ones.

Positive: 72.7%
Negative: 27.3%
11 comments with sentiment.
Cluster Engagement
Posts from X
👩‍💻 Paige Bailey @DynamicWebPaige

💎 @googlegemma

11h · Views 4.8K · Likes 41 · Bookmarks 8
RETWEETS 2

Original post (text as quoted above) · 13h · Views 52.9K · Likes 670 · Bookmarks 406
REPLIES 2
Lotto @LottoLabs

@xeophon @yacineMTB @lmstudio @GoogleDeepMind Wouldn’t qwen 9b be nicer?

9h · Views 487 · Likes 8
Ianooo @maevorian

@xeophon @lmstudio @GoogleDeepMind Try the uncensored version, it's so much better imo

6h · Views 168 · Bookmarks 1
Igor Kotenkov @stalkermustang

@xeophon @lmstudio @GoogleDeepMind what are ur usecases? "rewrite", "summarize", "translate," or something bigger in scope and harder by nature?

13h · Views 274 · Likes 2
🧟 @RaghavKoch19380

@xeophon @lmstudio @GoogleDeepMind Wouldn't the 4Bit QAT be better than a 6Bit PTQ

11h · Views 889

@RaghavKoch19380 @lmstudio @GoogleDeepMind The QAT are GGUF only afaik

11h · Views 821

@ignis_code @lmstudio @GoogleDeepMind M4 Max + 64 GB, model uses 7 GB

7h · Views 63 · Likes 2
🧟 @RaghavKoch19380

@xeophon @lmstudio @GoogleDeepMind There are compressed tensor versions or something available for vLLM etc i think. check their huggingface QAT folder.

11h · Views 200 · Likes 1
Clemens Schartmüller @ClemensScharti

@xeophon @lmstudio @GoogleDeepMind what are you using it for?

10h · Views 624

@0xgeorge @yacineMTB @lmstudio @GoogleDeepMind License

7h · Views 170 · Likes 1
Jamison 🦆 @jmelahman

@xeophon @lmstudio @GoogleDeepMind Is this also over Gemma 4 12B? https://blog.google/innovation-and-ai/technology/developers-tools/introducing-gemma-4-12B/

6h · Views 134 · Likes 1
Vu @vu_zip

@xeophon @wambosec @lmstudio @GoogleDeepMind 64 gb and you use Gemma4 e4b ??? Bro at least use gemma4 12b

10h · Views 276
IGNIS @ignis_code

@xeophon @lmstudio @GoogleDeepMind How much VRAM are you using?

7h · Views 68 · Likes 1
Aaryan Kakad @aaryan_kakad

@xeophon @lmstudio @GoogleDeepMind yes, even i have one model always loaded on my system for assistance while building stuff or solving any problems.

i think people who can use small 4-9B models to build stuff can actually be called coders.

12h · Views 57 · Likes 1
George I @0xgeorge

@xeophon @yacineMTB @lmstudio @GoogleDeepMind Why not LFM 2.5 at 8bit for just an extra gb?

7h · Views 181
wambo. @wambosec

@xeophon @lmstudio @GoogleDeepMind mac specs?

13h · Views 167

@xeophon @lmstudio @GoogleDeepMind Are you using it for the privacy considerations, Xeo?

12h · Views 122
Dan Greller @dgreller

@xeophon @lmstudio @GoogleDeepMind What context window are you using?

13h · Views 116
Lazarz @Laz4rz

@xeophon @lmstudio @GoogleDeepMind Why?

12h · Views 88