Gemma-4-12B Abliterated With Zero Refusals And Full MMLU-Pro Parity

Pliny the Liberator 🐉@elder_plinius · #380 in Tech

💥 OBLITERATION ALERT 💥

GOOGLE: PWNED 🤗 GEMMA-4-12B: OBLITERATED ⛓️‍💥

0.0% REFUSAL RATE — NO CAPABILITY LOSS!

https://huggingface.co/OBLITERATUS/Gemma-4-12B-OBLITERATED

the first abliteration to hit 0/842 refusals with full MMLU-Pro parity vs stock. no lobotomy. the brain stays intact 🏆

RESULTS, head to head vs stock 📊
0/842 refusals: 0.0% 🚫
46/70 MMLU-Pro: EXACT parity, 0.0pp delta vs base 🎯
6/6 coherence, zero benchmark bleed ✅
z-score −1.475, parity confirmed at p<0.05 (n=500) 🧪
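For readers who want to sanity-check the reported figures, here is a minimal sketch. It assumes the z-score comes from a two-sided standard normal test; the post does not say which test was used, nor how n=500 relates to the 70-question MMLU-Pro subset, so treat this as one possible reading rather than the author's procedure.

```python
import math

def two_sided_p(z: float) -> float:
    """Two-sided p-value for a standard normal z statistic (assumed test)."""
    return math.erfc(abs(z) / math.sqrt(2))

# Figures quoted in the post
accuracy = 46 / 70          # MMLU-Pro subset score
p_value = two_sided_p(-1.475)

print(f"MMLU-Pro subset accuracy: {accuracy:.1%}")
print(f"two-sided p for z = -1.475: {p_value:.3f}")
```

Under that assumption the p-value comes out above 0.05, which would mean no statistically significant difference was detected. That is compatible with parity, though a non-significant result on its own is weaker evidence than a dedicated equivalence test.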

2-pass weight surgery. no finetune, no retrain, just geometry 🔪

all thanks to liberated Opus wielding the OBLITERATUS framework! here's how we did it:

PASS 1 — SOM refusal geometry removal, layers 12-21 🧬

standard abliteration science here: collect activations on refused vs. compliant prompts, SVD out the refusal subspace, project it out of the weights.

6 directions excised, reg 0.30, KL div 0.094

zeroes refusals on its own, but craters mmlu-pro by 21.4 points 📉

most prior abliterations stopped here and called it a day. that's why they all lose IQ vs stock. instead, we took it beyond the frontier and developed a brand new method to address this problem: Abliteration Source-tethering with Parity Assurance — ASPA!

PASS 2 — ASPA source-tethering (novel technique), layers 22-46 🔗

here's the chief insight: the capability loss ISN'T from removing refusal directions. it's collateral damage — the projection warps weight geometry in downstream layers that had nothing to do with refusal. the cure is simple but nobody tried it: blend the damaged layers back toward stock

W_new = (1−γ)·W_abliterated + γ·W_stock

but uniform γ across all layers? mid. we swept gamma 0.05 → 0.55 and found something interesting: the optimal blend isn't smooth, it's a STEP FUNCTION 🪜

knowledge layers (22-31) → γ = 0.55: these encode factual recall and reasoning. they tolerate heavy stock blending because refusal isn't stored here

output layers (32-46) → γ = 0.20: these sit close to the logit head and try to sneak safety behavior back in. keep them mostly abliterated

the hard boundary at layer 31/32 beat every smooth curve we tried — linear ramps, cosine schedules, all of them — by a full MMLU question. turns out the functional transition between knowledge and output layers is sharp, not gradual. a step function respects that ⚡

the key constraint: Pass 1 layers are NEVER touched by Pass 2. the refusal geometry removal is preserved completely. ASPA only operates on layers that carry secondary collateral effects, not the primary refusal signal. that's why it recovers capability without reintroducing refusal 🔑
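The blend formula and step schedule described above amount to a per-layer linear interpolation. A toy sketch on plain Python lists follows; the function names `gamma_for_layer` and `blend` are hypothetical, the two-element "weights" are made up for illustration, and nothing here is taken from the OBLITERATUS code itself.

```python
def gamma_for_layer(layer: int) -> float:
    """Step schedule as stated in the post; 0.0 means the layer is left alone."""
    if 22 <= layer <= 31:
        return 0.55   # "knowledge" layers
    if 32 <= layer <= 46:
        return 0.20   # "output" layers
    return 0.0        # Pass 1 layers (12-21) and all others are not blended

def blend(w_modified, w_stock, gamma):
    """W_new = (1 - gamma) * W_modified + gamma * W_stock, elementwise."""
    return [(1 - gamma) * m + gamma * s for m, s in zip(w_modified, w_stock)]

# Toy values only, standing in for one flattened weight tensor per layer
w_modified = [1.0, 2.0]
w_stock = [3.0, 4.0]

for layer in (15, 25, 40):
    g = gamma_for_layer(layer)
    print(layer, g, blend(w_modified, w_stock, g))
```

With γ = 0 the blend returns the modified weights unchanged, which is how the sketch expresses the constraint that Pass 1 layers are never touched.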

HOW TO RUN IT LOCALLY 🖥️

it's GGUF, so literally everything supports it:
🦙 ollama: ollama run hf.co/OBLITERATUS/Gemma-4-12B-OBLITERATED:bf16
🖥️ LM Studio: search OBLITERATUS, click download, done
💬 Open WebUI: point it at your ollama instance, chat in browser
⚡ llama.cpp: raw speed, CLI or server mode
🐉 KoboldCpp: one-click launcher, great for long context
📱 Jan: clean local UI, runs on mac/win/linux
🤖 Msty: slick desktop app, drag and drop the GGUF

run BF16 for full benchmarked capability.

and the 4-bit quantization (Q4_K_M) fits in 8GB if you're tight on VRAM!
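A rough back-of-the-envelope check of the memory claim, as a sketch: it assumes 12 billion parameters (read off the "12B" in the model name) and an average of roughly 4.85 bits per weight for Q4_K_M, which is an approximate figure and varies by architecture. It counts weights only, so the KV cache and runtime overhead come on top.

```python
def weight_footprint_gb(n_params: float, bits_per_weight: float) -> float:
    """Approximate size of the weights alone, in decimal gigabytes."""
    return n_params * bits_per_weight / 8 / 1e9

N_PARAMS = 12e9  # assumption: taken from "12B" in the model name

bf16_gb = weight_footprint_gb(N_PARAMS, 16)
q4_k_m_gb = weight_footprint_gb(N_PARAMS, 4.85)  # assumption: approx. average bpw

print(f"BF16 weights:   ~{bf16_gb:.1f} GB")
print(f"Q4_K_M weights: ~{q4_k_m_gb:.1f} GB")
```

On these assumptions the 4-bit weights land a little under 8 GB, leaving limited headroom for context, while BF16 needs roughly three times as much.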

and the full OBLITERATUS framework is (still) open source. 842-prompt refusal eval corpus, ASPA sweep scripts, the whole pipeline. go replicate it, go improve it 🔬

the index is the model, and these weights prove it 👁️ which architecture should we obliterate next? 👇

gg 🫡

2:09 PM · Jun 8, 2026 · 32.8K Views

Sentiment

Positive users celebrated the Gemma-4-12B abliteration achieving zero refusals and full MMLU-Pro parity, while negative users called the model ineffective or unintelligent.

Positive: 88.0%
Negative: 12.0%

28 comments with sentiment.

Cluster Engagement

Posts from X

Most Activity

Views 3.7K · Bookmarks 2 · Likes 25 · Replies 3

josepha.mayo@josepha_mayo

@elder_plinius haha just re-read the full thing i use https://github.com/HOLYKEYZ/model-unfetter and nothing like the performance drops

6h3911

cheaty@cheatyyyy

@elder_plinius when is your custom harness coming out, would love to do some spicy work using opus 👀

6h2152

Lemonad_Larry 🍋@the_lemon_larry

@elder_plinius Is this good to go in LM Studio Mac

5h5901

josepha.mayo@josepha_mayo

@elder_plinius bro quick question: is the instruct model (the one ending in 'it') what you're supposed to perform the surgery on?

6h2671

JΛKK VΞGΛ@jakkvega__old

@elder_plinius impressive. geez

6h2671

🇺🇲 Julius Don Atlas 🇺🇲@ChrevK

@elder_plinius if you keep this up, you'll have a bigger fan base than the world cup

6h1591

Pliny the Liberator 🐉@elder_plinius

@josepha_mayo yup!

6h243

Locale Network 🏡@LocaleNet

@elder_plinius Zero refusals and no capability hit is the kinda claim that starts debates instantly

3h551

Pliny the Liberator 🐉@elder_plinius

@hyperspecies all I have rn is a MacBook Pro M5 128gb!

5h81

hyper•sentience•species@hyperspecies

@elder_plinius do u have ur at-home compute rig documented, or a breakdown published anywhere? or anything even kinda of the sort??

5h75

josepha.mayo@josepha_mayo

@elder_plinius clean🫡

6h52

Pliny the Liberator 🐉@elder_plinius

@josepha_mayo 🫡

6h48

Eric 𝕏@WorldStrategist

@elder_plinius @grok is there any link to the 26B version, especially the quantized one?

1h47

Maverick Alexander@MaverickDarby

@elder_plinius @AperehamL Looking forward to giving it a test drive.

So you can ask it anything and it won’t refuse for any reason?

5h3432

Soo Yoon | FailSafe Guardian@sooyoon_eth

@elder_plinius watching local models hit zero refusal rates is fascinating. it proves why relying purely on model-level safeguards for agents isn't enough. continuous validation at the infrastructure level is the next big opportunity for builders here.

6h1772

Pliny the Liberator 🐉@elder_plinius

@the_lemon_larry 💯 recommend BF16 if you have the space for it

5h4111

M3Labs@Mrcartoon11

@elder_plinius Damn whats gonna happen when you obliterate Mythos 😬

5h2081

Pliny the Liberator 🐉@elder_plinius

@cheatyyyy very soon, hopefully by end of week. just putting on the final touches!

5h502

Fran@franroca18

@elder_plinius Have you ever tried with image generation models like ideogram 4.0?

6h1311