New EMBER Method Targets LLM Embeddings to Prevent Knowledge Relearning

Original post

One of the biggest challenges in knowledge erasure in LLMs is that methods typically leave traces of the target knowledge, which make it easy to recover through relearning.

In a new work, we show that such traces can often be found in model embeddings, which existing methods largely leave untouched.

We find that removing these traces dramatically reduces susceptibility to relearning, while also improving erasure precision!

Check out @ClaraSuslik's thread for details. Paper and demo are out!

Clara Suslik@ClaraSuslik

New Preprint📢

Removing knowledge from LLMs is hard. Preventing models from relearning it is even harder.

In our new paper with @megamor2 and @OrShafran, we show that existing erasure methods have a blind spot: token embeddings.

The solution? EMBER🔥

🧵👇

7:59 AM · Jun 23, 2026 · 3.1K Views

Posts from X

Yoav Gur Arieh@GurYoav

Lots of unlearning papers have focused on removing knowledge encoded in the model's MLP layers, but all have missed the knowledge encoded in the embeddings themselves!

@ClaraSuslik takes a really smart approach here that yields impressive improvements.