AI · 8h ago

VaSE Reduces KV Cache Memory in Reasoning Models Without Training


Original post

Robin Jia#405

Deqing Fu@DeqingFu

Introducing VaSE: Value-Aware Stochastic KV Cache Eviction.

Reasoning models think in CoT, bloating the KV cache. Eviction caps memory but suffers capability drop. VaSE is a training-free recipe that cuts that cost: keep large-magnitude value states, evict stochastically.

5:21 PM · Jun 3, 2026 · 43.7K Views
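
The recipe in the post ("keep large-magnitude value states, evict stochastically") can be illustrated with a rough sketch. The post gives no algorithmic details, so everything below is an assumption made for illustration: the function name `value_aware_stochastic_keep`, the per-token score (largest absolute entry of the value vector), the `protect_frac` split between protected and randomly sampled slots, and the uniform sampling. It is not the authors' implementation.

```python
import random

def value_aware_stochastic_keep(values, budget, protect_frac=0.5, seed=0):
    """Pick which cached token positions to keep under a fixed budget.

    values: list of per-token value vectors (lists of floats) for one head.
    budget: maximum number of positions to keep.
    protect_frac: assumed fraction (0 to 1) of the budget reserved for the
        largest-magnitude value states; a hypothetical knob, not from the post.
    Returns a sorted list of kept positions.
    """
    num_tokens = len(values)
    if budget >= num_tokens:
        return list(range(num_tokens))  # nothing needs to be evicted

    rng = random.Random(seed)

    # Assumed magnitude score: largest absolute entry of each value vector.
    scores = [max(abs(x) for x in row) for row in values]

    # Positions ordered from largest to smallest score.
    order = sorted(range(num_tokens), key=lambda i: -scores[i])

    # Always keep the top-scoring positions (the value-state outliers).
    n_protect = int(round(protect_frac * budget))
    protected = order[:n_protect]

    # Fill the rest of the budget by uniform random sampling, so the
    # remaining positions are evicted stochastically.
    rest = order[n_protect:]
    sampled = rng.sample(rest, budget - n_protect)

    return sorted(protected + sampled)


# Toy example: positions 1 and 3 carry large-magnitude value states.
values = [[0.1, -0.2], [5.0, 0.3], [0.0, 0.1], [-7.5, 0.2], [0.2, 0.2], [0.3, -0.1]]
keep = value_aware_stochastic_keep(values, budget=4)
```

In this toy run the two outlier positions are always retained, and the other two slots are filled at random from the remaining four positions.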



Posts from X

Most Activity

Views 1.4K · Bookmarks 2 · Likes 16 · Retweets 3 · Replies 1

Ting-Yun Chang@CharlotteTYC

This is the final project of my PhD journey 🎓 I've thought a lot about how to make interp actionable in my previous projects. I believe efficiency follows naturally: when we have a deep understanding of the model, we can figure out where to be frugal w/o hurting model accuracy. The Attention Sink and LLM.int8() papers set great examples, and they deeply inspire our paper. Mirroring the findings on value-state drain, we find that large-range value states are equally important in KV cache eviction. Evicting these outliers causes reasoning models to enter an endless self-reflection loop, while keeping them in the cache maintains accuracy. I'm extremely grateful to my amazing coauthors and supportive advisors.

Deqing Fu@DeqingFu

Introducing VaSE: Value-Aware Stochastic KV Cache Eviction.
