
FlashMemory Cuts DeepSeek-V4 KV Cache 90% for 500K Context

Original post

This is pretty crazy ("Project status" under the abstract is also an insane detail). Further shrinking of V4 cache footprint to… 360 MB per 1M context? 360 *bytes* per token? Just 2 OOMs from the raw plaintext limit? Calling CSA «conventional» is crazy work lmao. @antirez !!

8:29 PM · Jun 9, 2026 · 36.7K Views
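The arithmetic in the post can be checked with a rough sketch. Two inputs here are assumptions, not figures from the post: "MB" is read as decimal megabytes (10^6 bytes), and plaintext is taken to cost about 4 bytes per token, a common rule of thumb for English text that varies by tokenizer and language.

```python
import math

# Figures quoted in the post.
CACHE_BYTES = 360e6          # 360 MB, read as decimal megabytes (assumption)
CONTEXT_TOKENS = 1_000_000   # 1M-token context

# Assumption: roughly 4 bytes of raw text per token. This is a rule of
# thumb, not a number from the post, and it depends on the tokenizer.
PLAINTEXT_BYTES_PER_TOKEN = 4.0

# Cache footprint per token of context.
per_token = CACHE_BYTES / CONTEXT_TOKENS

# How far the cache is from simply storing the plaintext,
# expressed as a ratio and in orders of magnitude (OOMs).
ratio = per_token / PLAINTEXT_BYTES_PER_TOKEN
ooms = math.log10(ratio)

print(f"{per_token:.0f} bytes per token")
print(f"{ratio:.0f}x plaintext, about {ooms:.2f} orders of magnitude")
```

Under those assumptions the quoted 360 MB does work out to 360 bytes per token, and the gap to plaintext is a little under two orders of magnitude, which matches the "2 OOMs" remark.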
Sentiment

Users praise the candid reporting on FlashMemory's real limitations when cutting DeepSeek-V4 KV cache 90% for 500K context, calling it a big scientific win.

Pos: 100.0%
Neg: 0.0%
1 comment with sentiment.
Cluster Engagement
Posts from X
Most Activity
Views 2.4K · Bookmarks 1 · Likes 37 · Replies 1

Maybe this project's failure is a big scientific win, though. Can't remember the last time I've seen such candid reporting on real limitations, compromises and false hopes in the discussion.


19h · Views 2.4K · Likes 37 · Bookmarks 1