
Modal's Charles Frye argues the term 'KV cache' is misleading, calling the key-value distinction a mere implementation detail.

Phil Chen countered that the KV cache itself may be the implementation detail.

Original post

can't believe we called it a KV cache when the "KV" part is clearly an implementation detail 😞

5:16 PM · May 25, 2026

Charles 🎉 Frye @charles_irl

can't believe we called it a KV cache when the "KV" part is clearly an implementation detail 😞

12:16 AM · May 26, 2026 · 11.5K Views

Phil Chen @philhchen

@charles_irl Hm well maybe the kv cache was actually the implementation detail

2:07 AM · May 26, 2026 · 501 Views

Charles 🎉 Frye @charles_irl

@philhchen state/past caches seem pretty important for sequence models!

2:09 AM · May 26, 2026 · 463 Views

Charles 🎉 Frye @charles_irl

@philhchen you might also just, like, call it a _memory_

it's clearly some kind of associative memory, whether it's expressed via keys and values, or keys that are values, or just a big vector that's written/read across seqlen

2:11 AM · May 26, 2026 · 59 Views

@charles_irl tradeoff between specificity and overloading with SRAM / VRAM / VMEM / HBM memory too

2:29 AM · May 26, 2026 · 34 Views
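
To make the "associative memory" framing from the thread concrete, here is a minimal pure-Python sketch, not anyone's production implementation. The class names `KVMemory` and `StateMemory` are hypothetical and chosen for illustration. Both expose the same `write`/`read` interface. The first stores explicit keys and values and reads with softmax attention. The second folds every write into one fixed-size state, as in linear-attention-style models; that is one possible reading of "a big vector that's written/read across seqlen".

```python
import math


class KVMemory:
    """Keeps every (key, value) pair; storage grows with sequence length."""

    def __init__(self):
        self.keys, self.values = [], []

    def write(self, k, v):
        self.keys.append(k)
        self.values.append(v)

    def read(self, q):
        # Softmax over query-key dot products, then a weighted sum of values.
        scores = [sum(qi * ki for qi, ki in zip(q, k)) for k in self.keys]
        top = max(scores)  # subtract the max for numerical stability
        weights = [math.exp(s - top) for s in scores]
        total = sum(weights)
        dim = len(self.values[0])
        return [
            sum(w * v[j] for w, v in zip(weights, self.values)) / total
            for j in range(dim)
        ]


class StateMemory:
    """Fixed-size state: accumulates outer(k, v); storage does not grow."""

    def __init__(self, key_dim, value_dim):
        self.state = [[0.0] * value_dim for _ in range(key_dim)]

    def write(self, k, v):
        for i, ki in enumerate(k):
            for j, vj in enumerate(v):
                self.state[i][j] += ki * vj

    def read(self, q):
        # Linear readout: q multiplied by the accumulated state matrix.
        cols = len(self.state[0])
        return [sum(qi * row[j] for qi, row in zip(q, self.state)) for j in range(cols)]


kv, st = KVMemory(), StateMemory(key_dim=2, value_dim=2)
for k, v in [([1.0, 0.0], [10.0, 0.0]), ([0.0, 1.0], [0.0, 20.0])]:
    kv.write(k, v)
    st.write(k, v)

query = [1.0, 0.0]
kv_out = kv.read(query)  # blends both stored values, weighted toward the first
st_out = st.read(query)  # recovers the first value exactly (orthogonal keys)
```

In this sketch the caller sees only `write` and `read`. Whether the memory is a growing list of keys and values or a single fixed-size state is hidden behind that interface.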