Baseten Demonstrates MLP Compression Of LLM KV Caches For Long Contexts · Digg