Sangyun Lee and Giulia Fanti propose a "sleep" phase to convert LLM context into fast weights and clear KV cache · Digg