Learning Fast and Slow triples LLM sample efficiency

Learning Fast and Slow (FST) trains large language models by pairing slow reinforcement-learning weight updates with fast prompt and context optimization over a frozen base model, using mechanisms such as GEPA for rapid in-context adaptation. The authors report roughly three times the sample efficiency of standard reinforcement learning, a higher performance ceiling, reduced KL drift, and improved resistance to catastrophic forgetting, while preserving plasticity for later tasks. The reply quoted below connects this to gain-tuning, from the paper "Adaptive denoising via GainTuning".
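
To make the two-timescale idea concrete, here is a minimal runnable sketch of a fast/slow loop. Everything specific in it is an assumption for illustration: the toy reward, the hill-climbing prompt search standing in for GEPA-style in-context optimization, and the single conservative gradient step standing in for the RL weight update are not from the paper.

```python
"""Toy fast/slow training loop (illustrative sketch, not the paper's method)."""
import random

random.seed(0)

def reward(weights, prompt):
    # Hypothetical reward: stands in for the task performance of a frozen
    # base model conditioned on `prompt`, with a small learned scalar `weights`.
    return -(weights - 1.0) ** 2 - 0.1 * abs(len(prompt) - 20)

def mutate(prompt):
    # Hypothetical edit operator: insert or delete one character.
    if prompt and random.random() < 0.5:
        i = random.randrange(len(prompt))
        return prompt[:i] + prompt[i + 1:]
    i = random.randrange(len(prompt) + 1)
    return prompt[:i] + random.choice("abcdefgh ") + prompt[i:]

def fast_loop(weights, prompt, steps=50):
    # Fast timescale: hill-climb over the prompt with weights frozen.
    # Cheap and reversible, since no parameters change.
    best = prompt
    for _ in range(steps):
        candidate = mutate(best)
        if reward(weights, candidate) > reward(weights, best):
            best = candidate
    return best

def slow_step(weights, lr=0.05):
    # Slow timescale: one small, conservative weight update (standing in
    # for an RL gradient step whose step size keeps drift low).
    grad = -2.0 * (weights - 1.0)  # d(reward)/d(weights) for the toy reward
    return weights + lr * grad

weights, prompt = 0.0, "hello"
for epoch in range(10):
    prompt = fast_loop(weights, prompt)   # many cheap prompt edits
    weights = slow_step(weights)          # one expensive weight update
    print(f"epoch {epoch}: reward={reward(weights, prompt):.3f}")
```

The split is where the claimed sample efficiency would come from: most adaptation happens through cheap, reversible prompt evaluations, so far fewer expensive weight updates are needed.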

Original post

Nice! A related idea is “gain-tuning”, where channels are rescaled to rapidly (and reversibly) improve performance - https://www.cns.nyu.edu/~lcv/pubs/makeAbs.php?loc=Mohan21b

12:47 PM · May 16, 2026
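
For concreteness, a minimal numpy sketch of the gain-tuning idea the reply describes: per-channel gains are adapted on top of frozen weights, and resetting the gains to one exactly restores the base model. The toy layer, data, and loss here are assumptions; only the rescaling mechanism follows the cited abstract.

```python
"""Illustrative gain-tuning sketch (toy layer, data, and loss are made up)."""
import numpy as np

rng = np.random.default_rng(0)
W = rng.normal(size=(4, 8))   # frozen weights, 4 output channels
g = np.ones(4)                # per-channel gains, start at identity

def forward(x):
    # Each output channel is rescaled by its gain; W itself never changes.
    return g[:, None] * (W @ x)

x = rng.normal(size=(8, 16))  # toy batch of 16 inputs
target = (W @ x) * 1.5        # pretend the test distribution is rescaled

for _ in range(200):
    err = forward(x) - target
    grad_g = 2.0 * np.mean(err * (W @ x), axis=1)  # d(MSE)/d(g) per channel
    g -= 0.01 * grad_g        # adapt the gains only

print("tuned gains:", np.round(g, 3))  # converges to ~1.5 per channel
g[:] = 1.0                    # reversible: resetting restores the base model
```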