
Fast-Slow Training Lets LLMs Adapt Continually Without Forgetting Skills

Original post

Can LLMs adapt continually without losing base skills? Fast-Slow Training (FST) pairs "slow" weights with "fast" context.

FST vs. RL:
• 3x more sample-efficient
• Higher performance ceiling
• Less KL drift (better plasticity)
• Continual learning: succeeds where RL stalls

8:37 AM · May 13, 2026
See the full tweet thread from Kusha Sareen (@KushaSareen) on X.
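The thread itself doesn't include an implementation, but the fast-slow split it describes, slowly updated base weights paired with a rapidly refreshed in-context memory, can be illustrated with a minimal sketch. Everything below (the FastSlowLearner class, its buffer size, and the abstract consolidation step) is a hypothetical illustration of the general idea under stated assumptions, not the authors' actual FST method.

```python
# Minimal sketch of a fast-slow adaptation loop (hypothetical; not the
# authors' FST implementation). "Slow" knowledge lives in model weights
# that are updated rarely and gently; "fast" knowledge lives in a small
# buffer of recent examples that is prepended to every prompt.

from collections import deque


class FastSlowLearner:
    def __init__(self, model, buffer_size=8, slow_every=64):
        self.model = model                              # slow component: the LLM's weights
        self.fast_buffer = deque(maxlen=buffer_size)    # fast component: recent (Q, A) pairs
        self.slow_every = slow_every                    # consolidate into weights every N examples
        self.seen = 0

    def build_prompt(self, query):
        # Fast adaptation: condition on recent examples in context,
        # with no weight change and therefore no forgetting.
        context = "\n".join(f"Q: {q}\nA: {a}" for q, a in self.fast_buffer)
        return f"{context}\nQ: {query}\nA:"

    def observe(self, query, answer):
        # Every new example enters the fast buffer immediately.
        self.fast_buffer.append((query, answer))
        self.seen += 1
        # Occasionally consolidate accumulated examples into the slow weights.
        if self.seen % self.slow_every == 0:
            self._slow_update(list(self.fast_buffer))

    def _slow_update(self, examples):
        # Placeholder for a gentle fine-tuning pass over `examples`
        # (e.g. a few small gradient steps); kept abstract because the
        # real FST update rule is not described in the thread.
        pass
```

The design choice this sketch is meant to highlight: new information is absorbed twice, instantly through context (no weight change, so base skills are preserved) and later through small, infrequent weight updates, which is consistent with the thread's claim of lower KL drift than RL fine-tuning.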
