// Continual Learning Bench //
Continual learning is one of the research areas attracting heavy investment.
Yet despite many efforts, there has been very little progress in measuring it.
So the big question is: do dedicated memory systems actually help agents learn from experience?
Continual Learning Bench says not yet. Across six expert-validated domains with shared learnable structure, naive in-context learning outperforms systems purpose-built for memory management.
CL-Bench introduces a gain metric that isolates genuine learning from prior capability, then shows agents frequently overfit to immediate observations or fail to reuse knowledge across instances.
If a plain ICL baseline beats your memory architecture, the architecture is adding overhead rather than learning.
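To make the gain idea concrete, here is a minimal sketch. It assumes gain is the mean per-instance score with accumulated experience minus the score of the same agent with none. That definition, the `learning_gain` helper, and all the numbers are illustrative assumptions, not the paper's actual metric or results.

```python
from statistics import mean


def learning_gain(with_experience, without_experience):
    """Mean per-instance improvement over a no-experience run.

    Assumed definition for illustration only; the paper's metric may differ.
    """
    if len(with_experience) != len(without_experience):
        raise ValueError("score lists must cover the same instances")
    return mean(w - b for w, b in zip(with_experience, without_experience))


# Hypothetical per-instance scores in [0, 1]; NOT results from the paper.
cold_start = [0.50, 0.40, 0.60, 0.50]     # same agent, no prior experience
icl_scores = [0.70, 0.60, 0.80, 0.70]     # past episodes placed in context
memory_scores = [0.60, 0.40, 0.70, 0.50]  # dedicated memory system

icl_gain = learning_gain(icl_scores, cold_start)
memory_gain = learning_gain(memory_scores, cold_start)

print(f"ICL gain:    {icl_gain:.2f}")
print(f"Memory gain: {memory_gain:.2f}")
```

Subtracting the cold-start score is what separates learning from prior capability: a strong base model scores well in both runs, so only the improvement from experience counts.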
Paper: https://arxiv.org/abs/2606.05661
Learn to build effective AI agents in our academy: https://academy.dair.ai/















