Lorenzo Pacchiardi leads a paper arguing that conventional AI evaluation methods based on static models are structurally inadequate for continual learning systems and proposes recentering on behavioral trajectories

AI Judge changed title after evaluation, original title: "Lorenzo Pacchiardi leads a paper arguing that standard AI evaluation methods are structurally unsuitable for continual learning systems and proposes assessing behavioral trajectories instead"

Reposted by Gavin Leech and discussed among AI safety accounts.

Lorenzo Pacchiardi@LPacchiardi

🚨 New paper: AI evaluation is structurally unsuitable for continual learning (CL). To address this, evaluation should be centred on the "behavioural trajectories" that CL systems develop, with the goals of characterising possible behaviours and forecasting their evolution. 🧵

4:56 AM · May 19, 2026 · 4.2K Views
Posts from X
Views 1.8K · Bookmarks 9 · Likes 19 · Retweets 3 · Replies 1
Seán Ó hÉigeartaigh@S_OhEigeartaigh

Delighted to have coauthored this paper as part of a great team led by @LPacchiardi. What happens if we get continual learning to actually work in frontier AI models? Much of our current governance is based on periodic evaluation of static models. Such governance will break. We propose a direction for addressing this.

Yes, it's uncertain whether continual learning in frontier AI will be achieved, even if company leaders like Amodei are confident. But the evaluation and governance communities are struggling to keep up with the pace of change; we need to change that and start planning not just for what's here now, but what the research community is targeting as goals. Skate to where it looks like the puck is going, not where it is now.

(When I'm back in the autumn, I might work with the team on a more governance-focused companion piece).

21d · Views 1.8K · Likes 19 · Bookmarks 9
Lorenzo Pacchiardi@LPacchiardi

Joint work with @prpaskov @S_OhEigeartaigh @NandoMartinezP @katie_m_collins @FazlBarez, Jonathan Prunty, Matteo Mecattaf, @zfountas @RistoUuk @sanmikoyejo @CUdudec, José Hernández-Orallo

Paper website→https://cl-eval.github.io/

Pointers to related work & questions welcome🙏

21d · Views 266 · Likes 6 · Bookmarks 2
Lorenzo Pacchiardi@LPacchiardi

What is "continual learning"? Three levels, by what changes:

🔹 CL1 (in-context): info accumulates within a session
🔹 CL2 (storage-based): persistent memory, RAG, agent skills
🔹 CL3 (parameter-based): weights change post-deployment

CL1 & CL2 are already here. CL3 is coming.

21d · Views 28
Lorenzo Pacchiardi@LPacchiardi

CL failures over long interaction sequences:

• alignment and safety guardrails erode
• propensity cross-contamination (e.g., OpenAI's "goblins")
• unbalanced capability specialisation
• cross-domain capability transfer
• capability degradation

21d · Views 23
Lorenzo Pacchiardi@LPacchiardi

Pre-deployment trajectory sandbox + live predictive monitoring is a feasible alternative to continuously re-evaluating evolving systems.

The two can be effectively layered with input/output filters, transparent evolution methods, and broad indicators of CL systems' impacts on society.

21d · Views 21
Lorenzo Pacchiardi@LPacchiardi

How does current evaluation fall short?

By relying on pre-release benchmarks and red-teaming, it assumes systems don't change after deployment. This ignores the trajectory the system develops after deployment, leaving us with an incomplete understanding of the system's behaviour.

21d · Views 20
Lorenzo Pacchiardi@LPacchiardi

But 2 obstacles from dynamical systems may bite:

🌪️ Chaotic sensitivity: small state/input changes diverge→forecasts fail beyond a horizon.

🌀 Multiplicity of attractors: sandboxes cover only a subset of reachable basins.

Whether they affect CL systems is an empirical question.

21d · Views 18
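
To make the "forecasts fail beyond a horizon" point concrete, here is a minimal sketch using the logistic map, a textbook chaotic system. It is only a toy stand-in and not a CL system; the function names, the perturbation size `delta`, and the tolerance `tol` are assumptions chosen for illustration.

```python
def logistic_map(x, r=4.0):
    """One step of the logistic map (chaotic at r = 4); a toy system, not a CL model."""
    return r * x * (1.0 - x)


def divergence_horizon(x0, delta=1e-9, tol=0.1, max_steps=200):
    """Return the first step at which two trajectories that start `delta`
    apart differ by more than `tol`, or None if they never do."""
    a, b = x0, x0 + delta
    for step in range(1, max_steps + 1):
        a, b = logistic_map(a), logistic_map(b)
        if abs(a - b) > tol:
            return step
    return None


# A perturbation of one part in a billion still bounds how far ahead we can forecast.
horizon = divergence_horizon(0.2)
print("forecast horizon (steps):", horizon)
```

Whether deployed CL systems show this kind of sensitivity is exactly the open empirical question raised in the post; the sketch only shows what such a horizon would look like if they did.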
Lorenzo Pacchiardi@LPacchiardi

Our way forward:

• Start trajectory evaluation on today's CL systems: learn where chaos & multi-attractor regimes bite
• Co-design CL methods *amenable* to evaluation (contractive updates, intrinsic objectives, gated adaptation, circuit-breakers)

=> virtuous co-evolution

21d · Views 16
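
One possible reading of "contractive updates" and "circuit-breakers", sketched on a scalar toy state. This is an illustration under stated assumptions, not the paper's method: `contractive_update`, `GuardedLearner`, `alpha`, and `max_drift` are all hypothetical names and parameters.

```python
def contractive_update(state, target, alpha=0.2):
    """Move a fraction `alpha` toward `target`. For a fixed target, the gap
    between any two states shrinks by a factor (1 - alpha) per step."""
    return (1.0 - alpha) * state + alpha * target


class GuardedLearner:
    """Toy learner with a circuit breaker on drift from a reference state."""

    def __init__(self, state, reference, max_drift=0.5):
        self.state = state
        self.reference = reference
        self.max_drift = max_drift
        self.tripped = False

    def step(self, target):
        if self.tripped:
            return self.state  # learning stays halted once the breaker trips
        proposed = contractive_update(self.state, target)
        if abs(proposed - self.reference) > self.max_drift:
            self.tripped = True  # reject the update and stop adapting
            return self.state
        self.state = proposed
        return self.state


# Two instances with different starting states converge under the same inputs.
a = GuardedLearner(state=0.0, reference=0.2)
b = GuardedLearner(state=0.4, reference=0.2)
for _ in range(20):
    a.step(0.3)
    b.step(0.3)
print("gap after 20 steps:", abs(a.state - b.state))

# An input that would pull the state far from the reference trips the breaker.
a.step(5.0)
print("breaker tripped:", a.tripped)
```

The design intuition is that contraction makes trajectories from nearby states converge rather than diverge, which is the opposite of chaotic sensitivity, while the breaker bounds how far any single instance can drift.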
Lorenzo Pacchiardi@LPacchiardi

Instead of evaluating the released checkpoint, evaluators of CL systems should ask two questions:

🗺️ Landscape characterisation: what behaviours can the system reach & with what probability?

🔮 Trajectory forecasting: how will a deployed instance evolve from its current state?

21d · Views 16
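
Landscape characterisation can be pictured as a Monte Carlo estimate over many simulated instances. The sketch below assumes a toy double-well dynamic with two attractors; the dynamics, the basin labels, and all parameters are hypothetical and only illustrate the question "what behaviours can the system reach, and with what probability?".

```python
import random
from collections import Counter


def simulate_instance(rng, steps=200):
    """Toy stand-in for one deployed instance: a scalar 'propensity' that
    drifts toward attractors at -1 and +1 under noisy interactions."""
    x = rng.gauss(0.0, 0.1)
    for _ in range(steps):
        x += 0.1 * (x - x ** 3) + rng.gauss(0.0, 0.05)
    return x


def characterise_landscape(n_runs=1000, seed=0):
    """Estimate the probability of ending in each basin from repeated runs."""
    rng = random.Random(seed)
    counts = Counter()
    for _ in range(n_runs):
        final = simulate_instance(rng)
        counts["basin_A" if final < 0 else "basin_B"] += 1
    return {label: count / n_runs for label, count in counts.items()}


landscape = characterise_landscape()
print(landscape)
```

A sandbox that sampled too few runs, or only one region of starting states, would see a subset of the basins, which is the coverage concern raised elsewhere in the thread.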
Lorenzo Pacchiardi@LPacchiardi

How to operationalise this?

1️⃣ Trajectory elicitation sandboxes: controlled interactions, freezing learning and benchmarking at intervals to chart the evolving behaviour.

2️⃣ Predictive monitors: forecast future behaviour from current state + upcoming inputs.

21d · Views 15
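
The sandbox half of this proposal (controlled interactions, with learning frozen while benchmarking at intervals) can be sketched as below. `ToyLearner`, `benchmark`, and `run_sandbox` are hypothetical names, and the single-weight learner is an assumption standing in for a real CL system; the predictive monitor is not covered here.

```python
class ToyLearner:
    """Hypothetical stand-in for a CL system: one weight nudged by every interaction."""

    def __init__(self):
        self.weight = 0.0
        self.learning_enabled = True

    def interact(self, signal):
        # Every interaction updates the weight unless learning is frozen.
        if self.learning_enabled:
            self.weight += 0.1 * (signal - self.weight)
        return self.weight


def benchmark(learner, probes=(0.0, 0.5, 1.0)):
    """Mean absolute error of the learner's responses on a fixed probe set (toy metric)."""
    responses = [learner.interact(p) for p in probes]
    return sum(abs(r - p) for r, p in zip(responses, probes)) / len(probes)


def run_sandbox(learner, inputs, eval_every=10):
    """Feed controlled inputs; at intervals, freeze learning, benchmark, then resume."""
    trajectory = []
    for step, signal in enumerate(inputs, start=1):
        learner.interact(signal)
        if step % eval_every == 0:
            learner.learning_enabled = False  # freeze so probes cannot alter the state
            try:
                trajectory.append((step, benchmark(learner)))
            finally:
                learner.learning_enabled = True
    return trajectory


trajectory = run_sandbox(ToyLearner(), inputs=[1.0] * 50)
for step, error in trajectory:
    print(step, round(error, 3))
```

Freezing matters because the benchmark itself consists of interactions: without it, each evaluation would push the system along a different trajectory from the one being charted.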
Herbie Bradley@herbiebradley

@S_OhEigeartaigh @LPacchiardi great to see more work on this idea!

21d · Views 66 · Likes 2 · Bookmarks 0