In film, "we'll fix it in post" is what you say when something went wrong on set and you don't want to redo it. AI research has made it our entire methodology: train the model, then patch whatever comes out. Our new ICML oral argues this can't be the basis of a science of AI. 🧵
EleutherAI's Stella Biderman argues AI research must shift from post-training patching to studying training dynamics
ICML accepted the position paper as an oral presentation.
Many users praised the ICML paper's framing urging focus on training dynamics over post-training fixes, while some objected that it undervalues key insights from post-hoc work like scaling laws and RLHF.
@BlancheMinerva I love this framing :)

Read the full paper: https://arxiv.org/abs/2606.06533 or come listen to our oral @icmlconf!
Huge thanks to my co-authors @aflah02101 @niloofar_mire @linguist_cat @FazlBarez @nsaphra
Stay tuned for a related workshop (hopefully) at NeurIPS too!

Models are not static objects. They're snapshots of time-evolving processes shaped by data, objectives, architectures, and optimization. But most research treats them as fixed artifacts, analyzing behaviors after training instead of asking why they emerged.

We ground the discussion in the history and philosophy of science. What did it take for other fields to move from cataloging phenomena to predicting and controlling them? AI can learn from that playbook.

Part of why post hoc analysis dominates: it's the only thing most researchers CAN do. Almost no one releases intermediate checkpoints or training data. We built MultiBERT and Pythia to set a better standard, and it's been great to see work like OLMo and Marin follow our lead.

Post hoc analysis can certainly be useful, especially if you’re primarily concerned with the behavior of a specific deployed model. But looking at a static model will not tell you why the model developed a behavior. The real causal story must go back to the training process.

A common issue with position papers is that they leave the reader wondering “okay, but what should I actually do?” To address this, we provide open problems on a wide variety of topics throughout the paper to illustrate our perspective and guide future research.
Also, apologies to @learning_mech and the "There Will Be a Scientific Theory of Deep Learning" team for not engaging with the contents of your paper. I believe I learned about it on the same day we got the ICML acceptance notifications.
From what I've skimmed I think we're in agreement about a lot of things, but I'm excited to find time to read it closely :)

A test for progress: a science of AI should support progressively stronger forms of understanding:
1. Predict outcomes from early training signals
2. Intervene to correct trajectories on undesirable paths
3. Design training procedures that reliably produce desired properties

honestly the framing underrates how much we learned from fixing in post. scaling laws, grokking, induction heads, even rlhf all came from post-hoc analysis of already-trained models, not from planned instrumentation during training runs. you can only measure what you thought to look for, and the good surprises are usually the ones you didn't plan

@BlancheMinerva @icmlconf @Aflah02101 @niloofar_mire @linguist_cat @FazlBarez @nsaphra Going to share this with the team!👏

@guilhermeotina Yes, if we said that we would be very silly. But that's not what we're talking about. Scaling laws, grokking, and induction heads are some of the best examples of the kind of work we are advocating for.