Copying → morphology/translation → basic arithmetic → complex reasoning & math. Across every model family we tested, LLMs acquire skills in roughly the same order during pretraining.
Can we use this ordering to predict what a model will learn next, just from its internals? 🧵






