Marin releases Delphi scaling suite for pretraining predictions
Marin introduced its Delphi scaling suite to support reliable performance predictions during pretraining of open models. Researchers trained a series of small Dyna models, from 72 million to 6.9 billion parameters, under one fixed recipe and fitted a scaling law to the results. The law extrapolated accurately to a 25-billion-parameter model trained on 600 billion tokens at roughly 1e23 FLOPs, matching the observed Paloma macro loss within 0.2 percent and confirming the fit across more than two orders of magnitude of compute.
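As a sanity check on the compute figure, the standard C ≈ 6ND approximation gives 6 × 25e9 params × 600e9 tokens ≈ 9e22, i.e. roughly 1e23 FLOPs, consistent with the reported budget. The fitting procedure can also be sketched in a few lines: fit a saturating power law in compute to the small runs, then extrapolate. In the sketch below the loss values are made up for illustration, and the Chinchilla-style form L(C) = E + A * C^(-alpha) is an assumed parameterization, not necessarily the exact one Delphi uses.

```python
import numpy as np
from scipy.optimize import curve_fit

# Made-up small-run data for illustration: compute in FLOPs vs. observed loss.
# Delphi's real ladder spans ~72M to 6.9B params; none of these numbers are its.
C = np.array([1e19, 3e19, 1e20, 3e20, 1e21, 3e21])
L = np.array([3.10, 2.95, 2.81, 2.70, 2.60, 2.52])

def loss_law(log_c, E, log_a, alpha):
    # Chinchilla-style saturating power law: L(C) = E + A * C^(-alpha),
    # fitted in log-compute space for numerical stability.
    return E + np.exp(log_a - alpha * log_c)

(E, log_a, alpha), _ = curve_fit(loss_law, np.log(C), L, p0=[2.0, 5.0, 0.1])

# Extrapolate roughly 300x past the largest fitted run, to the 1e23 FLOP target.
pred = loss_law(np.log(1e23), E, log_a, alpha)
print(f"fit: E={E:.3f}, alpha={alpha:.3f}")
print(f"predicted loss at 1e23 FLOPs: {pred:.3f}")
```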
Really incredible work!
To train better open models, we need predictable scaling. Delphi is Marin’s first step: we pretrained many small models with one recipe, then extrapolated 300× to predict a 25B-param / 600B-token run with just 0.2% error. Getting there took some work 🧵
Marin’s Delphi scaling suite is out! With the right scaling recipe, small runs predicted a 1e23 FLOP run within 0.2%, extrapolating 300× past the largest run in the fit.
Delphi changes how we evaluate new ideas: start small, sweep the param/token tradeoff, scale the key hypers, compare against forecasts, repeat.
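For intuition on what "sweep the param/token tradeoff" means in practice, a minimal isoFLOP-style grid can be sketched as below: at each fixed compute budget, enumerate model sizes and derive the token count from C ≈ 6ND. The budgets, sizes, tokens-per-parameter cutoff, and the train_and_eval hook are all hypothetical, not Delphi's actual grid.

```python
# Hypothetical isoFLOP-style sweep (budgets, sizes, and cutoff are illustrative,
# not Delphi's actual grid). At fixed compute C, choosing N pins down D via
# C ~= 6 * N * D, so the sweep walks the param/token tradeoff directly.
budgets_flops = [1e19, 1e20, 1e21]
param_counts = [72e6, 150e6, 300e6, 600e6, 1.2e9]

for C in budgets_flops:
    for N in param_counts:
        D = C / (6 * N)  # tokens implied by the budget
        if D < 20 * N:   # heuristic: skip points with too few tokens per param
            continue
        print(f"C={C:.0e}: N={N:.1e} params, D={D:.2e} tokens "
              f"({D / N:.0f} tokens/param)")
        # train_and_eval(N, D)  # hypothetical hook: train the fixed recipe, log loss
```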
Also, Will is underselling the blog post. The interactive figures are excellent: they make the scaling intuition concrete, including what transfers and what breaks.
Worth reading https://openathena.ai/blog/delphi/
scaling laws are beautiful and this is the best resource to understand them
