9h ago

Arkil Patel posts paper “Forecasting Downstream Performance of LLMs With Proxy Metrics” showing cross-entropy loss correlates below 0.5 with downstream results while proposed proxy metrics exceed 0.8

Proxy metrics rank pretraining datasets with a fraction of the usual compute.

Original post

#967@LCHOSHENOP

Arkil Patel @ARKIL_PATEL

Excited to share our new paper! “Forecasting Downstream Performance of LLMs With Proxy Metrics” w/ my amazing advisors @sivareddyg, @mariusmosbach, @DBahdanau

Cross-entropy loss is a poor predictor of how models perform on downstream tasks (esp. reasoning). We propose something better: proxy metrics computed over expert reasoning traces.

🧵 Thread below 👇

6:44 AM · May 22, 2026

QUOTE POST

#370 Siva Reddy @SIVAREDDYG

Nature is complex. Why would cross-entropy loss predict scaling behavior of language models on downstream tasks? Introducing data-driven proxy metrics for scaling laws. Proxy metrics are incredibly useful, especially on tasks where models don't perform strongly yet.

Excellent work by @arkil_patel!

Arkil Patel @arkil_patel

1:44 PM · May 22, 2026 · 55.2K Views

9:43 PM · May 22, 2026 · 753 Views

QUOTE POST

#387 🇺🇦 Dzmitry Bahdanau @DBAHDANAU

To democratize AI, we need to help AI practitioners argue how investment can bring returns in the form of superior intelligence. Forecasting downstream performance is super important! Check out @arkil_patel's work:

Arkil Patel @arkil_patel

1:44 PM · May 22, 2026 · 55.2K Views

2:14 PM · May 22, 2026 · 3.2K Views
