Arkil Patel posts paper “Forecasting Downstream Performance of LLMs With Proxy Metrics” showing cross-entropy loss correlates below 0.5 with downstream results while proposed proxy metrics exceed 0.8

Proxy metrics rank pretraining datasets with a fraction of the usual compute.
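To make the correlation claim concrete, here is a minimal sketch of how one might check whether a cheap metric ranks pretraining datasets in the same order as their downstream accuracy. It is not the paper's method: the summary does not say which correlation coefficient is reported, so Spearman rank correlation is an assumption here, and the dataset scores are made-up illustrative numbers, not results from the paper. The helper names (`ranks`, `pearson`, `spearman`) are hypothetical.

```python
from statistics import mean


def ranks(values):
    """Return 1-based ranks, giving tied values the average of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    result = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j across a run of tied values.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            result[order[k]] = avg_rank
        i = j + 1
    return result


def pearson(xs, ys):
    """Pearson correlation of two equal-length sequences."""
    mx, my = mean(xs), mean(ys)
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    var_x = sum((x - mx) ** 2 for x in xs)
    var_y = sum((y - my) ** 2 for y in ys)
    return cov / (var_x * var_y) ** 0.5


def spearman(xs, ys):
    """Spearman rank correlation: Pearson correlation computed on ranks."""
    return pearson(ranks(xs), ranks(ys))


# Illustrative, made-up scores for five hypothetical pretraining datasets.
downstream_acc = [0.31, 0.42, 0.38, 0.55, 0.47]  # accuracy after full training
proxy_score = [0.20, 0.35, 0.30, 0.50, 0.33]     # cheap metric, higher is better
cross_entropy = [2.10, 2.30, 2.05, 2.20, 2.00]   # loss, lower is better

# Negate the loss so that "higher is better" holds for both metrics.
neg_loss = [-x for x in cross_entropy]

print("proxy vs downstream:", spearman(proxy_score, downstream_acc))
print("loss  vs downstream:", spearman(neg_loss, downstream_acc))
```

A metric whose rank correlation with downstream accuracy is close to 1 would let a practitioner choose among candidate datasets without training a full model on each one, which is the use case the summary describes.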

Original post

Excited to share our new paper! “Forecasting Downstream Performance of LLMs With Proxy Metrics” w/ my amazing advisors @sivareddyg, @mariusmosbach, @DBahdanau. Cross-entropy loss is a poor predictor of how models perform on downstream tasks (esp. reasoning). We propose something better: proxy metrics computed over expert reasoning traces. 🧵 Thread below 👇

6:44 AM · May 22, 2026

Nature is complex. Why would cross-entropy loss predict scaling behavior of language models on downstream tasks? Introducing data-driven proxy metrics for scaling laws. Proxy metrics are incredibly useful, especially on tasks where models don't perform strongly yet.

Excellent work by @arkil_patel!

Arkil Patel @arkil_patel

1:44 PM · May 22, 2026 · 55.2K Views
9:43 PM · May 22, 2026 · 753 Views

To democratize AI, we need to help AI practitioners argue how investment can bring returns in the form of superior intelligence. Forecasting downstream performance is super important! Check out @arkil_patel's work:

Arkil Patel @arkil_patel

1:44 PM · May 22, 2026 · 55.2K Views
2:14 PM · May 22, 2026 · 3.2K Views