Siddharth Joshi @sjoshi804
data curation really is the gift that keeps on giving
inference costs too high? curate your data
eval scores plateaued? curate your data
pre-training compute budget blown? curate your data
gym gains stalled? curate your data
Matthew Leavitt @leavittron
What if you could induce models to be more concise via pretraining data curation?
2:45 PM · Jun 25, 2026