Kilian (@KLieret) on why the initial 0% top scores on ProgramBench also surprised us
8:03 AM · Jun 1, 2026 · 3.1K Views
We're excited by ProgramBench not just because we think agents should be able to autonomously program well-specified full programs, but also because of the higher-level abilities that we think success on our benchmark necessitates: