ProgramBench Team Explains Initial 0% AI Model Scores on Benchmark

Original post
Ofir Press (@OfirPress)

Kilian (@KLieret) on why the initial 0% top scores on ProgramBench also surprised us

8:03 AM · Jun 1, 2026 · 3.1K Views
Posts from X
Ofir Press (@OfirPress)

We're excited by ProgramBench not just because we think agents should be able to autonomously program well-specified full programs, but also because of the higher-level abilities that we think success on our benchmark necessitates:
