
Researchers Launch ProgramBench For Flexible Whole-Repository Code Generation

Original post

ProgramBench is the first whole-repository-generation benchmark that also allows agents to pick *which* language they're going to use and *how* they're going to implement the given program. w/ @jyangballin @KLieret @18jeffreyma

2:41 PM · Jun 5, 2026 · 2.4K Views

Sentiment

Positive commenters single out the agents' freedom to choose their own language as the most interesting part of ProgramBench, arguing that implementation freedom reveals more about an agent's capabilities.

Positive: 100.0%

Negative: 0.0%

1 comment with sentiment.

Cluster Engagement

Posts from X

Most Activity

Views: 2.4K · Bookmarks: 1 · Likes: 1

Ofir Press (@OfirPress)

@jyangballin @KLieret @18jeffreyma Full ProgramBench Q&A: https://youtube.com/watch?v=blxN5jYWe8U Benchmark at https://programbench.com


Jahanzaib Ahmed (@jahanzaibai)

@OfirPress @jyangballin @KLieret @18jeffreyma Letting agents pick the language is the interesting part. Implementation freedom probably reveals more about the agent's reasoning architecture than any fixed-language benchmark ever could.
