
SWE-bench co-creator Ofir Press shares data showing AI agents scale from 34% to 75% accuracy over 1,000 steps

Performance gains scale logarithmically over extended step sequences.

Original post
Ofir Press (@OfirPress), #72 in AI

it's *SWE-bench* guys, please and thank you

1:09 PM · Jun 2, 2026 · 974 Views
Posts from X
Views 429 · Bookmarks 1 · Likes 3 · Replies 1

@OfirPress 🥴 letting the team know!

Ofir Press@OfirPress

it's *SWE-bench* guys, please and thank you

7h · Views 429 · Likes 3 · Bookmarks 1