
Query Seeks Benchmarks for Agentic Debugging

Original post
Sarah Catanzaro (#733, @SARAHCAT21) · OP · Richard Artoul (@RICHARDARTOUL)

is there a good benchmark for agentic debugging? seems like that would be super easy to build

12:03 PM · May 28, 2026
