Building AI Evals Grows Harder as Agents Increase in Complexity

Original post

turns out that building evals is super super challenging even now. i thought a lot of it was table stakes but turns out it has only become harder since agents are now more complex than ever! going to start tweeting more about how i design evals, especially to create autonomous improvement loops!

3:11 PM · May 15, 2026