// Agents' Last Exam //
Agents' Last Exam is a living benchmark of over 1,000 economically valuable tasks, built with 250+ industry experts and mapped to the U.S. federal occupational taxonomy.
The hardest tier sits at a 2.6% average full pass rate across mainstream harnesses and backbones.
ALE behaves like a GDP-coverage instrument instead of another test that saturates in a month.
Paper: https://arxiv.org/abs/2606.05405
Learn to build effective AI agents in our academy: https://academy.dair.ai/





