A crazy jump. The price of the tokens will be worth it to a vast number of enterprises.
Claude Fable 5 takes #1 on APEX-SWE: 65.5% Pass@1 overall. It scores ~18pp higher than Opus 4.8.
We tested @claudeai Fable 5 on APEX-SWE, which measures whether AI models can do real software engineering work.
Fable 5 tops our two APEX-SWE categories:
- Integration: 61.3%
- Observability: 69.7%
The standout is Observability at 69.7%, 26pp ahead of Claude Opus 4.8. It is the first model to clear 50% on the category, and the only one that scores higher on Observability than on Integration. Every other model shows the reverse.
Observability has been the bottleneck for every model we have measured. Fable 5 is the first to break it.
Congrats to the @AnthropicAI team.