Day 0: Anthropic Fable 5 on ParseBench. We tested the model's advances in document understanding. It clearly stands out on adherence to the original text:
📃 Content faithfulness: 90.02% vs 86.19% (Gemini 3 Flash) and 86.81% (GPT-5.5)
🔢 Semantic formatting: 72.62% vs 58.35% and 60.12%, a 12+ point lead
These are two of the most important metrics for SOTA document understanding: does the output preserve what the document actually says, and does it preserve formatting that carries meaning?
But ... it's not a sweep. There is still a lot of alpha in unlocking document understanding for frontier models.
Full results below 👇



