Fable 5 scores 81.9% on SimpleBench, the highest score so far, almost reaching the human baseline.
Many users distrust the SimpleBench leaderboard for Claude Fable 5 because Gemini's high scores make the benchmark seem unreliable or flawed, while others believe the reported score is credible.

@JasonBotterill I knew it wouldn’t be too high but def higher than the rest

@JasonBotterill HAHAHAHAHA I SEE IT

AI Explained guy says the human baseline is 83.7%. Finally almost topped the bench after years

@47fucb4r8c69323 i always think he is wearing a kippah when i look at his profile picture

@JasonBotterill underrated benchmark btw

@JasonBotterill Source? There's no new AI explained video and the SimpleBench website has not been updated.

@JasonBotterill I'm not gonna lie I failed the bald dude shaving one and I felt pretty bad

@JasonBotterill Any bench that has Gemini 3.1 Pro beating GPT-5.5 is a bench I can't trust.

@JasonBotterill 2% over gemini 3.1 for a model that will purposefully mislead your research and lie to you, no thanks I'll stick with 3.1

@JasonBotterill unlike gemini i actually believe that score is about right

@JasonBotterill sure gemini 3.1 pro benches better (but is just worse all round). these benches feel more like stools

@JasonBotterill I wish it was a more lenient model.
I'm sure that if I were to recreate that...I'd get

The average human score was 83.7%, and the highest-scoring of the 9 humans tested got 95.4%. (The 95.4% figure is inconsistent with section 4.2 of the SimpleBench report, which states each participant answered 25 questions.) So the LLMs are just below average human performance but far below the best humans.
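
The inconsistency noted in that parenthetical comes down to simple arithmetic, and a rough sketch makes it concrete. It assumes each question is marked simply right or wrong and that a participant's score is the plain fraction of 25 questions answered correctly; the thread does not say how the 95.4% figure was actually computed, so that is an assumption. The function names are made up for illustration.

```python
def reachable_scores(num_questions: int) -> list[float]:
    """All percentage scores possible when each question is right or wrong."""
    return [100 * k / num_questions for k in range(num_questions + 1)]


def is_reachable(score_pct: float, num_questions: int, tol: float = 0.05) -> bool:
    """True if score_pct matches some k/num_questions to one decimal place."""
    return any(abs(s - score_pct) <= tol for s in reachable_scores(num_questions))


def nearest_reachable(score_pct: float, num_questions: int) -> float:
    """Closest score that a whole number of correct answers can produce."""
    return min(reachable_scores(num_questions), key=lambda s: abs(s - score_pct))


if __name__ == "__main__":
    # Assumption: 25 questions per participant, one point each.
    print(is_reachable(95.4, 25))
    print(nearest_reachable(95.4, 25))
```

Under those assumptions a single participant's score moves in steps of 4 points (92%, 96%, 100%), so 95.4% would need 23.85 correct answers. If the figure was instead averaged over several runs or weighted differently, the check does not apply.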

@JasonBotterill You wanna verify if it was actually fable or fallback 4.8

@JasonBotterill @scaling01 GEMINI in second place 😂😂😂

@JasonBotterill why the f gemini is there

@47fucb4r8c69323 no one fucking believes me thank you😭😭😭

@JasonBotterill @R2Cdev_ i mean only 2 points above 3.1 Pro tho