Tech · 6h ago

Claude Fable 5 Tops SimpleBench Leaderboard With 81.9% Score

Original post · Lisan al Gaib#1064
JB@JasonBotterill

Fable 5 scores 81.9% on SimpleBench the highest score almost reaching human baseline.

6:23 AM · Jun 10, 2026 · 11.9K Views
Sentiment

Positive users consider Claude Fable 5's 81.9% SimpleBench score credible, in contrast to Gemini's, while negative users dismiss the benchmark as untrustworthy because Gemini ranks high and Fable 5's lead is small.

Positive: 37.5%
Negative: 62.5%
8 comments with sentiment.
Cluster Engagement
Posts from X
Most Activity
Chris@ChrissGPT

@JasonBotterill I knew it wouldn’t be too high but def higher than the rest

5h · Views 635 · Likes 5
JB@JasonBotterill

AI Explained guy says the human baseline is 83.7% finally almost topped the bench after years

6h · Views 126 · Likes 2
JB@JasonBotterill

@47fucb4r8c69323 i always think he is wearing a kippah when i look at his profile picture

6h · Views 48 · Likes 2
Vlad G.@vladg_tw

@JasonBotterill Source? There's no new AI explained video and the SimpleBench website has not been updated.

5h · Views 227 · Likes 4
Fedesco@Fedesco5

@JasonBotterill Any bench that has Gemini 3.1 Pro beating GPT-5.5 is a bench I can't trust.

6h · Views 259 · Likes 1
Karl 📚🧮@karlbooklover

@JasonBotterill 2% over gemini 3.1 for a model that will purposefully mislead your research and lie to you, no thanks I'll stick with 3.1

4h · Views 104 · Likes 1
ρ:ɡeσn@pigeon__s

@JasonBotterill unlike gemini i actually believe that score is about right

4h · Views 159
checked_out@checkfoc_us

@JasonBotterill sure gemini 3.1 pro benches better (but is just worse all round). these benches feel more like stools

6h · Views 132
Qwub@Qwubos

@JasonBotterill I wish it was a more lenient model.

I'm sure that if I were to recreate that...I'd get

5h · Views 121
Delta Vee@deltaVee42

The average human score was 83.7% and the highest-scoring of the 9 humans tested got 95.4%. (The 95.4% figure is inconsistent with section 4.2 of the simplebench report, which states each participant answered 25 questions.) So the LLMs are just below average human performance but far below the best humans.

4h · Views 71
zuphr1n@zuphr1n

@JasonBotterill You wanna verify if it was actually fable or fallback 4.8

5h · Views 60
cjekrjgrw@rzastyyy

@JasonBotterill @scaling01 GEMINI in second place 😂😂😂

5h · Views 58
JB@JasonBotterill

@47fucb4r8c69323 no one fucking believes me thank you😭😭😭

6h · Views 11 · Likes 1
Hamza@thegenioo

@JasonBotterill @R2Cdev_ i mean only 2 points above 3.1 Pro tho

4h · Views 16