/Tech · 1d ago

Claude Fable-5 scores 20.0 on Eyebench-V3 vision benchmark, tying Qwen3.5-Flash and barely beating Claude Opus 4.7

Kalomaze speculated that a frozen vision encoder could be behind the results

#440

Original post

Teortaxes▶️ (DeepSeek 推特🐋铁粉 2023 – ∞)@teortaxesTex

in vision, Claude Fable is on par with an *old 3B active Qwen* (Qwen-Flash is basically just hosted Qwen3.5-35B-A3B) that's all you get as a spillover from general scale

adi@adonis_singh

added claude-fable-5 to eyebench-v3!

fable is a 4% improvement over opus 4.7, overall not that impressive. anthropic is fully agentmaxxing

4:07 PM · Jun 9, 2026 · 10.2K Views

Sentiment

Commenters are frustrated that Claude Fable-5 only ties a competitor on the Eyebench-V3 vision benchmark. They complain that Anthropic is not trying hard enough on vision, and some say they prefer other models for vision tasks.

Pos: 0.0% · Neg: 100.0%

3 comments with sentiment.

Cluster Engagement

Posts from X

Most Activity

Views 1.4K · Bookmarks 2 · Likes 23 · Replies 2

Teortaxes▶️ (DeepSeek 推特🐋铁粉 2023 – ∞)@teortaxesTex

yeah just one benchmark, I'm exaggerating but this is directionally true. They're not even trying

kalomaze@kalomaze

@teortaxesTex God is it a frozen vision encoder or something GOD why is Google mogging them so hard on this

kalomaze@kalomaze

@teortaxesTex i trust gemma4 26b more for vision than sonnets

Offset Zero@offsetx0

@teortaxesTex It is better in spatial reasoning. Vision is not their main property as of now, but I think this benchmark is heavily concentrated on a very narrow AI blindspot, which is more of a vision encoder benchmark than the model.
