Tech · 1d ago

Claude Fable-5 scores 20.0 on Eyebench-V3 vision benchmark, tying Qwen3.5-Flash and barely beating Claude Opus 4.7

Kalomaze speculated that a frozen vision encoder might explain the results

Original post

in vision, Claude Fable is on par with an *old 3B active Qwen* (Qwen-Flash is basically just hosted Qwen3.5-35B-A3B) that's all you get as a spillover from general scale

adi @adonis_singh

added claude-fable-5 to eyebench-v3!

fable is a 4% improvement over opus 4.7, overall not that impressive. anthropic is fully agentmaxxing

4:07 PM · Jun 9, 2026 · 10.2K Views
Sentiment

Users are frustrated that Claude Fable-5 only ties Qwen3.5-Flash on the Eyebench-V3 vision benchmark. They complain that Anthropic is not investing enough in vision, and some say they prefer other models for vision tasks.

Pos: 0.0%
Neg: 100.0%
3 comments with sentiment.
Cluster Engagement
Posts from X
Most Activity
kalomaze @kalomaze

@teortaxesTex God is it a frozen vision encoder or something GOD why is Google mogging them so hard on this

1d · Views 1.3K · Likes 23 · Bookmarks 0
kalomaze @kalomaze

@teortaxesTex i trust gemma4 26b more for vision than sonnets

1d · Views 426 · Likes 11 · Bookmarks 1
Offset Zero @offsetx0

@teortaxesTex It is better in spatial reasoning. Vision is not their main property as of now, but I think this benchmark is heavily concentrated on a very narrow AI blindspot, which is more of a vision encoder benchmark than the model.

1d · Views 8