AI · 3h ago

Claude Fable-5 scores 20.0 on Eyebench-V3 vision benchmark, tying Qwen3.5-Flash and barely beating Claude Opus 4.7

Kalomaze speculated that a frozen vision encoder could explain the results

Original post

in vision, Claude Fable is on par with an *old 3B active Qwen* (Qwen-Flash is basically just hosted Qwen3.5-35B-A3B) that's all you get as a spillover from general scale

4:07 PM · Jun 9, 2026 · 4.3K Views
Sentiment

Users are frustrated that Claude Fable-5 only ties a competitor on the Eyebench-V3 vision benchmark. They complain that Anthropic is not trying hard enough on vision, and some say they prefer other models for vision tasks.

Pos: 0.0%
Neg: 100.0%
3 comments with sentiment.
Cluster Engagement
Posts from X
Most Activity
Likes: 14
kalomaze (@kalomaze)

@teortaxesTex God is it a frozen vision encoder or something GOD why is Google mogging them so hard on this

3h · Views 523 · Likes 14 · Bookmarks 0
kalomaze (@kalomaze)

@teortaxesTex i trust gemma4 26b more for vision than sonnets

3h · Views 293 · Likes 9 · Bookmarks 1
Offset Zero (@offsetx0)

@teortaxesTex It is better in spatial reasoning. Vision is not their main property as of now, but I think this benchmark is heavily concentrated on a very narrow AI blindspot, which is more of a vision encoder benchmark than the model.

3h · Views 8