In vision, Claude Fable is on par with an *old 3B active Qwen* (Qwen-Flash is basically just hosted Qwen3.5-35B-A3B). That's all you get as a spillover from general scale.
The score beats Claude Opus 4.7 by four points.
Many users dismissed Claude Fable-5's tie on the Eyebench-V3 vision benchmark as unimpressive, criticizing its vision encoder and expressing frustration that Google and Gemma outperform it.
yeah just one benchmark, I'm exaggerating but this is directionally true. They're not even trying
@teortaxesTex God is it a frozen vision encoder or something GOD why is Google mogging them so hard on this
@teortaxesTex i trust gemma4 26b more for vision than sonnets
@teortaxesTex It is better at spatial reasoning. Vision is not their main priority as of now, but I think this benchmark concentrates heavily on one very narrow AI blind spot, so it tests the vision encoder more than the model.