This is exactly why I am so insane about transparency. It just tells you it can't optimize it further, and then all of a sudden it can
I gave Fable 5 one job: write custom WebGPU kernels for Gemma 4 inference.
It climbed to 84 tok/s, then hit a wall, insisting further optimization was impossible.
Hours later, Anthropic rolled back invisible LLM development safeguards, and it hit 255 tok/s.
The next day, access to Fable 5 was suspended globally.













