In 2025, Flash-sized models used to be so good it never made sense to use Pro on everyday queries. Bigger models were designed to attain the frontier and distill into the actual workhorse model. The bigger model is relegated to a tail of niche cases.
These days, small models are being priced out. Frontier performance has a huge impact on real-world (coding) productivity. And it's hard to make a case for Flash/Sonnet models when we have big open-weight models that perform better and are priced cheaper.
The world is shifting to biggest closed-source models + biggest open-weight models. The cheeky takeaway is that the distillation student moved from a small closed-source model to a big open-weight model.