Reasoning Efficiency Advances In Latest AI Models Like GPT-5.5
I was just saying: nowadays "higher token budgets feel worth it".
I think this is a change that happened very recently.

My definition of model intelligence has been very clear over the past two years. For me, the sign of an intelligent model was always good results with as few resources as possible, which is why I was a big fan of Sonnet 3.5/3.6 and the Opus models. These models would just get things and one-shot problems "without thinking". On the other hand, I really disliked reasoning models from o1-preview up until o3, because it just wasn't worth it back then and felt like inelegant brute-force slop: you would get slightly better results for 10x the cost. Later, from GPT-5 up to GPT-5.2, reasoning budgets exploded from a few thousand tokens to 50-100k tokens.

Since then reasoning efficiency has only improved, and we are now living in a world where GPT-5.5 and Mythos get insane results with very low reasoning budgets, and where higher token budgets feel worth it. I think part of this is also that models nowadays know how much reasoning to spend on each problem, so when you set reasoning effort to xhigh, the model doesn't think for 100k tokens on a very easy problem just for the sake of the xhigh setting.

(Personally, I still use a medium thinking budget about 90% of the time and only go up to xhigh when the task has a high enough skill ceiling. Using xhigh for everything is overkill.)
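
If you wanted to write that habit down, it might look roughly like the sketch below. Everything in it is made up for illustration: the function name, the 0-to-1 "skill ceiling" score, and the 0.8 cutoff are my own assumptions, not part of any provider's API. Only the "medium" and "xhigh" labels come from the settings mentioned above.

```python
# Hypothetical helper: the 0-to-1 "skill ceiling" score and the 0.8 cutoff are
# illustrative assumptions, not values taken from any provider's API.
def pick_reasoning_effort(skill_ceiling: float, cutoff: float = 0.8) -> str:
    """Return "xhigh" only for tasks with a high skill ceiling, else "medium"."""
    if not 0.0 <= skill_ceiling <= 1.0:
        raise ValueError("skill_ceiling must be between 0 and 1")
    return "xhigh" if skill_ceiling >= cutoff else "medium"


# Example tasks with hand-assigned (made-up) skill-ceiling scores.
tasks = {"rename a variable": 0.1, "design a distributed cache": 0.9}
choices = {name: pick_reasoning_effort(score) for name, score in tasks.items()}
```

The point is just that medium is the default and xhigh is the exception you opt into per task, rather than a global setting.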