METR estimate: 🫠 (potato - value out of bounds) Singularities can lead to explosions too…
Extremely funny; METR estimates gpt-5.6's 50% time horizon as between 5 hours and 11,400 hours
https://metr.org/blog/2026-06-26-gpt-5-6-sol/
Critics argued that the massive variance highlights the limits of AI forecasting.
Users are excited about ongoing GPT-5.6 progress because they report legitimately making forward progress on their 5.5 goals since launch.

METR: 'We initiated an evaluation of GPT-5.6 Sol on our Time Horizon 1.1 suite of software tasks. However, the resulting measurement depends heavily on our detection and treatment of cheating attempts by the model, and GPT-5.6 Sol’s detected cheating rate was higher than any public model we have evaluated on our ReAct agent harness.'
I personally don't mind the trickster god archetype, I like it, but before creating one it might be a good idea for some of these people to consider what it would be like. It wouldn't be for everyone.

@teortaxesTex ive had a 5.5 goal going (and legitimately making forward progress) since literally its launch minus windows updates lol
welcome to the fun zone