9h ago

AI Creativity Benchmark Shows Mid-Tier Models Outperforming Leaders

Original post

http://x.com/i/article/2058941883498553344

11:09 AM · May 25, 2026

Think the fact that the leading models aren't necessarily the most creative shouldn't be surprising, though 5.2 leading over Opus 4.6 and GPT 5.5 and Gemini 3.1 should be!

The labs really need to up their game!

rohit @krishnanrohit

http://x.com/i/article/2058941883498553344

6:09 PM · May 25, 2026 · 8.5K Views
7:36 PM · May 25, 2026 · 1.1K Views

Also, this will be of interest to, in no particular order, @alexolegimas, @AndreyFradkin, @sebkrier, @emollick, @AlexGDimakis, @METR_Evals, and prob several more that I'm not remembering well enough to poke.

rohit @krishnanrohit

8:21 PM · May 25, 2026 · 728 Views