1d ago

Y Combinator's Ankit Gupta says AI companies selectively avoid competitor benchmarks to obscure direct model performance comparisons

Opus 4.8 and GPT 5.5 shared only one overlapping benchmark.

Original post

entertaining that model launch posts seem to avoid using the same benchmarks as their competitor's last model release. Here are the models eval'd in today's Opus 4.8 and 4 weeks ago's GPT 5.5

10:05 AM · May 28, 2026

average slop account posting a "comparison" with 15 benchmarks

but only 1 benchmark is directly comparable

the other 14 are just there without anything to compare them to

Ankit Gupta @agupta


5:05 PM · May 28, 2026 · 43.1K Views
8:16 PM · May 28, 2026 · 6.2K Views