Y Combinator's Ankit Gupta says AI companies selectively avoid competitor benchmarks to obscure direct model performance comparisons
The Opus 4.8 and GPT 5.5 launch posts had only one benchmark in common.
QUOTE POST
#980 Lisan al Gaib @SCALING01
average slop account posting a "comparison" with 15 benchmarks
but only 1 benchmark is directly comparable
the other 14 are just there without anything to compare them to
entertaining that model launch posts seem to avoid using the same benchmarks as their competitor's last model release. Here are the models eval'd in today's Opus 4.8 and 4 weeks ago's GPT 5.5
5:05 PM · May 28, 2026 · 43.1K Views
8:16 PM · May 28, 2026 · 6.2K Views