OpenAI Codex lead Thibault Sottiaux asks whether developers still trust traditional benchmarks over peer recommendations.
Creator Alex Volkov says builder reputation outweighs standardized metrics.
In 16 comments, users express distrust of AI benchmarks as unreliable or deceptive; some favor Codex or GPT-5.5 based on personal testing, while others criticize Opus as slow and expensive.