Philo Groves highlights CyberGym benchmark data showing OpenAI's GPT-5.5 scored 81% while default safety filters limited Claude to 0.9%
Strict safety alignment renders Claude largely ineffective for cyber tasks
I respect that they are so committed to the bit. Mythos is the God of Cyberwar, but you, chud, can't be trusted with it (not yet, at least. Maybe after the Colossus deployment). And Opus will be the babby of cyberwar. They are willing to lose some customers to OAI here.
hahaha 0.9% on cybergym with safeguards enabled (default), if you are working in cyber and using claude, anthropic just gave you the finger. GPT 5.5 scores 81% and growing.
5:13 PM · May 28, 2026 · 45.3K Views
9:41 PM · May 28, 2026 · 3.4K Views