I mean I agree, technically GPT 5.5 "won".
But it also destroyed the file. So, does that really count?
You may argue: "it wasn't in the prompt, your rules suck".
But if I wrote: → "keep the file size small"
It would just compress it, making it unreadable.
If I wrote: → "keep it small but do NOT minify it"
It would remove the documentation.
If I wrote: → "keep it small, do NOT minify, do NOT remove docs"
It would make variable names shorter.
And at this point I'm chasing a way to express "don't destroy the fucking file", but that concept is surprisingly hard to define. "Keep it readable" is ambiguous. GPT can literally read base64, so, that doesn't help either. "Keep the code clean, pretty" is subjective, it just ignores these. Add a linter and it WILL find a way to ruin the file that the linter doesn't catch.
That's the thing, it feels like no matter how many rules I add, GPT will find a way to succeed while also destroying the file, because it doesn't grasp the concept of permanence that a real project demands. It literally doesn't know what it means to maintain a file in the long term, because it wasn't trained on that. It just sees my code as a bunch of bytes, plus a goal that it must reach, no matter what.
So, how do you make it do useful work?
If I have to spell 500 rules in order to make it not destroy my codebase, then, at which point writing all the rules consumes more time than just DOING the work myself?
That's the problem.
With Opus, I didn't have to spell a single rule. I just asked it to make the file faster, and it made it faster, without destroying it. That's real useful work, a net positive, which GPT rarely gives me. This is hard to put in words, but I swear it happens to me at least











