DHH says GPT-5.5 shows substantial gains on complicated agent work, reversing version 5.2's lag behind Opus and marking a recovery for OpenAI amid rising competition.
Jason Liu quote-tweeted the post and Adam.GPT reposted it.
Supportive users praise GPT-5.5 for strong gains over Opus on complex agent tasks, while critical users question whether the results are reproducible or call the claims overhyped.
GPT-5.5 is a very good model
For complicated agent work, it's amazing how much GPT5.5 has improved. I found 5.2 to be very far behind Opus. Now using Opus 4.7 after 5.5 feels like a big step backwards. Gotta love this level of competition! Strong comeback for OpenAI.
Fact check: true

The Omarchy 4 branch is now 30,000 lines of new code. The majority of it was written by GPT5.5. It's been so, so good at QML. You still need to review, but there's just no way this scale of a conversion would be feasible without AI in a reasonable time. https://github.com/basecamp/omarchy/pull/5856
GPT-5.5 cranking out 30k lines of QML for the Omarchy 4 branch + nailing subtle agentic reasoning!!
thnx

But what impresses me just as much is how good it is at explaining MY OWN CODE to me when working on Basecamp! Especially delicate JavaScript interactions with lots of subtle nuances. Real glimpses of AGI there.
It's amazing how many people I end up having similar opinions to simply by being in the arena with an open mind
Fact check: true

@dhh You can also increase the max depth and max threads to have more sub-agents and define them by reasoning effort. Codex is very cool.

@gdb OpenAI Korea B2C "ME"

@dhh Is 4.0 on Dev or Edge yet?

@jzetterman Neither. Still in wild flux. Will probably come to dev in a few weeks.

@dhh Agreed. 5.5 is a noticeable advance over anything previous.

@dhh yes

@dhh I don’t know where it goes from here, but if 5.5 Codex is the last AI model to ever be released, I’d be fine with that. It’s amazing.

@gdb can you call the next model Goblin?

@dhh opus 4.7 doesn't exist yet, maybe double check which models you're actually comparing

@dhh I need to test GPT5.5 on my Ruby-like language made with Rust. Right now using DeepSeek / Opus 4.7 & MiniMax ...

@theramblingfool @dhh Ohh maybe I need to use the codex app instead of OpenCode with GPT 5.5 and the web based thing.

30,000 lines of agent-generated QML in one branch. the model quality debate is interesting but the real story here is that the review bottleneck is now the only bottleneck. at what point does "you still need to review" become physically impossible at this scale? nobody is reviewing 30,000 lines meaningfully.

@gdb this...