Academic Lianhui Qin argues a SimWorld 3D city test shows frontier coding agents still struggle with spatial reasoning
The Gemini output placed a giant building in a street.
I’m not sure Gemini 3 looks that much more impressive here.🤔
For example, why is there a giant White House–like building just sitting in the middle of the street?
This feels like a real example of how even frontier coding agents can still struggle with spatial reasoning.

Missed Gemini 3 yesterday, but catching up now. This genuinely looks impressive!