OpenAI Codex engineering lead Thibault Sottiaux asks developers for tasks the AI coding model still cannot perform
Story Overview
Thibault Sottiaux, engineering lead for OpenAI's Codex agentic coding system, posted on X asking developers to name capabilities that still feel surprisingly out of reach despite years of progress. The post drew over 570 replies, which converge on a handful of recurring friction points.
Context limits force manual resets
Replies repeatedly call out the model's habit of telling users to start a new chat or clear history once the window fills, rather than managing compaction or long-context tasks automatically.
Benchmark shortfalls linger in key workflows
Developers note Codex should already outperform tools like Fable in targeted areas yet continues to miss the mark on quality and orchestration tasks that were expected to improve sooner.
Many users criticized Codex for poor UI design, weak frontend capabilities, and a flawed review process, and some said they will not renew their subscriptions.
Most Activity
@thsottiaux be better than fable
What is something that you feel is surprising that Codex still can't do well and we should have gotten right a while ago?

@geluhorotan @thsottiaux Try this : https://github.com/Pythoughts-labs/css-pro-tips.git. you will thank me later
@thsottiaux push to git reliably performance

@thsottiaux I don’t think this is really a Codex problem. I think it is more of a model-level problem.
My guess is that, during GPT’s model training, there is a large amount of highly theoretical STEM training data, and that data probably carries a relatively high weight. The result is some

@thsottiaux Long-context tasks. GIVE US 1M IN CODEX!!!

@thsottiaux Does having a working version on Linux count? A bit cheeky.. but, please?
@thsottiaux Deep research.

@thsottiaux I would love if it could automatically teach itself skills and behaviours like Hermes agent does.
It would massively save on time since I don't have to teach it anything myself. It can teach itself.

@thsottiaux Lack of a personality and a need of handholding to one shot things. Fable just gets you instantly and you can be comfortable that it’ll do a great job with no further guidance needed

In game development, codex is really bad at reviewing visual results of things he is doing, like for example, building a cockpit in my mech game, it just sees the screenshot, and reads it like "theres panels, ok so it worked" it should be able to analyse the scene deeply, position, angles etc... thats the secret for AGI, its visual understanding, soon as ai understands visually the output of what it is doing, then it can correct it.

@thsottiaux Currently, a Codex project can only contain a single folder. Codex should support multi-folder writing.

@thsottiaux I didn’t get to try 5.6 yet, but as of 5.5, codex cannot in any realm write human-friendly documentation. It leaks its instructions and inner monologue straight into readme files and frontend widgets. It’s terrible

@thsottiaux File preview for things on codex mobile. why is it so different on ChatGPT ? ChatGPT can output perfect file previews in iOS but majority file types in codex mobile don’t open Properly. ( iOS 27 dev beta 2 don’t know if that matters )

@thsottiaux Optimise the "Usage remaining" Ui/Ux experience.
Take it from two clicks down to one click.
Have it update more frequently, even live if you can.

@thsottiaux @grok what % of these comments say design on this post and every other similar request by @thsottiaux

@thsottiaux I want to write a little different, but I think you'll like it, correcting the post for the third time 👇
@thsottiaux Thinking the last steer I sent during a goal is still new and urgent 8 hours later during a long /goal - specifically after every compaction. I have to add a dummy message like “Good I’m glad you did that” afterward

@thsottiaux File editing and search. Please let me edit files, I don't want to open an IDE.

@thsottiaux Disagree with you if you’re wrong! I love using codex/chatgpt for brainstorming but it’s hard when it just agrees with me

@thsottiaux I feel like it struggles when generating images. ChatGPT seems to generate more accurate higher quality images compared to when I ask codex to do it. Not sure why