Tech · 4h ago

Ghostty creator Mitchell Hashimoto argues that AI coding agents fail when prioritizing functional code over software structure

AI struggles to design human experiences like APIs.


Original post

Mitchell Hashimoto@mitchellh

The problem with the "if it works who cares what the code looks like" mindset for agentic work is that it assumes the agent has a perfect understanding of "works." Realistically, things are underspecified, agents make bad assumptions, etc.

To be fair, agents are pretty good at unit test coverage. They're pretty bad at designing human experiences (API, CLI flags, etc.), especially cohesive ones for future roadmap plans they may not have visibility into (unless your backlog is perfect and vision fully laid out, which I doubt). They're bad at knowing where performance matters and what type (CPU vs memory tradeoffs). They're bad at where compatibility matters and where it doesn't (and tend to err on the side of preserving it without further guidance). Etc.

Unless you have this ALL specified, you can't possibly claim "it works" without taking a look and thinking about it.

4:00 PM · Jun 15, 2026 · 46K Views

Sentiment

Many users criticized AI agents for struggling with code design and broader context beyond tests, warning this produces unmaintainable bloated code, security risks, and declining programming standards.

Positive: 27.3% · Negative: 72.7%

12 comments with sentiment.


Posts from X


Joseph Suarez 🐡@jsuarez

@mitchellh if only we had a word for a precise spec of exactly what should happen!

4h

teej dv 🔭@teej_dv

@mitchellh Am I also crazy to think things like cost of editing code base (like literally the dollar amount) can balloon with bad code, too many lines, inflated context, millions of searches, etc?

4h

Armin Ronacher ⇌@mitsuhiko

@mitchellh This so very much. Unless the code is truly intended to be short lived (throwaway) we're just not there yet.

4h

Gary Bernhardt@garybernhardt

@mitchellh 👏

4h

David Petrou@dpetrou

@mitchellh Do you not think the line will move?

3h

Rosalia@latereigns

@mitchellh The redpill is that you get 95% of these benefits from simply having proper tooling/using static analysis/performance profiling, and python scripts that just find the stuff this babbles about https://arxiv.org/pdf/2603.24755

2h

Mitchell Hashimoto@mitchellh

@latereigns That just verifies that things don’t have certain classes of bugs, which is a small part of what defines “works”

2h

*nilpointer@Dastagi39923618

@mitchellh another banger. ive been very much relating with mitchell lately

3h

Mitchell Hashimoto@mitchellh

@garybernhardt Check your dms Gary!!!

4h

Mitchell Hashimoto@mitchellh

@dpetrou It’ll move. Just the status quo

3h

Taimur Ayaz@taimurayaz

@mitchellh Most problems I encounter with agents boil down to context issues. In a human org, the same issues arise but are often fixed as a result of team rituals.

4h

Gustavo Valverde@GustavoValverde

@mitchellh And they're great at doing interfaces for their peers (agents), but not for their owners (humans)

4h

Virgil Maro@_virgil19

@mitchellh "works" is carrying that whole sentence and it's the one word nobody specified

3h

Jeff Martens@Jmartens

@mitchellh This reminds me of experiences I've had with offshore dev shops. They'll create exactly what the spec says, no more, no less, and no matter if it actually does what the user expects or not.

3h

Alperen Keleş@Keleesssss

@mitchellh People also generally underestimate the complexity of software. A programming language, for instance, looks very simple on the surface. You write a function, create a class, update a variable etc. It's usually not a feature, but its combination with everything else that's complex.

4h

Leaving Tech@leaving_tech

@mitchellh The new "it works" using AI is the old "it works on my machine"

4h

Jonathan Liles@thjonml

@mitchellh True, but I recall spending my entire career making the same argument (in my head or out loud) to the human developers who wrote the inefficient spaghetti code the LLMs were trained on. Needless to say, they didn't listen and software quality had already reached a nadir before LLMs.

2h

Filip Jerzy Pizło@filpizlo

@mitchellh ngmi

3h

John Solly@_jsolly

@mitchellh And if you’re specifying that much, you’re basically back to high spec design (waterfall) and we all know that’s bad.

I honestly think synchronous development will continue for a few more years. Need that iteration to arrive at all the tradeoffs and nuances.

2h

Patrick Smith@royalicing

@mitchellh “If it works who cares what the code looks like” is the same approach that made outsourcing produce terrible code.

Even if you require high test coverage and enforce 500 linter rules, outsourced code still usually looked like sausage meat.

I’m loving coding agents for prototypes

2h