T3 Stack creator Theo Browne asks how capable AI must get before developers stop reviewing its code

Story Overview

T3 Stack creator Theo Browne is asking at what point AI code generation will earn enough developer confidence that human review becomes optional, a question that has fueled fresh discussion of whether today's models are anywhere close to that bar.

Original post

Theo - t3.gg @theo

How much better do the models have to get before you'll stop reading the code?

6:38 PM · Jul 3, 2026 · 115.5K Views

Trust Metrics

Trust numbers show persistent skepticism

Recent surveys put developer trust in AI output accuracy at just 29 percent, with 46 percent actively distrusting the results, and AI-assisted pull requests merging at roughly half the rate of human ones.

Open Question

Benchmarks leave real-world gaps unclosed

Top models hit around 67 percent pass@1 on HumanEval and higher on some verified suites, yet issues like logic errors, security vulnerabilities in nearly half of generated code, and lower scores on harder tests mean the capability threshold for skipping reviews stays undefined.
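For readers unfamiliar with the metric, pass@1 is the k = 1 case of pass@k: the probability that at least one of k sampled completions passes a problem's unit tests. The following is a minimal sketch of the commonly used unbiased estimator, assuming n samples are drawn per problem and c of them pass. The function name and the sample counts are illustrative assumptions, not figures from the surveys or benchmarks mentioned above.

```python
from math import comb


def pass_at_k(n: int, c: int, k: int) -> float:
    """Estimate pass@k for one problem from n samples, c of which passed."""
    if not 0 <= c <= n or not 1 <= k <= n:
        raise ValueError("need 0 <= c <= n and 1 <= k <= n")
    # comb(n - c, k) / comb(n, k) is the chance that a random size-k
    # subset of the n samples contains no passing sample at all.
    return 1.0 - comb(n - c, k) / comb(n, k)


# Hypothetical per-problem results: (samples drawn, samples that passed).
results = [(10, 10), (10, 5), (10, 0)]

# The benchmark score is the mean of the per-problem estimates.
score = sum(pass_at_k(n, c, 1) for n, c in results) / len(results)
print(f"pass@1 = {score:.2f}")
```

For k = 1 the estimate reduces to c / n per problem, so a headline pass@1 figure only says how often a single attempt passes the benchmark's own tests; it says nothing about bugs those tests do not exercise.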

Sentiment

Commenters with a positive take are excited that AI models could make code review obsolete, much as compilers made reading assembly unnecessary, while those with a negative take insist that reviewing code is essential to owning and understanding what ships.

Positive: 42.7%

Negative: 57.3%

55 comments with sentiment.

Posts from X

Most Activity

Views: 22.8K · Likes: 447

Theo - t3.gg @theo

I'll be honest, I barely even read the code back when I wrote it by hand...

Bookmarks: 19 · Retweets: 7 · Replies: 44

Theo - t3.gg @theo

At this point I’m genuinely convinced most of you would have kept reading the assembly code after C got popular

Matthew Berman @MatthewBerman

@theo You read the code?

Theo - t3.gg @theo

@zeeg Bold coming from someone whose code is gpt-3.5 level

David Cramer @zeeg

@theo two orders of magnitude with actual real verification capabilities

Rhys @RhysSullivan

@theo the problem to solve here is the verification not the code

David Cramer @zeeg

@theo @WallisDev you have little to lose

i - along with every other major business in the world - have a lot to lose

all it takes is a shitty data migration, a simple bypass to slip through and people face immense liability

Theo - t3.gg @theo

If only there was a product to make it easier to identify bugs and fix them...

Jokes aside, there's obviously differences at different types and scales of software. I just know there's a lot of devs still reading code on sideprojects as if it matters. I'd go as far as saying that the majority of code at most companies is not as important as the company pretends it is (i.e. company blog, documentation sites, sdks that are just api wrappers, throwaway internal tools, api scaffolding, etc)

Kevin Brace @latentfidelity

@theo got rejected in a recent interview for telling them its pointless to read code at this point

maria @maria_rcks

@theo about tree fiddy

Theo - t3.gg @theo

@zeeg @WallisDev I spend a lot of time conversing with the model and getting a spec that we’re both aligned on. Once I’m confident in the surface and the model’s understanding, it’s genuinely hard to care about the details for me

David Cramer @zeeg

let alone that a few sentences will never appropriately describe the thing you're trying to build - nor will generating a spec from those same few sentences. you need a massive speed increase on top of a massive precision/capability increase

(+a ton of supporting software that is scaleable and cheap that doesnt exist today to verify)

Jon Oringer @jonoringer

@theo You still read code ?

Theo - t3.gg @theo

@glcst I wrote this before seeing your reply lol

Zachary Burkett @zburkett

@theo My current personal project is my benchmark, and I still feel the need to review the code. C is a tricky bugger for code that works and feels good to use as a library

Aiden @WallisDev

I agree w him

I’d need some kind of test suite to give me confidence into putting it in actual production software

The question is too broad. different kinds of projects require different levels of scrutiny (ie. file system, database or core data structure? I hope you know how it fails)

Julien @SystemOfOne

@theo I catch issues in ai generated code all the time. But maybe I’m not holding it right…

ergophobian @indefatigabile

i mean that makes sense why they’d reject you. the better thing to have said was that you don’t read every line of code, you have systems in place to catch e2e verification and other ways in your own workflow how you minimize the results of bad implementation or code. saying it’s pointless rn, is just not true. we’re not there yet it feels like it. but it def still makes mistakes. most ppl aren’t reading every line, that would throttle the velocity of using these tools.

shako @shakoistsLog

@theo code was invented to be read. it is logic in english. more similar than different to prose. i’ll stop reading it when it reaches a point where code is no longer made to be read.

𝚝𝚑𝚎 𝚊𝚌𝚌𝚘𝚞𝚗𝚝𝚊𝚗𝚝 @defaultRuntime

@theo You guys still read code?
