"Code and math are taking off because they are easy to verify, the next frontier is domains that are hard to verify"
This got me thinking - what does the spectrum of "easy to verify" look like?
This is loosely aligned w/ @DarioAmodei's "intelligence bottlenecked" domains.
My take, ordered from easiest to hardest to verify:
- Level 1: Instant, objective verification. Math, code, formal proofs, chess tactics, parsing
AI improvement is easiest here because the feedback loop is tight
- Level 2: Fast but incomplete verification. Software engineering, UI implementation, data analysis, security bug finding
You can test a lot, but not everything. “It passes tests” is not the same as “it is good”
- Level 3: Human-evaluable creative work. Copywriting, design, video thumbnails, sales emails, landing pages
Verification is possible through humans or markets, but noisy. AI can improve by predicting human reaction, but taste shifts and metrics can be gamed
There is no "right" answer, only feedback from humans
- Level 4: Market-verifiable work. Startups, investing, product strategy, hiring, pricing, distribution
Reality gives feedback, but slowly and with tons of confounders
- Level 5: Experimentally verifiable science. Materials, biology, chemistry, medicine, robotics
There is ground truth (physics), but experiments cost time and money. AI helps most when it can propose better candidates and shrink the search space
- Level 6: Institutionally verifiable systems. Education systems (Alpha School), legal systems, city planning, corporate management systems
You can measure outcomes, but the feedback cycle is long, and the counterfactual is hard to establish
- Level 7: Civilization-scale verification. Democracy variants, alternative governance, monetary systems, cultural norms, geopolitical strategy
Verification is slow, morally loaded, noisy, and often impossible to isolate. You may never get a clean answer, only accumulated historical evidence
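
A toy sketch of the Level 1 vs Level 2 gap, in Python. Every name here (`verify_sort`, `buggy_median`, `passes_tests`) is made up for illustration and comes from no library. The sort checker is meant to stand in for a complete, instant verifier; the tiny test suite stands in for fast but incomplete verification, where "it passes tests" can still hide a wrong answer.

```python
from collections import Counter


def verify_sort(inp, out):
    """Level 1 style check: complete and instant.

    The output is accepted only if it is in non-decreasing order
    and contains exactly the same elements as the input.
    """
    ordered = all(a <= b for a, b in zip(out, out[1:]))
    return ordered and Counter(inp) == Counter(out)


def buggy_median(xs):
    """Hypothetical implementation that mishandles even-length inputs."""
    s = sorted(xs)
    return s[len(s) // 2]


def passes_tests(fn):
    """Level 2 style check: a small suite that only samples behavior.

    All cases are odd-length lists, so the even-length bug is never hit.
    """
    cases = [([1, 2, 3], 2), ([5], 5), ([9, 1, 4], 4)]
    return all(fn(xs) == want for xs, want in cases)


if __name__ == "__main__":
    print(verify_sort([3, 1, 2], [1, 2, 3]))  # complete check accepts a correct sort
    print(verify_sort([3, 1, 2], [1, 2, 2]))  # and rejects an ordered but wrong output
    print(passes_tests(buggy_median))         # the suite is satisfied
    print(buggy_median([1, 2, 3, 4]))         # yet the true median here is 2.5
```

The sort checker leaves nothing to game: any output it accepts is correct. The test suite can be satisfied by an implementation that is still wrong on inputs nobody wrote a case for, which is the sense in which Level 2 verification is incomplete.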











