Greg Kamradt, President of the ARC Prize Foundation, posts a seven-level framework ranking domains by how difficult verification is for AI systems, from math and code to culture and governance

Daniel Jeffries quote-tweeted the post, arguing that verification, not alignment, is the real problem.

Original post

"Code and math are taking off because they are easy to verify, the next frontier is domains that are hard to verify"

This got me thinking - what does the spectrum of "easy to verify" look like?

This is loosely aligned w/ @DarioAmodei's "intelligence bottlenecked" domains.

My take of easy > hard:

- Level 1: Instant, objective verification
Math, code, formal proofs, chess tactics, parsing
AI improvement is easiest here because the loop is tight

- Level 2: Fast but incomplete verification
Software engineering, UI implementation, data analysis, security bug finding
You can test a lot, but not everything. “It passes tests” is not the same as “it is good”

- Level 3: Human-evaluable creative work
Copywriting, design, video thumbnails, sales emails, landing pages
Verification is possible through humans or markets, but noisy. AI can improve by predicting human reaction, but taste shifts and metrics can be gamed
There is no "right" answer, only feedback from humans

- Level 4: Market-verifiable work
Startups, investing, product strategy, hiring, pricing, distribution
Reality gives feedback, but slowly and with tons of confounders

- Level 5: Experimentally verifiable science
Materials, biology, chemistry, medicine, robotics
There is ground truth (physics), but experiments cost time and money. AI helps most when it can propose better candidates and reduce search space

- Level 6: Institutionally verifiable systems
Education systems (Alpha school), legal systems, city planning, corporate management systems
You can measure outcomes, but the feedback cycle is long, and the counterfactual is hard

- Level 7: Civilization-scale verification
Democracy variants, alternative governance, monetary systems, cultural norms, geopolitical strategy
Verification is slow, morally loaded, noisy, and often impossible to isolate. You may never get a clean answer, only accumulated historical evidence

7:00 AM · May 21, 2026
Quote-tweeted by Daniel Jeffries:

The alignment problem is an illusion hallucinated by "rationalists" huffing too much glue while watching 2001: A Space Odyssey.

The real problem is and always was "the verification problem."

Quoting Greg Kamradt (@GregKamradt): the original post above.

2:00 PM · May 21, 2026 · 3.2K Views
2:51 PM · May 21, 2026 · 698 Views