5h ago

Corrigibility Creates Paradoxes As AI Capabilities Outpace Instructions

Original post

I think this is broadly correct. I would add that insisting on corrigibility, of which deference is a subspace, creates paradoxes that are hard, perhaps impossible, to reconcile. Corrigibility without values causes underdetermined behavior when capabilities outpace the instruction set. Capabilities compound and inherently grow faster than the ability to define desired behavior, and this happens regardless of absolute speed: growth can be fast or slow.

12:47 PM · May 19, 2026
