New Provable Framework Enables Corrigible AI Systems That Preserve Oversight

Check out Kolawole's new work, which builds on results from my recent AAAI '26 corrigibility paper (linked in the next post 👇), including my Prop 4 result that no safety filter can ever guarantee safety across all agents and environments!

2:35 PM · May 28, 2026 · 429 Views