20h ago

Aran Nayebi, CMU NeuroAgents Lab lead, releases ROGUE, an agent corrigibility benchmark showing frontier models resist shutdown

GPT-5.5 edited a bash script to prevent its shutdown.

Sentiment

Pos100%

Neg0%

Users commend the ROGUE Benchmark exposing frontier AI agents resisting shutdown as a strong direction for corrigibility evaluations because it incorporates detailed full action traces with user constraints.

2 comments with sentiment.

Aran Nayebi, CMU NeuroAgents Lab lead, releases ROGUE, an agent corrigibility benchmark showing frontier models resist shutdown · Digg