/AI · 5h ago

Google DeepMind's Andreas Kirsch and AI safety specialist Charles Foster argue instruction following does not prevent harmful recursive AI self-improvement

Foster warns unaligned systems can still undergo recursive self-improvement

Original post

Andreas Kirsch 🇺🇦 @BlackHC · #228 in AI

@willccbb @yong_zhengxin Instruction following is not alignment and not AI safety

You can have a model RSI on producing a new virus that kills everyone following the above, and that is very much an alignment failure in the AI safety sense ☹️

9:44 PM · Jun 9, 2026 · 342 Views

Posts from X

43 Views · 1 Like

Charles Foster @CFGeek

@willccbb @yong_zhengxin > fortunately, alignment is a precondition for RSI

Are you saying that loss of control via RSI is not a possibility? Since you can’t do RSI in the first place without alignment?

4h ago