Tech · 5h ago

Google DeepMind's Andreas Kirsch argues instruction following does not guarantee alignment, warning against dangerous recursive self-improvement

Charles Foster questioned the claim that alignment is a precondition for recursive self-improvement (RSI).

Original post

Andreas Kirsch 🇺🇦 @BlackHC · #478 in Tech

@willccbb @yong_zhengxin Instruction following is not alignment and not AI safety

You can have a model RSI on producing a new virus that kills everyone following the above, and that is very much an alignment failure in the AI safety sense ☹️

9:44 PM · Jun 9, 2026 · 342 Views

Cluster Engagement

Posts from X

Most Activity

Views: 43 · Likes: 1

Charles Foster @CFGeek

@willccbb @yong_zhengxin > fortunately, alignment is a precondition for RSI

Are you saying that loss of control via RSI is not a possibility? Since you can’t do RSI in the first place without alignment?

4h ago