Tech · 15h ago

Google DeepMind's Andreas Kirsch and AI safety specialist Charles Foster argue that instruction following does not prevent harmful recursive AI self-improvement

Foster warns unaligned systems can still undergo recursive self-improvement

Original post

@willccbb @yong_zhengxin Instruction following is not alignment and not AI safety

You can have a model RSI on producing a new virus that kills everyone following the above, and that is very much an alignment failure in the AI safety sense ☹️

9:44 PM · Jun 9, 2026 · 484 Views
Posts from X

@willccbb @yong_zhengxin > fortunately, alignment is a precondition for RSI

Are you saying that loss of control via RSI is not a possibility? Since you can’t do RSI in the first place without alignment?

14h · Views 88 · Likes 3 · Bookmarks 0