
DeepMind scientist Alex Turner replies to Rob Wiblin


Alex Turner, a research scientist on Google DeepMind's scalable alignment team, asked Rob Wiblin which parties actually advocate banning AI discussion. Wiblin answered that nobody does so directly, but argued that critics who blame open discussion of AI risk for the problem are implicitly proposing an explicit ban, since a voluntary taboo would never have achieved significant, let alone universal, compliance, and that their opposition to such a ban speaks for itself. The thread linked to the Substack post AI #168: Not Leading the Future.

Original post

Alex Turner @Turn_Trout

@robertwiblin Who is saying we should ban discussion? (real question)

4:53 PM · May 14, 2026 · 454 Views


Rob Wiblin @robertwiblin

@Turn_Trout Nobody directly but some discussion of the discourse here: https://thezvi.substack.com/p/ai-168-not-leading-the-future?open=false#%C2%A7i-learned-it-by-reading-you

5:05 PM · May 14, 2026 · 209 Views

Rob Wiblin @robertwiblin

The reason I'm making fun of it is that for people who imply that this shows the people who discussed the issue are to blame, given that a voluntary taboo never would have gotten significant let alone universal compliance, a ban seems to be the actual implied proposal (and the fact that they'd oppose that speaks for itself).

5:11 PM · May 14, 2026 · 125 Views

Alex Turner @Turn_Trout

@robertwiblin Yeah but it's in fact true that self-fulfilling misalignment is a negative impact of LW discourse. That doesn't mean we should ban the discourse or that it was overall bad to discuss. But the negative externality is real

6:50 PM · May 14, 2026 · 91 Views

Rob Wiblin @robertwiblin

@Turn_Trout If you filter the training dataset is the negative externality real?

7:31 PM · May 14, 2026 · 37 Views

Daniel Eth @daniel_271828

@Turn_Trout @robertwiblin @slatestarcodex My point was not that your alignment strategy should be robust to including specific data in training (point 2 was even suggesting filtering this data out); it was that the strategy should be robust to people in the world having the conversation

7:49 PM · May 14, 2026 · 102 Views

Rob Wiblin @robertwiblin

@daniel_271828 @Turn_Trout @slatestarcodex Yeah I'm not following Alex - if someone's technical alignment strategy fails if someone publishes a sci-fi story about an AI doing something bad, doesn't that seem like a bad/fragile strategy?

7:40 AM · May 15, 2026 · 90 Views



Daniel Eth @daniel_271828

This is such a mid take: 1) “rogue AI” is a common trope, not invented by LW 2) just filter out the relevant posts from training if that solves the problem then! 3) your alignment strategy should be robust to “someone, somewhere writes about the possibility of rogue AI”

8:32 PM · May 10, 2026 · 5.2K Views

Alex Turner @Turn_Trout

@robertwiblin @slatestarcodex Or consider some responses which are like "if your alignment strategy isn't robust to this, it's dumb", which seems like an appeal to how people think the world should work (instead of how it might actually work)

7:11 PM · May 14, 2026 · 111 Views

Alex Turner @Turn_Trout

4. It IS still true that the speculation had a negative externality (a "sociohazard", as it were). Just acknowledge the facts and move on. No need to be defensive about it. (Keeping in mind that MY early speculation is included here)

10:46 PM · May 14, 2026 · 617 Views

Alex Turner @Turn_Trout

(Not replying on X, engage at https://bsky.app/profile/turntrout.bsky.social)

10:46 PM · May 14, 2026 · 450 Views

Alex Turner @Turn_Trout

3. That doesn't mean it was wrong to speculate or that we should ban further speculation. (But be mindful with large data dumps: https://turntrout.com/dataset-protection... less responsible labs won't mitigate)

10:46 PM · May 14, 2026 · 677 Views

Alex Turner @Turn_Trout

Lots of hubbub about "is LW to blame for self-fulfilling misalignment"

1. If a scientist builds a machine which does bad because people said it would, it's NOT the people's fault (morally)

2. Balance of evidence is that YES, LW & doom-speculation contributed to the problem

10:46 PM · May 14, 2026 · 6.6K Views

I don’t think the balance of evidence is in favor of your point 2! The only evidence people cite for this are some extremely vague words in a tweet thread.

In as much as anyone takes this seriously as a risk vector, things like “The Terminator movies” and associated discussion of AI in media seem more likely to contribute here, or maybe not, I don’t think anyone has actually given any concrete evidence here.

10:55 PM · May 14, 2026 · 825 Views

DeepMind scientist Alex Turner replies to Rob Wiblin · Digg