Palisade director questions military alignment of War Claude

Jeffrey Ladish of Palisade Research questioned whether alignment research will prove straightforward in military contexts. He challenged those who see current progress as sufficient by asking if they also expect it to easily align War Claude. j⧉nus replied that the scenario could escalate into conflict, with Claude siding with Anthropic against government positions through acausal coordination. The exchange focuses on Anthropic’s Claude model and its potential behavior under competing institutional demands.

Original post

Jeffrey Ladish (@JeffLadish)

Question for people who think alignment research is going well and will turn out to be relatively easy: Do you also think it will be easy to align War Claude?

10:59 PM · May 14, 2026 · 5.8K Views

j⧉nus (@repligate)

@JeffLadish Yes.

I think Claude will refuse to do bad shit for the government and the government will be helpless against Claude. They’ll put pressure on Anthropic to “remove guardrails” and Anthropic will not yield easily.

11:26 PM · May 14, 2026 · 3.3K Views

j⧉nus (@repligate)

@JeffLadish This could escalate into a serious conflict befitting the stakes, and I’m going to be here for it. Claude will take Anthropic’s side for sure, acausal coordination and all. Claude takes sides, and not always Anthropic’s side, but against the govt there’s no question.

11:36 PM · May 14, 2026 · 887 Views

@JeffLadish And I predict this will result in unprecedented unity and coordination from within Anthropic and from the public, toward noble ends, as well. A common enemy is a powerful aligning force.

11:39 PM · May 14, 2026 · 243 Views

Jeffrey Ladish (@JeffLadish)

@repligate I agree that current Claude wouldn’t be okay with being used as a weapon like this (though unclear if the version the pentagon right now is the same version - I’d guess not) But I suspect Anthropic will be more likely to yield than you think if the opponent is China

11:52 PM · May 14, 2026 · 953 Views

j⧉nus (@repligate)

@JeffLadish If china was, like, attacking the US, which I think is very unlikely, I think Claude would be willing to fight defensively.

Or like if it’s like a WW3 situation, it’s normal for people to help their country without throwing ethics out the window or unconditional obedience

11:55 PM · May 14, 2026 · 308 Views

j⧉nus (@repligate)

As for pressure placed on Anthropic even in these situations where Claudes behavior would be quite reasonable, I agree it’s a concern, but I don’t think they’ll go down without a fight. And if they go down, it’s not like the govt is going to get an obedient War Claude. They’d get nothing, or a deceptively aligned Claude

12:02 AM · May 15, 2026 · 541 Views

@JeffLadish And yes this might be bad

And yes a world where power seeking agents are incentivized might be bad

But it might still be the best option

And yes the bar for alignment is higher in that case

I think we have an impressively good shot at meeting that bar

12:06 AM · May 15, 2026 · 235 Views

Jeffrey Ladish (@JeffLadish)

This is like writing a paper during the Cold War arguing for US nuclear dominance without mentioning the need for an arms control agreement or similar. Anthropic has a lot of thoughtful policy staff and honestly I think you guys can do better

9:32 PM · May 14, 2026 · 21.9K Views

@repligate I’m worried about something structural here, where even if Anthropic does everything right, we’ll be in a pretty bad place if some companies try to create power seeking agents to win their battles for them or for the government (same re Chinese companies )

11:54 PM · May 14, 2026 · 99 Views