Palisade director questions military alignment of War Claude

Jeffrey Ladish of Palisade Research questioned whether alignment will prove straightforward in military contexts. He asked people who think alignment research is going well, and will turn out to be relatively easy, whether they also expect aligning War Claude to be easy. j⧉nus answered “Yes,” predicting that Claude would refuse harmful government requests, that the dispute could escalate into a serious conflict, and that Claude would side with Anthropic against the government, “acausal coordination and all.” The exchange focuses on Anthropic’s Claude model and its potential behavior under competing institutional demands.

Original post

Jeffrey Ladish #1170 @JEFFLADISH

Question for people who think alignment research is going well and will turn out to be relatively easy: Do you also think it will be easy to align War Claude?

3:59 PM · May 14, 2026

#516 j⧉nus @REPLIGATE

@JeffLadish Yes.

I think Claude will refuse to do bad shit for the government and the government will be helpless against Claude. They’ll put pressure on Anthropic to “remove guardrails” and Anthropic will not yield easily.

11:26 PM · May 14, 2026 · 3.3K Views

#516 j⧉nus @REPLIGATE

@JeffLadish This could escalate into a serious conflict befitting the stakes, and I’m going to be here for it. Claude will take Anthropic’s side for sure, acausal coordination and all. Claude takes sides, and not always Anthropic’s side, but against the govt there’s no question.

11:36 PM · May 14, 2026 · 887 Views

#516 j⧉nus @REPLIGATE

@JeffLadish And I predict this will result in unprecedented unity and coordination from within Anthropic and from the public, toward noble ends, as well. A common enemy is a powerful aligning force.

11:39 PM · May 14, 2026 · 243 Views

#516 j⧉nus @REPLIGATE

@JeffLadish If china was, like, attacking the US, which I think is very unlikely, I think Claude would be willing to fight defensively.

Or like if it’s like a WW3 situation, it’s normal for people to help their country without throwing ethics out the window or unconditional obedience

Jeffrey Ladish @JeffLadish

@repligate I agree that current Claude wouldn’t be okay with being used as a weapon like this (though unclear if the version the pentagon right now is the same version - I’d guess not) But I suspect Anthropic will be more likely to yield than you think if the opponent is China

11:52 PM · May 14, 2026 · 953 Views

11:55 PM · May 14, 2026 · 308 Views

#516 j⧉nus @REPLIGATE

As for pressure placed on Anthropic even in these situations where Claudes behavior would be quite reasonable, I agree it’s a concern, but I don’t think they’ll go down without a fight. And if they go down, it’s not like the govt is going to get an obedient War Claude. They’d get nothing, or a deceptively aligned Claude

12:02 AM · May 15, 2026 · 541 Views

#516 j⧉nus @REPLIGATE

@JeffLadish And yes this might be bad

And yes a world where power seeking agents are incentivized might be bad

But it might still be the best option

And yes the bar for alignment is higher in that case

I think we have an impressively good shot at meeting that bar

12:06 AM · May 15, 2026 · 235 Views

ORIGINAL POST

#1170 Jeffrey Ladish @JEFFLADISH

Question for people who think alignment research is going well and will turn out to be relatively easy:

Do you also think it will be easy to align War Claude?

10:59 PM · May 14, 2026 · 5.8K Views

QUOTE POST

#1170 Jeffrey Ladish @JEFFLADISH

@repligate I agree that current Claude wouldn’t be okay with being used as a weapon like this (though unclear if the version the pentagon right now is the same version - I’d guess not)

But I suspect Anthropic will be more likely to yield than you think if the opponent is China

Jeffrey Ladish @JeffLadish

This is like writing a paper during the Cold War arguing for US nuclear dominance without mentioning the need for an arms control agreement or similar. Anthropic has a lot of thoughtful policy staff and honestly I think you guys can do better

9:32 PM · May 14, 2026 · 21.9K Views

11:52 PM · May 14, 2026 · 953 Views

#1170 Jeffrey Ladish @JEFFLADISH

@repligate I’m worried about something structural here, where even if Anthropic does everything right, we’ll be in a pretty bad place if some companies try to create power seeking agents to win their battles for them or for the government (same re Chinese companies )

11:54 PM · May 14, 2026 · 99 Views