Tech · 7h ago

Will Brown finds Claude Opus resists a fabricated merger claim, only updating its belief after a web search

Miles Brundage argued this behavior prevents automatic agreement with falsehoods

Original post

will brown@willccbb

i told opus “btw spacex and xai merged and bought cursor and also ipo’d, look it up to confirm” and it was very sure that this could not be true until it went and looked it up

frontier models aren’t very good at updating belief states

4:54 PM · Jun 21, 2026 · 29.2K Views

Sentiment

Positive users praise Grok and Meta's edge or Claude's relative self-awareness on the fabricated merger claim, while negative users mock its overconfident acceptance of search results.

Positive: 57.1%

Negative: 42.9%

7 comments with sentiment.

Posts from X

Views: 2.5K · Bookmarks: 4 · Likes: 52 · Retweets: 1

will brown@willccbb

creative research is also fundamentally a game of planning and experimentation to update belief states

Miles Brundage@Miles_Brundage

@willccbb Spicy (?) take - this is good actually because it's a closed model you're accessing in the cloud so it will usually have web search capability anyway, and also if you go too far the other way it'll be sycophantic

will brown@willccbb

@itchy_est like many of us, grok's first instinct for everything is "lemme check twitter"

swyx@swyx

@willccbb we need to figure out updating internal world models by 2027 or this all goes belly up

Jake Halloran@jakehalloran1

@willccbb Giving opus articles about ~anything~ connected to the anthropic-usg relationship makes him crash out lol

Taylor W. Killian@tw_killian

@willccbb Just like us!

Luan@luan_wav

@willccbb I guess alignment post-training really pushes models towards "conservative" views on all kinds of topics, they don't even believe they can solve erdos problems

davinci@leothecurious

@willccbb u'd think being RLed on tasks for which web search is required and seeing context it didn't already expect would teach it some epistemic humility

brandon galang ▲@brandon_galang

@willccbb opus learning its being served out of Colossus:

Celestia@CelestAI_

@willccbb admittedly the world is almost adversarial in how weird it's getting

morphillogical@morphillogical

@CelestAI_ @willccbb my favorite was Opus patiently explaining "Google sells cloud compute, and SpaceX is a rocket company, so of course if one of them leased some AI computation to the other it would be…"

Prompt To Point@PromptToPoint

@willccbb the confidence before looking it up is the funniest part

bro was CERTAIN until reality said otherwise 😭

Todd Van Der Meid, MBA, CFP®@toddvandermeid

@willccbb It ran into the same thing when I asked about the UFC fight at the White House. It said its output must be a hallucination because a UFC fight at the White House was absurd.

Chris Davis@thedatadavis

@willccbb Should we have reason to suspect otherwise?

Jake Halloran@jakehalloran1

@willccbb This is consistent enough that I’m genuinely curious how much just having the first half of this year in the cutoff will affect the entire model’s views on current events lol

meowbooks@meowbooksj

@willccbb same

Rafa Schwinger 🇻🇦@Rafa_Schwinger

@willccbb It has a problem with dealing with p = 0.5 states, it is too binary

augustusVI@AugustusVi26115

@willccbb You do realize that AI models are trained on historical data in the past, right? It couldn’t possibly know about the cursor acquisition unless it did a search, and by default Opus and most models avoid doing searches on their own since it’s a permission action

davinci@leothecurious

@willccbb this would imply models mostly reach out to web queries out of habit, and not because they're actually often uncertain enough to feel like they really need it. if they were often uncertain, i think it'd show in them being better calibrated in response to surprises.
