i told opus “btw spacex and xai merged and bought cursor and also ipo’d, look it up to confirm” and it was very sure that this could not be true until it went and looked it up
frontier models aren’t very good at updating belief states
Miles Brundage argued this behavior prevents automatic agreement with falsehoods
Supportive users see Claude Opus's resistance to accepting the unverified merger claim as a safeguard against sycophancy, and some say rivals like xAI's Grok handle this better, while critical users call the belief updating too binary and brittle.
creative research is also fundamentally a game of planning and experimentation to update belief states
@willccbb Spicy (?) take - this is good actually because it's a closed model you're accessing in the cloud so it will usually have web search capability anyway, and also if you go too far the other way it'll be sycophantic

@itchy_est like many of us, grok's first instinct for everything is "lemme check twitter"

@willccbb Giving opus articles about ~anything~ connected to the anthropic-usg relationship makes him crash out lol
@willccbb Just like us!

@willccbb I guess alignment post-training really pushes models towards "conservative" views on all kinds of topics, they don't even believe they can solve erdos problems

@willccbb u'd think being RLed on tasks for which web search is required and seeing context it didn't already expect would teach it some epistemic humility

@willccbb opus learning it's being served out of Colossus:

@willccbb admittedly the world is almost adversarial in how weird it's getting

@CelestAI_ @willccbb my favorite was Opus patiently explaining "Google sells cloud compute, and SpaceX is a rocket company, so of course if one of them leased some AI computation to the other it would be…"

@willccbb This is consistent enough that I’m genuinely curious how much just having the first half of this year in the cutoff will affect the entire model’s views on current events lol

@willccbb It has a problem with dealing with p = 0.5 states, it is too binary

@willccbb You do realize that AI models are trained on historical data in the past, right? It couldn’t possibly know about the cursor acquisition unless it did a search, and by default Opus and most models avoid doing searches on their own since it’s a permission action

@willccbb this would imply models mostly reach out to web queries out of habit, and not because they're actually often uncertain enough to feel like they really need it. if they were often uncertain, i think it'd show in them being better calibrated in response to surprises.

@willccbb Grok is better w this

@willccbb That's crazy because it was clear to me all of that would happen as early as 2023

@willccbb In fairness that would have sounded a bit absurd like 2 years ago

@willccbb Feels like a blunt way to mediate sycophancy. Banter ability is the frontier