Economist Flags Errors in Claude's Microeconomics Exam Answer

Original post
Ben Golub@ben_golub

Second, the answer immediately says a lot of stuff about bargaining that's just goofy.

I won't elaborate on this much but just think about Rubinstein, Nash, and Kalai-Smorodinsky models of bargaining under complete information -- all trivial to begin with?

3/

Ben Golub@ben_golub

Here's the link to the full answer.

https://claude.ai/share/2db49072-b4e9-4ea2-ad68-2acbfc438f2f

Here are a few comments about what's bad.

A) What's there to resolve? The Coase Theorem is an intuition, not a theorem. M-S is a theorem showing that Coase's intuition fails in one formal model.

2/

3:29 PM · Jun 9, 2026 · 930 Views
Sentiment

Users criticized Claude's microeconomics exam answers and AI-generated PhD questions as mediocre or outright failures, citing errors in reasoning and inadequate handling of economic concepts.

Pos 0.0% · Neg 100.0%
2 comments with sentiment.
Cluster Engagement
Posts from X
Most Activity
VIEWS 738
Ben Golub@ben_golub

cc @emollick

1h · Views 738
BOOKMARKS 1
Ben Golub@ben_golub

@emollick I asked for a question in my field - combining learning on networks with network formation - and got this mediocre output which I'd give a C or C-

Way too easy for a PhD class on this topic, and generally not good for testing understanding of anything.

43m · Views 398 · Likes 2 · Bookmarks 1
LIKES 5 · REPLIES 3
Ben Golub@ben_golub

Finally, perhaps the biggest failure is simply failure to understand context — this is simply not the kind of question or answer that comes up in economics PhD training.

We don’t read the old texts and write essays synthesizing them. We write and solve models.

6/

1h · Views 428 · Likes 5 · Bookmarks 0
Ben Golub@ben_golub

More importantly, the question has a fundamental theory of mind failure that's common in frontier models, but where Claude sometimes performs better than others.

It writes its preferred analysis into the question.

4/

1h · Views 551 · Likes 5 · Bookmarks 0
Ben Golub@ben_golub

Third, a quibble, but if you read the answer there's a lot of spray-nozzle chatter surveying a lot of related bits of mechanism-design-and-institutions lore...

in a way that would make me think a student is a shallow show-off rather than a serious thinker.

5/

1h · Views 509 · Likes 5 · Bookmarks 0
Ben Golub@ben_golub

Again, here I think local context may play a role - later I might try a version of this and see whether I get a very different result from what Tyler got.

7/7

1h · Views 545 · Likes 1 · Bookmarks 0
Rach@idio_vol

@ben_golub @emollick Fun exercise. I'm curious - Why "pose a question" rather than "answer a question"?

I'd think most people are more interested in having LLMs answer questions, but the userbase of higher-cost frontier models is certainly not "most people".

15m · Views 8
Tom Adamczewski@tmkadamcz

@ben_golub the prompt did ask for "not a math question", whatever that means

38m · Views 13 · Likes 1
Ben Golub@ben_golub

@idio_vol @emollick My sense is it's where the frontier is.

Asking good questions is a sign of very high expertise.

15m · Views 7 · Likes 1
Steven Medema@spydermed

@ben_golub I thought the biggest failure was Claude’s failure to inform Tyler that his comment about information and transaction costs is utterly wrongheaded.

32m · Views 9