
Economist Flags Errors in Claude's Microeconomics Exam Answer

Original post
Ben Golub @ben_golub · #1490 in Tech

Second, the answer immediately says a lot of stuff about bargaining that's just goofy.

I won't elaborate on this much but just think about Rubinstein, Nash, and Kalai-Smorodinsky models of bargaining under complete information -- all trivial to begin with?

3/

Ben Golub @ben_golub

Here's the link to the full answer.

https://claude.ai/share/2db49072-b4e9-4ea2-ad68-2acbfc438f2f

Here are a few comments about what's bad.

A) What's there to resolve? The Coase Theorem is an intuition, not a theorem. M-S is a theorem showing that Coase's intuition fails in one formal model.

2/

3:29 PM · Jun 9, 2026 · 4.1K Views
Sentiment

Users criticized Claude's microeconomics exam answers and AI-generated PhD questions as mediocre or outright failures, citing errors in reasoning and inadequate handling of economic concepts.

Positive 0.0% · Negative 100.0% · 2 comments with sentiment.
Cluster Engagement
Posts from X
Most Activity
Views 3.1K
Ben Golub @ben_golub

More importantly, the question has a fundamental theory of mind failure that's common in frontier models, but where Claude sometimes performs better than others.

It writes its preferred analysis into the question.

4/

1d · Views 3.1K · Likes 21 · Bookmarks 0
Bookmarks 9
Ben Golub @ben_golub

@emollick I asked for a question in my field - combining learning on networks with network formation - and got this mediocre output which I'd give a C or C-

Way too easy for a PhD class on this topic, and generally not good for testing understanding of anything.

Ben Golub @ben_golub

cc @emollick

23h · Views 2.9K · Likes 19 · Bookmarks 9
Likes 30 · Retweets 1 · Replies 4
Ben Golub @ben_golub

Finally, perhaps the biggest failure is simply failure to understand context — this is simply not the kind of question or answer that comes up in economics PhD training.

We don’t read the old texts and write essays synthesizing them. We write and solve models.

6/

Ben Golub @ben_golub

Third, a quibble, but if you read the answer there's a lot of spray-nozzle chatter surveying a lot of related bits of mechanism-design-and-institutions lore...

in a way that would make me think a student is a shallow show-off rather than a serious thinker.

5/

1d · Views 3K · Likes 30 · Bookmarks 2
Ben Golub @ben_golub

Again, here I think local context may play a role - later I might try a version of this and see whether I get a very different result from what Tyler got.

7/7

1d · Views 2.6K · Likes 6 · Bookmarks 0
Rach @idio_vol

@ben_golub @emollick Fun exercise. I'm curious - Why "pose a question" rather than "answer a question"?

I'd think most people are more interested in having LLMs answer questions, but the userbase of higher-cost frontier models is certainly not "most people".

23h · Views 8
Tom Adamczewski @tmkadamcz

@ben_golub the prompt did ask for "not a math question", whatever that means

23h · Views 13 · Likes 1
Ben Golub @ben_golub

@idio_vol @emollick My sense is it's where the frontier is.

Asking good questions is a sign of very high expertise.

23h · Views 7 · Likes 1
Steven Medema @spydermed

@ben_golub I thought the biggest failure was Claude’s failure to inform Tyler that his comment about information and transaction costs is utterly wrongheaded.

23h · Views 9