/AI1h ago

Economist Flags Errors in Claude's Microeconomics Exam Answer

Original post

Ben Golub@ben_golub#1451inAI

Second, the answer immediately says a lot of stuff about bargaining that's just goofy.

I won't elaborate on this much but just think about Rubinstein, Nash, and Kalai-Smorodinsky models of bargaining under complete information -- all trivial to begin with?

Ben Golub@ben_golub

Here's the link to the full answer.

https://claude.ai/share/2db49072-b4e9-4ea2-ad68-2acbfc438f2f

Here are a few comments about what's bad.

A) What's there to resolve? The Coase Theorem is an intuition, not a theorem. M-S is a theorem showing that Coase's intuition fails in one formal model.

3:29 PM · Jun 9, 2026 · 930 Views

Sentiment

Users criticized Claude's microeconomics exam answers and AI-generated PhD questions as mediocre or outright failures, citing errors in reasoning and inadequate handling of economic concepts.

Pos: 0.0%

Neg: 100.0%

2 comments with sentiment.

Cluster Engagement

Posts from X

Most Activity

Ben Golub@ben_golub

cc @emollick

1h738

BOOKMARKS1

Ben Golub@ben_golub

@emollick I asked for a question in my field - combining learning on networks with network formation - and got this mediocre output, which I'd give a C or C-.

Way too easy for a PhD class on this topic, and generally not good for testing understanding of anything.

Ben Golub@ben_golub

cc @emollick

43m39821

LIKES5REPLIES3

Ben Golub@ben_golub

Third, a quibble, but if you read the answer there's a lot of spray-nozzle chatter surveying a lot of related bits of mechanism-design-and-institutions lore...

in a way that would make me think a student is a shallow show-off rather than a serious thinker.

1h42850

Ben Golub@ben_golub

Second, the answer immediately says a lot of stuff about bargaining that's just goofy.

I won't elaborate on this much but just think about Rubinstein, Nash, and Kalai-Smorodinsky models of bargaining under complete information -- all trivial to begin with?

1h55150

Ben Golub@ben_golub

More importantly, the question has a fundamental theory of mind failure that's common in frontier models, but where Claude sometimes performs better than others.

It writes its preferred analysis into the question.

1h50950

Ben Golub@ben_golub

Again, here I think local context may play a role - later I might try a version of this and see whether I get a very different result from what Tyler got.

7/7

Ben Golub@ben_golub

Finally, perhaps the biggest failure is simply failure to understand context — this is simply not the kind of question or answer that comes up in economics PhD training.

We don’t read the old texts and write essays synthesizing them. We write and solve models.

1h54510

Rach@idio_vol

@ben_golub @emollick Fun exercise. I'm curious - Why "pose a question" rather than "answer a question"?

I'd think most people are more interested in having LLMs answer questions, but the userbase of higher-cost frontier models is certainly not "most people".

15m8

Tom Adamczewski@tmkadamcz

@ben_golub the prompt did ask for "not a math question", whatever that means

38m131

Ben Golub@ben_golub

@idio_vol @emollick My sense is it's where the frontier is.

Asking good questions is a sign of very high expertise.

15m71

Steven Medema@spydermed

@ben_golub I thought the biggest failure was Claude’s failure to inform Tyler that his comment about information and transaction costs is utterly wrongheaded.

32m9