AI · 9h ago

OpenAI's Yo Shavit says AI struggles with ambiguous tasks, while researcher Herbie Bradley blames RLVR limitations

Bradley says AI only helps flesh out pre-existing concepts.

Original post

Yo Shavit@yonashav

I feel almost entirely bottlenecked on wisdom amid ambiguity, and AI systems are rarely wise. Is my task distribution just different from others’, or do you think I’m probably using them wrong?

10:43 AM · Jun 4, 2026 · 5.3K Views

Sentiment

Users voiced struggles with AI's shortcomings in ambiguous tasks requiring wisdom, including disappointing results in distilling nuanced information and limited improvements to their work.

Positive: 0.0% · Negative: 100.0%

4 comments with sentiment.

Posts from X

Most Activity

Ben Reinhardt@Ben_Reinhardt

@yonashav What about the third option? That they don't help anybody wisdom-wise?

9h293

Bookmarks 1 · Likes 6

Herbie Bradley@herbiebradley

I have similar issues, I think it's just downstream of wisdom (or judgement) being very difficult to train for with RLVR on hard to verify tasks like thinking through ambiguous strategy, brainstorming, or research questions with no clear answer.

A frustrating sub-component is that in any qualitative task involving the generation of ideas, the ideas are basically "mode collapsed" and not diverse at all, so it doesn't save me ~any mental load in thinking of ideas, only in fleshing them out. If you try and force it to go more OOD via prompting, the ideas become more slop-like. Starting with some idea-dense bullet points and doing a debate between 5.5 Pro and Opus helps elicit a little more wisdom due to the difference in training distributions.

8h22261

Replies 1

Eric Gilliam@eric_is_weird

I’ve found them even disappointing at distilling a lot of not that difficult to grasp (but between the lines) information that one is meant to make “wise” choices based on

Like they disappoint at reading a memoir and distilling the true org chart, in my experience. Not the literal one, but how things actually worked

8h118
