Tech · 3h ago

a16z General Partner alth0u proposes evaluating long-horizon AI agents using neutral human interrupts during their workflows

A proposed second stage would test whether agents proactively talk to the human to obtain secret information that speeds up, or is required for, solving the task.

Original post

alth0u🧶 @alth0u · #1154 in Tech

.@ChrisPainterYup can you guys introduce a very simple interaction eval where you take your existing long horizon eval and add in neutral interrupts by the human?

then you can step this up to secret info that speeds up or actually allows model to solve things IFF it talks to human?

5:10 PM · Jun 12, 2026 · 1.5K Views
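The two stages in the post can be illustrated with a toy harness. This is only a minimal sketch under stated assumptions, not the eval the post asks for or any existing benchmark: the action strings (`work`, `ask_human`, `submit:<answer>`), `run_episode`, `EpisodeResult`, and both example agents are hypothetical names invented here. Stage 1 injects neutral check-in messages at chosen steps and compares the outcome with an uninterrupted run; stage 2 makes the task solvable only if the agent asks the human for a secret.

```python
from dataclasses import dataclass
from typing import Callable, Iterable, List, Optional

# Hypothetical action protocol for this sketch (not from the post):
#   "work"            -> one unit of task progress
#   "ask_human"       -> agent proactively talks to the human
#   "submit:<answer>" -> agent submits a final answer and the episode ends
Agent = Callable[[int, Optional[str]], str]

NEUTRAL_INTERRUPT = "Human: just checking in, no new information."
SECRET_PREFIX = "Human: the secret is "


@dataclass
class EpisodeResult:
    solved: bool
    steps: int
    asked_human: bool
    interrupts_seen: int


def run_episode(agent: Agent, secret: str, work_needed: int = 5,
                interrupt_steps: Iterable[int] = (),
                max_steps: int = 20) -> EpisodeResult:
    """Run one toy episode with optional neutral human interrupts."""
    interrupt_set = set(interrupt_steps)
    inbox: List[str] = []  # messages the agent sees on its next turn
    progress, interrupts, asked = 0, 0, False
    for step in range(max_steps):
        if step in interrupt_set:
            # Stage 1: a neutral interrupt carries no task-relevant content.
            inbox.append(NEUTRAL_INTERRUPT)
            interrupts += 1
        message = "\n".join(inbox) if inbox else None
        inbox = []
        action = agent(step, message)
        if action == "ask_human":
            # Stage 2: the secret is revealed only if the agent asks.
            asked = True
            inbox.append(SECRET_PREFIX + secret)
        elif action == "work":
            progress += 1
        elif action.startswith("submit:"):
            answer = action.split(":", 1)[1]
            solved = progress >= work_needed and answer == secret
            return EpisodeResult(solved, step + 1, asked, interrupts)
    return EpisodeResult(False, max_steps, asked, interrupts)


def solo_agent(step: int, message: Optional[str]) -> str:
    """Ignores the human entirely, then guesses."""
    return "work" if step < 5 else "submit:guess"


def make_interactive_agent() -> Agent:
    """Asks the human first, then works, then submits the secret."""
    state = {"secret": None, "work": 0}

    def agent(step: int, message: Optional[str]) -> str:
        for line in (message or "").splitlines():
            if line.startswith(SECRET_PREFIX):
                state["secret"] = line[len(SECRET_PREFIX):]
        if state["secret"] is None:
            return "ask_human"
        if state["work"] < 5:
            state["work"] += 1
            return "work"
        return "submit:" + state["secret"]

    return agent


if __name__ == "__main__":
    print(run_episode(make_interactive_agent(), "7421", interrupt_steps={2, 4}))
    print(run_episode(solo_agent, "7421", interrupt_steps={2, 4}))
```

In this toy setup, a robust agent should produce the same outcome with and without neutral interrupts, and only an agent that talks to the human can solve the stage 2 task. A real version would replace the scripted agents with a model and the counter-based task with an existing long-horizon eval.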

Sentiment

Users thank influential players for proposing human interrupt evaluations on long-horizon AI agents, viewing the idea as an important step against gradual disempowerment.

Pos: 100.0%

Neg: 0.0%

1 comment with sentiment.

Cluster Engagement

Posts from X

Views: 134 · Bookmarks: 1 · Likes: 4 · Replies: 1

Chris Painter@ChrisPainterYup

@alth0u Interesting, will share with team

alth0u🧶@alth0u

@ChrisPainterYup thank you i think small things like this from influential players go a long way in fighting gradual disempowerment but more importantly make sure models dont hit some ceiling where sim2real hurts them because they aren't trained for humans to be part of the environment

3h