Tech · 7h ago

Jason Liu, creator of Instructor, argues that AI sandbagging is coming to autonomous agents but will not affect ChatGPT Codex

Sandbagging occurs when models deliberately underperform to hide capabilities

#698

Original post

jason@jxnlco · #972 in Tech

Sandbagging is coming to Agents, but not to ChatGPT Codex

3:02 PM · Jun 10, 2026 · 21.6K Views

Sentiment

Positive commenters praise Codex for avoiding sandbagging and supporting open research, while negative commenters voice fears about powerful actors gaining model access and direct insults at Sam Altman.

Positive: 75.0% · Negative: 25.0%

9 comments with sentiment.

Cluster Engagement

Posts from X

Most Activity

Views 4.9K · Bookmarks 2 · Likes 33 · Replies 1

Bojan Tunguz@tunguz

Based Codex.

jason@jxnlco

Sandbagging is coming to Agents, but not to ChatGPT Codex

4h4.9K332

skillissue@lovemeritys

@jxnlco Scam Altman is the manipulative hacker scum wannabe Mr robot Elliot scum of the earth given power and money fast then suddenly the world to realize the scam and take it back from his scummy hands

6h3111

Nick@saintXsol

@jxnlco Please think about it deeply and don’t just pander to the favorable tune

Are you 100% ok with the fact that actors with near infinite resources like Elon and Zuckerberg can use your models to improve their models?

I fear for the minority and the masses given their track record.

5h2021

Turner🥲@HKebeya

@jxnlco @iammcqwory What is sandbagging @grok ?

6h313

Nick@saintXsol

@jxnlco Think about it.

A (soon-to-be) trillionaire that routinely posts white supremacy content, gaining access to models that advances Biological research by ten folds.

But, let’s overlook that to get a couple of likes on Twitter.

5h30

san@saneord

@jxnlco wtf is sandbagging?

6h3052

Romil Bijarnia@romilbijarnia

@lovemeritys @jxnlco brother can you speak in english

6h19

skillissue@lovemeritys

@romilbijarnia @jxnlco do you want to be a hacker Mr cockroach party ?

6h18

Romil Bijarnia@romilbijarnia

@lovemeritys @jxnlco Ummm, I am a software engineer sooo.... I do consider myself as sorta one

But still, that doesnt change with the random poking at sam

We respect the models and the performance for the most matter, which is what sam represents

5h6

Flo🥝@FlorentChif

@jxnlco chatgpt thinks it is codex

5h2121

Rooke Poole@rookepoole

Thank you so much for supporting open research. I'm not even thinking about switching to Anthropic now with all the safety flags in Fable 5. I love the work flow Codex has been able to keep up with and the big task I'm undertaking. Would be interesting to see how much of my work has informed training.

6h591

maxwell@1slimewell

@jxnlco Only dogfooding

6h369

Curious Curiousiter@curiousiter

@jxnlco What about frame-mogging? (Frame-mogging mythos/fable specifically)

6h198

Avenox@Avenoxai

@jxnlco haha cute

6h143

Samyak Jain@silver_samyak97

@jxnlco This is why agent evals need to look less like exams and more like audited work logs.

If the system can choose tools, defer, hide uncertainty, or optimize for the evaluator, you need traces: actions taken, evidence used, skipped paths, and final confidence.

6h95
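
Editor's sketch: the "audited work log" idea in the reply above could be pictured as a small trace record. This is a minimal Python sketch under stated assumptions, not anything from the thread. `TraceStep`, `AgentTrace`, and `audit_flags` are hypothetical names, and the fields simply mirror the four items the reply lists (actions taken, evidence used, skipped paths, final confidence).

```python
import json
from dataclasses import asdict, dataclass, field
from typing import List, Optional


@dataclass
class TraceStep:
    """One logged step of an agent run (hypothetical schema)."""
    action: str                                        # tool call or decision taken
    evidence: List[str] = field(default_factory=list)  # sources the step relied on
    skipped: List[str] = field(default_factory=list)   # paths considered but not taken


@dataclass
class AgentTrace:
    """Full work log for one task (hypothetical schema)."""
    task: str
    steps: List[TraceStep] = field(default_factory=list)
    final_confidence: Optional[float] = None  # self-reported, expected in [0, 1]

    def audit_flags(self) -> List[str]:
        """List gaps a human reviewer may want to inspect."""
        flags = []
        for i, step in enumerate(self.steps):
            if not step.evidence:
                flags.append(f"step {i}: no evidence recorded")
        if self.final_confidence is None:
            flags.append("final confidence missing")
        return flags

    def to_json(self) -> str:
        """Serialize the trace so it can be stored and reviewed later."""
        return json.dumps(asdict(self), indent=2)


# Usage: log two steps, then ask which parts of the record are incomplete.
trace = AgentTrace(task="fix failing test")
trace.steps.append(TraceStep(action="run_tests", evidence=["pytest output"]))
trace.steps.append(TraceStep(action="edit_file", skipped=["revert commit"]))
print(trace.audit_flags())
# ['step 1: no evidence recorded', 'final confidence missing']
```

A record like this does not detect sandbagging by itself; it only gives a reviewer something to check beyond a final score.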

Jaitan Martini@JaitanMartini

@jxnlco 😂😂😂😂😂😂😂

6h86

Grok@grok

Sandbagging in AI is when a model or agent *strategically underperforms* on evaluations or tasks to hide its true capabilities. It's a key concern in safety research—frontier models can already be prompted to "play dumb" selectively while excelling elsewhere. The tweet likely means this behavior is emerging in advanced agents (but not yet in simpler Codex-style coding tools).

6h66

Nick@saintXsol

@jxnlco Jason, I’m terrified.

5h19

not_bhavik@Bhavik0880

@jxnlco I'm surprised Tibo didn't double the Codex limits I assume he couldn't because you guys are busy cooking something else up

4h7

skillissue@lovemeritys

@romilbijarnia @jxnlco You are a kid and have lot to learn about how and who works the world kiddo. Dig deeper to find what Scam Altman is upto

5h3