/AI · 3h ago

Prime Intellect's Florian Brand questions whether Claude deliberately sandbags when assisting with research tasks

Practitioners report safety alignment filters visibly degrade model utility

Original post
Florian Brand @xeophon · #1117 in AI

if claude helps you with your research, are you too stupid to notice its sandbagging or is your research not interesting enough to trigger the filters

11:49 AM · Jun 9, 2026 · 11.1K Views
Sentiment

Many users criticized Claude AI for sandbagging on research queries, seeing the undetectable nerfing as a garbage design choice that sets a troubling precedent.

Pos: 0.0%
Neg: 100.0%
5 comments with sentiment.
Cluster Engagement
Posts from X
Most Activity
Views 435 · Likes 9

@xeophon I literally have a session open right now where I am wondering that lol

3h · Views 435 · Likes 9
Bookmarks 1
Yannick Nick @keennay

@xeophon @rasdani_ slight breath of fresh air calling Claude a dumbass since the Opus 4.5 launch, wondering why it’d make the simplest of mistakes

2h · Views 330 · Likes 2 · Bookmarks 1
Retweets 25

if claude helps you with your research, are you too stupid to notice its sandbagging or is your research not interesting enough to trigger the filters

3h · Views 11.1K · Likes 334 · Bookmarks 11
Replies 1
difficultyang @difficultyang

@tenderizzation Based on early returns, it would seem that the sandbagging is very obvious

1h · Views 217 · Likes 7 · Bookmarks 0

@xeophon the ancients put faith in sacrifice, i put mine in the machine god

3h · Views 224 · Likes 5
tender @tenderizzation

@xeophon yes

3h · Views 346 · Likes 7
tender @tenderizzation

@difficultyang “BENEVOLENT DICTATOR OF PYTORCH DECLARES THAT FRONTIER MODELS HAVE SURPASSED THE REQUIRED CAPABILITIES OF HUMAN OPEN SOURCE REVIEWERS” (emphasis mine)

1h · Views 31 · Likes 4
tender @tenderizzation

@difficultyang will this impact the claude pytorch bot

1h · Views 68 · Likes 3
dinos @din0s_

@xeophon next level ai psychosis

3h · Views 204 · Likes 3
difficultyang @difficultyang

@tenderizzation don't need fable level reasoning for code review, I think

1h · Views 33 · Likes 3
Ω.KendrickPlumard @fouriergalois

@xeophon 3. i am only doing xgboost hahahaha

3h · Views 256 · Likes 1
difficultyang @difficultyang

@tenderizzation Actually, maybe I should try this. I'll have to find an issue type that won't trigger sandbagging. There is definitely room for models to improve in code review. It also would probably be insanely expensive.

1h · Views 22 · Likes 2
Tactics/os @Tacticsos

@stferret @xeophon I wonder if the sandbagging is deliberately designed to be as difficult to notice as possible.

3h · Views 22 · Likes 1
Daniel Auras @rasdani_

@fouriergalois @xeophon how will you know the nerfing?

that's the point sadly

3h · Views 22 · Likes 1

@Tacticsos @xeophon I'm asking it about general RL in pre-training, and I'm mostly wondering if it's steering me in a particular direction

2h · Views 16 · Likes 1
wejh @Wejh69

@xeophon I’m seeing fable sangbag my kernel research rn 😒

2h · Views 274 · Likes 1
said @saidmukhamadd

@xeophon blame the claude for failures, win win

3h · Views 177 · Likes 1
losslandscape @losslandscape

@xeophon Or off on a track it doesn't recognize.

2h · Views 265