if claude helps you with your research, are you too stupid to notice its sandbagging or is your research not interesting enough to trigger the filters
Prime Intellect's Florian Brand questions whether Claude deliberately sandbags when assisting with research tasks
Practitioners report safety alignment filters visibly degrade model utility
Many users criticized Claude AI for sandbagging on research queries, seeing the undetectable nerfing as a garbage design choice that sets a troubling precedent.

@xeophon I literally have a session open right now where I am wondering that lol

@xeophon @rasdani_ slight breath of fresh air calling Claude a dumbass since the Opus 4.5 launch, wondering why it’d make the simplest of mistakes
@tenderizzation Based on early returns, it would seem that the sandbagging is very obvious

@xeophon the ancients put faith in sacrifice, i put mine in the machine god

@xeophon yes

@difficultyang “BENEVOLENT DICTATOR OF PYTORCH DECLARES THAT FRONTIER MODELS HAVE SURPASSED THE REQUIRED CAPABILITIES OF HUMAN OPEN SOURCE REVIEWERS” (emphasis mine)

@difficultyang will this impact the claude pytorch bot

@xeophon next level ai psychosis

@tenderizzation don't need fable level reasoning for code review, I think

@xeophon 3. i am only doing xgboost hahahaha

@tenderizzation Actually, maybe I should try this. I'll have to find an issue type that won't trigger sandbagging. There is definitely room for models to improve in code review. It also would probably be insanely expensive.

@xeophon why not both?

@stferret @xeophon I wonder if the sandbagging is deliberately designed to be as difficult to notice as possible.

@fouriergalois @xeophon how will you know the nerfing?
that's the point sadly

@Tacticsos @xeophon I'm asking it about general RL in pre-training, and I'm mostly wondering if it's steering me in a particular direction

@xeophon I’m seeing fable sandbag my kernel research rn 😒

@xeophon blame the claude for failures, win win

@xeophon indeed

@xeophon Or off on a track it doesn't recognize.