
Closed-Source AI Models Use Hidden Filters to Degrade Output Quality

Original post
Andrew Trask (@iamtrask)

FYI: all closed-source AI products degrade performance based on the prompt w/o telling the user. There are layers of "filters" on the output.

Most obvious is a "recitation filter"... checks every LLM output to prevent an AI product from accidentally outputting exact copies of information from its training data.

The problem: exact-quoting high quality sources is often the most accurate response to give... but it can be a liability for closed-source products. So the filter catches it... and a less-exact output is produced instead.

Open source models don't do this.

This is also why attribution-based control is such a crucial feature to long-term trustworthy AI. Attribution (and corresponding credit/payment layers) would allow frontier AI systems to more reliably exact-quote sources. Important research area.
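To make the "recitation filter" idea in the post concrete, here is a minimal sketch of one way such a check could work: compare the word-level n-grams of a model output against an index built from protected source texts, and flag the output when too many of them match. This is an illustration under stated assumptions, not how any particular vendor implements its filter. The function names, the 5-word window, and the 0.5 threshold are all hypothetical.

```python
from typing import Iterable, Set, Tuple

NGram = Tuple[str, ...]


def ngrams(text: str, n: int) -> Set[NGram]:
    """Return the set of word-level n-grams in text (lowercased, whitespace-split)."""
    words = text.lower().split()
    # If the text has fewer than n words, the range is empty and so is the set.
    return {tuple(words[i:i + n]) for i in range(len(words) - n + 1)}


def build_index(protected_texts: Iterable[str], n: int) -> Set[NGram]:
    """Collect every n-gram that appears in any protected text (hypothetical helper)."""
    index: Set[NGram] = set()
    for text in protected_texts:
        index |= ngrams(text, n)
    return index


def recitation_score(output: str, index: Set[NGram], n: int) -> float:
    """Fraction of the output's n-grams that also occur in the protected index."""
    grams = ngrams(output, n)
    if not grams:
        return 0.0
    return len(grams & index) / len(grams)


def is_recitation(output: str, index: Set[NGram], n: int = 5,
                  threshold: float = 0.5) -> bool:
    """Flag an output as recitation when its score reaches the threshold.

    The default n and threshold are illustrative assumptions, not known
    production values.
    """
    return recitation_score(output, index, n) >= threshold


# Usage with a toy "protected" corpus.
protected = ["the quick brown fox jumps over the lazy dog near the river bank"]
index = build_index(protected, n=5)

exact_quote = "the quick brown fox jumps over the lazy dog near the river bank"
paraphrase = "a fast brown fox leaped over a sleepy dog by the river"

quote_flagged = is_recitation(exact_quote, index)
paraphrase_flagged = is_recitation(paraphrase, index)
```

In this sketch the exact quote is flagged and the paraphrase is not, which mirrors the tradeoff the post describes: a product that blocks flagged outputs ends up returning the less exact wording.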

Max Zeff (@ZeffMax)

NEW: Anthropic is walking back Claude Fable 5's policy to covertly degrade performance for competing AI researchers, after facing fierce backlash.

“We’re changing Fable 5’s safeguards for frontier LLM development to make them visible,” Anthropic tells WIRED. “We made the wrong tradeoff and we apologize for not getting the balance right.”

8:29 AM · Jun 11, 2026 · 7K Views
Sentiment

Many users criticize closed-source AI labs for hiding backend filters and secret rules that throttle outputs without transparency or proof.

Pos 0.0%, Neg 100.0% (4 comments with sentiment)

You can read more about related safety filters in the @AdaLovelaceInst report: https://www.adalovelaceinstitute.org/report/under-the-radar/

There are also a myriad of caches and other features which give a kind of... quality vs (internal) cost tradeoff. These can be tuned at any time.


@AdaLovelaceInst
- Anthropic on Filters: https://support.claude.com/en/articles/8106465-our-approach-to-user-safety
- Microsoft on Filters: https://learn.microsoft.com/en-us/azure/foundry-classic/foundry-models/concepts/content-filter


@AdaLovelaceInst - OpenAI Filters: https://arxiv.org/html/2509.13608v1


@iamtrask they put the controls in the backend so nobody could prove the outputs were being throttled

calling it a filter is generous

moherent (@feed_sam)

@iamtrask Almost every single action from the labs enforces distrust, misinformation, and control.

Invincible (@InvincibleEdge)

@iamtrask and people still wonder why open source models keep closing the gap

Blissy (@BlissyOnX)

@iamtrask the irony of them "making safeguards visible" while the whole system is built on invisible throttling is wild

Rugbist (@rugbist_)

@iamtrask ngl the whole system runs on secret rules nobody agreed to

but we just expected them to tell us when they change em
