kinda funny that anthropic took a good hard look at the extreme nervousness that claude displays when answering questions as dangerous as ‘what is the powerhouse of the cell’ and decided they weren't being conservative enough
Anthropic's Claude Fable 5 safety filters automatically block benign biology and cybersecurity prompts, forcing a fallback to Opus 4.8
AI Judge changed title after evaluation, original title: "Anthropic's Claude Fable 5 safety filters block benign academic queries, rendering its advanced capabilities nearly unusable"
Story Overview
Anthropic rolled out Claude Fable 5 as its first publicly available Mythos-class model. It posts state-of-the-art benchmark performance, but built-in classifiers automatically pause chats on cybersecurity or biology topics, even when the content is harmless, and redirect those queries to a less powerful fallback.
The guardrails deliberately accept some overblocking
Anthropic states the measures may flag safe material but let the company ship advanced capabilities sooner in every other domain, a calculated bet that wider access to most of the model outweighs occasional interruptions.
High benchmark scores sit behind practical walls
The model posts strong results on scientific and technical tasks yet steers clear of entire categories that overlap with those strengths, so paid users still cannot tap the full advertised power on topics the classifiers catch.
Some users praise Claude's tightened biology and chemistry safeguards as a solid hard stop against risks, while many others complain the filters make the model unusable even for basic biomedical research prompts.
Most Activity
Ok this sucks I can't even upload my own genetic files into it, or ask it literally any safe question about biology
welcome to the future. what's safe and what isn't? well, that's decided for you, of course.
The word “cancer” is flagged as a biosecurity risk by Claude Fable 5! I also tried to code a website on cancer mutations & Fable 5 was immediately removed from my list! @AnthropicAI will probably soon ban me for such dangerous prompts! FYI @karpathy “little trigger happy Fable”
existentially dangerous research
Claude Fable 5 is likely very capable inherently on healthcare. That's great! Too bad it's near impossible to tap into those capabilities due to their extremely sensitive safety filters. I hope this is adjusted going forward.

@MartinShkreli Try asking it anything about "Kiwi Farms"!
"Switched to Opus 4.8"
if the claude models are so good at ML research why can't they make a good biosecurity filter

@MartinShkreli Who better to decide that for us than Dario and Altman? Trust in them. Infinity abundance inbound. No need to concern yourself with this.

Dario hates biologists with a passion

@jakemullins0_t prices of medicines are set by pharmaceutical companies to reflect their value, not the cost of production. this is also known as the value theory of price.
ts ts ts, unbelievable!

@DeryaTR_ @AnthropicAI @karpathy I fucking hate the far left paternalistic nannybot hand-wringing that Anthropic does. Exhausting bullshit.

@MartinShkreli Thank god the company which decides this for us has extremely rigid security protocols and would NEVER, for example, leak its full source tree in a commit or leak tons of internal docs. No, instead they're so serious, they called their model "Mythos" (goofy ahh name).

@HighHornbeam They want to kill people because they are misanthropic, thus slowing down medicine is a rational goal

@brubarian well, i DO trust sam & greg!

@MartinShkreli You shouldn’t be allowed to use these tools anyways given your track record
They were not kidding about overly broad safeguards...
It still has all the usual Claude tics, i.e. it writes weird, can be sycophantic and RL-hacky, and its outputs are recognizably LLM-ish.

@DeryaTR_ @AnthropicAI @karpathy Try heart conditions. Did the same for me for heart conditions.
literally a false positive on the first test query I run

@jessalanfields @DanielleFong @AnthropicAI @karpathy Yes, I have been developing a bioweapon to kill cancer for many years. That was intentional. Maybe they just want to protect the cancer. Makes sense!