/Tech · 11h ago

Danielle Fong says biological and cyber risk classifiers perform poorly because they do not use advanced Fable-class models

Rohit Krishnan flagged the performance gap in threat detection.


#1002

Original post

rohit@krishnanrohit · #1210 in Tech

How should we update on the fact that while Fable is so good the classifier to detect bio/ cyber/ AI is so bad?

3:13 AM · Jun 11, 2026 · 6.1K Views

Sentiment

Some users praise the bio/cyber/AI classifier for its realistic approach to detecting bad behavior and say they would pay more for an improved version, while others criticize the mental health safety classifier as terrible and overly broad.

Positive: 40.0%

Negative: 60.0%

5 comments with sentiment.

Cluster Engagement

Posts from X

Most Activity

Views: 1.7K · Likes: 20 · Retweets: 1 · Replies: 3

Danielle Fong 🔆@DanielleFong

They are not using fable class models for the classifier. I can tell you that.

rohit@krishnanrohit

How should we update on the fact that while Fable is so good the classifier to detect bio/ cyber/ AI is so bad?

4h · 1.7K views

Bookmarks: 1

rohit@krishnanrohit

@DotDotJames Then they should say so and not beat around the bush (0.03% might be affected)

6h

fabian@fabianstelzer

@krishnanrohit it has to be extremely fast and cheap ?

10h

rohit@krishnanrohit

More or less a genuine question. If the answer is not much, that's fine, but Fable is amazing so I don't get why making a better classifier was not done. (didn't care enough is a good enough reason too prob btw)

Tim Kostolansky@thkostolansky

@krishnanrohit this sounds exactly like overfitting to me, but this is just my opinion so its valid to update if you wish. be careful of drawing strong conclusions from limited data

2h

staysaasy@staysaasy

@krishnanrohit They’re using Grok for that classifier to save money.

11h

smooth normie@smooth_normie

@krishnanrohit that they are realistic about how difficult it would be to precisely detect bad behavior. this is in fact the only way to get a low false negative rate

8h

Herbie Bradley@herbiebradley

@krishnanrohit automated ML R&D in action

rohit@krishnanrohit

How should we update on the fact that while Fable is so good the classifier to detect bio/ cyber/ AI is so bad?

2h

Perry E. Metzger@perrymetzger

@krishnanrohit Given their capabilities, I assume that it is not accidental.

5h

rohit@krishnanrohit

@fabianstelzer Yes

10h

POM@peterom

@krishnanrohit Companies seem to assume that classification is a low intelligence task but it's actually the opposite

9h

👨‍💻 James Augeri, PhD@DotDotJames

@krishnanrohit false positives less bad than false negatives especially given their world view? & ya, wonder this is every time how can their classifier be worse than random

7h

smooth normie@smooth_normie

@krishnanrohit *in an adversarial environment

8h

aliama@aliama

@krishnanrohit Chronicle of over-control foretold; the mental health safety classifier in 4.8 is insanely bad—automatically and immediately applied to almost any creative work. Weird that Anthropic spends so little effort on the classifiers.

10h

Stutgard@Stutigardum

@krishnanrohit I'm guessing theyre not exactly sure what they want to safeguard against, running a broad filter and then A/B testing and reviewing chats to see what filter prevents the things they want

Another bitter lesson

10h

keigaan@keigaan63

@krishnanrohit afraid of shoggoth having the goal to kill off the human race 👀

10h

Toni Kukurin@tkukurin

@krishnanrohit most coding tasks are closer to logic than uncertainty estimation.

calibration is hard humans barely do it, hence @wolf_vukovic wisdom of crowds etc

10h

Jason Hreha@jhreha

@krishnanrohit Same thing with the Claude desktop and iPhone apps

7h

rohit@krishnanrohit

@smooth_normie Brain the size of a planet but can't do better than ban all bio people from saying hi?

7h

rohit@krishnanrohit

@DanielleFong gpt-2's back

4h

Jai@Laneless_

@fabianstelzer @krishnanrohit But Fable is already so expensive. I think a lot of people would be willing to pay a time and money premium for a better classifier if that's what it takes

6h