Tech · 20d ago

AVERI Executive Director Miles Brundage warns of overconfidence in the Pangram system in a brief, no-detail tease

The post provides no technical benchmarks or performance data.

Original post

Miles Brundage@Miles_Brundage · #57 in Tech

Many of you are vastly overconfident in Pangram

11:09 PM · May 25, 2026 · 37.8K Views

Sentiment

Commenters with a positive view say confidence in Pangram is warranted after testing. Those with a negative view call that confidence premature and the tool ineffective, citing flaws such as missed style-matched text and lack of traction.

Positive: 15.4%

Negative: 84.6%

14 comments with sentiment.

Cluster Engagement

Posts from X

Most Activity

Views: 7.4K · Bookmarks: 10 · Likes: 81

Seth Lazar@sethlazar

I have been testing it *very* extensively, and I think that, as of right now, the confidence is warranted. However I do think that people should have a clearer sense of what it's saying when it says something is 100% AI generated. It breaks the item down into chunks, generally 350-400 words, and makes a prediction about whether that chunk contains some AI. So "This paper is 100% AI generated" really means "100% of the tokens in this paper are in a chunk that we believe has AI in it".

20d · 7.4K views · 81 likes · 10 bookmarks
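The chunk-level reading described in the post above can be sketched in a few lines. This is only an illustration of that description, not Pangram's actual implementation: the function names, the fixed 375-word chunk size, and the toy detector are all assumptions made for the example, and it counts words rather than model tokens.

```python
from typing import Callable, List


def split_into_chunks(text: str, chunk_words: int = 375) -> List[List[str]]:
    """Split text into consecutive chunks of at most `chunk_words` words.

    375 is an assumed midpoint of the "350-400 words" mentioned in the post.
    """
    words = text.split()
    return [words[i:i + chunk_words] for i in range(0, len(words), chunk_words)]


def percent_in_flagged_chunks(
    text: str,
    chunk_has_ai: Callable[[List[str]], bool],
    chunk_words: int = 375,
) -> float:
    """Percentage of words that sit inside a chunk the detector flagged.

    `chunk_has_ai` is a hypothetical per-chunk classifier returning True
    when it believes the chunk contains *some* AI-generated text.
    """
    chunks = split_into_chunks(text, chunk_words)
    total = sum(len(chunk) for chunk in chunks)
    if total == 0:
        return 0.0
    flagged = sum(len(chunk) for chunk in chunks if chunk_has_ai(chunk))
    return 100.0 * flagged / total


def toy_detector(chunk: List[str]) -> bool:
    """Toy stand-in for a real classifier: flags any chunk containing 'AIWORD'."""
    return "AIWORD" in chunk


# 750 words, with exactly one marker word in each 375-word chunk.
doc = " ".join((["human"] * 374 + ["AIWORD"]) * 2)
score = percent_in_flagged_chunks(doc, toy_detector)
print(score)  # 100.0, although only 2 of the 750 words are "AI"
```

Under these assumptions, a document can score 100% while almost all of its words are human-written, which is the gap between the headline number and the common reading of it that the post points out.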

Retweets: 17

Miles Brundage@Miles_Brundage (original post, quoted above)

20d37.8K45625

Replies: 3

trey@Comparativist

@gbrl_dick @Miles_Brundage for me… just looks like a moral panic around AI generated text. And I’ve seen plenty of false positive examples and know decent prompting gets around it.

20d2886

Jason Dean@_Jason_Dean_

@Miles_Brundage @adastroworld Can you show a way to consistently write human-written text that Pangram identifies as AI?

If you can’t, then its usefulness is obvious (given that its false negative rate is also decent)

20d1.2K221

Gabriel@gbrl_dick

@Miles_Brundage @Comparativist why do you think so miles? and in what direction?

20d2.1K171

Itai Sher@itaisher

@sethlazar @Miles_Brundage That’s very different than what people think it means

20d75514

Sean@sean_from_earth

@Miles_Brundage I have full confidence that it is not effective and will be a distant memory in 6 months.

20d92414

Mathias Kirk Bonde@BondeKirk

@Miles_Brundage My read is that the false positive is negligible, but the false negative is high.

Is that right?

20d57251
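Several replies in this thread turn on the difference between false positives (human text flagged as AI) and false negatives (AI text that is missed). A minimal sketch of how the two rates are computed from labeled examples follows. The labels and predictions are made up purely for illustration and say nothing about Pangram's real performance; `error_rates` is a hypothetical helper.

```python
from typing import Sequence, Tuple


def error_rates(is_ai: Sequence[bool], flagged: Sequence[bool]) -> Tuple[float, float]:
    """Return (false_positive_rate, false_negative_rate).

    is_ai[i]   -- ground truth: True if sample i was AI-generated
    flagged[i] -- detector output: True if sample i was flagged as AI
    """
    if len(is_ai) != len(flagged):
        raise ValueError("is_ai and flagged must have the same length")
    false_pos = sum(1 for truth, pred in zip(is_ai, flagged) if not truth and pred)
    false_neg = sum(1 for truth, pred in zip(is_ai, flagged) if truth and not pred)
    n_human = sum(1 for truth in is_ai if not truth)
    n_ai = sum(1 for truth in is_ai if truth)
    fpr = false_pos / n_human if n_human else 0.0
    fnr = false_neg / n_ai if n_ai else 0.0
    return fpr, fnr


# Illustrative data only: 4 human samples, then 4 AI samples.
truth = [False, False, False, False, True, True, True, True]
preds = [False, False, False, False, True, True, False, False]
fpr, fnr = error_rates(truth, preds)
print(fpr, fnr)  # 0.0 0.5
```

In this toy case no human text is flagged (false positive rate 0.0) while half of the AI text slips through (false negative rate 0.5), the pattern this reply asks about.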

Gabriel@gbrl_dick

@Comparativist @Miles_Brundage oh interesting — i think there’s something to what you say on the moral panic side, but from my (cursory, low expertise) review the false positive rates seem quite low.

20d23081

Latent Node@latent_node

@Miles_Brundage There is an open source version based on some of the data they released that seems to work, so maybe they are OK. https://huggingface.co/spaces/adaptive-classifier/ai-detector

20d78911

Tuhin Chakrabarty@TuhinChakr

@Miles_Brundage Can you empirically show why? I feel this kind of rhetoric doesn’t help and leads to embarrassing situations like the AI short story awarded the Commonwealth prize

20d51850

teo@teodorio

@Miles_Brundage @47fucb4r8c69323

20d883

Seth Lazar@sethlazar

@itaisher @Miles_Brundage Yep

20d7025

Miles Brundage@Miles_Brundage

@TuhinChakr I don't think that situation was caused by humility re: Pangram, but by people who didn't even bother running it and hadn't spent much time using AI at all in order to notice the warning signs (also see my replies elsewhere re: my more detailed thoughts)

20d39010

Violeta Insights@violetainsights

@Miles_Brundage Confidence gets expensive the first time security, legal, and audit ask who approved the workflow, what was logged, and who owns the miss.

20d742

Tuhin Chakrabarty@TuhinChakr

@Miles_Brundage I found your replies unsatisfying. Watermarking has lots of limitations, plus enforcing it across all models is a policy problem. I think definitive or suggestive is an interesting angle. FWIW if you don't use AI to write, Pangram won't flag. If you do, on the contrary, idk

20d13710

judah@joodalooped

@Miles_Brundage i mean i don't even need pangram to tell?

20d2683

Eric Bye@erictronai

@Miles_Brundage It takes me about 30 seconds to take a page from 100% ai to 100% human. Usually about two changes, if I already used a half assed prompt.

20d6392

trey@Comparativist

@gbrl_dick @Miles_Brundage as to the moral panic: it’s treating every type of AI assisted writing as the equivalent of a “ChatGPT, write an essay about X” like slopmills do. Was it Stanford that just banned even AI assisted brainstorming for law students? That’s literally crazy.

20d564

Ishan Khire@IshanKhire

@Miles_Brundage In what sense? False negatives maybe, false positives sound unlikely?

20d5392