Tech · 4h ago

Gary Marcus and Danielle Fong argue David Sacks misunderstands AI safety by claiming LLM jailbreaks are easily patched

Fong also disputed using export controls to regulate security exploits


Original post

Adam Cochran (adamscochran.eth)@adamscochran

Really proves how little Sacks knows about AI…

“Just fix the jailbreak man”

Aggr News@AggrNews

DAVID SACKS SAYS ANTHROPIC'S EXPORT CONTROL WAS DUE TO COMPANY NOT WANTING TO FIX JAILBREAK

ADMIN HOPES THEY REMEDIATE SAFETY ISSUE TO LIFT EXPORT CONTROL: X

11:38 AM · Jun 13, 2026 · 9.8K Views

Sentiment

Many users distrust David Sacks' remarks on Anthropic jailbreak fixes and export controls, viewing them as propaganda or unreliable, while a few agree with the critiques.

Positive: 11.6%

Negative: 88.4%

16 comments with sentiment.

Cluster Engagement

Posts from X

Most Activity

VIEWS 2.7K

David Hinkle@Drachs1978

@adamscochran @GaryMarcus This might be a come to Jesus moment if the regulators suddenly realize that jailbreak prevention isn't real.

4h2.7K2

LIKES 14

Just Some Guy@JohnCrown1967

@alexandrosM We've adopted that framing because Anthropic did their level best to make it clear that it was a weapon, without actually using the word "weapon".

8h85414

RETWEETS 15

Alexandros Marinos 🏴‍☠️@alexandrosM

I generally love Sacks here but this message is not demonstrating his usual quality of judgement. He is balancing between various internal stakeholders he needs to satisfy and delivering a mushy message that doesn't help explain the situation.

What does it mean that Anthropic was demanded to "fix the jailbreak"? How does he know it's easy? And why is export controls the right tool for the job?

Is this now the future? Whenever Pliny The Elder releases a jailbreak, models become unavailable to products that use them downstream until the model can be... put back in jail? Why are we adopting the ridiculous framing of Anthropic that considers their model to be a weapon?

David Sacks@DavidSacks

I’ve had a number of conversations with folks inside and outside government about the current situation with Anthropic, and here is what I believe to be true:

— As we know, Anthropic publicly released its Mythos class models earlier this week under the commercial name Fable.

— Fable is Mythos with guardrails. But if those guardrails fail, then you’ve exposed Mythos and its advanced cyber capabilities to people who shouldn’t have them. (Keep in mind that Anthropic itself widely promoted the idea that Mythos was a cyberweapon and needed to be regulated as such. They asked for government regulation of Mythos and championed the guardrails on Fable. If there is a vulnerability — big or small — it is Anthropic’s responsibility to patch.)

— A highly credible trusted partner of both Anthropic and the USG who was testing Fable came forward with a jailbreak of those guardrails. The Admin asked Dario to fix the jailbreak or de-deploy the model. Dario refused.

— In their blog post, Anthropic defended its decision by saying the jailbreak isn’t serious. That is not what the trusted partner and the USG believe; nor is that kind of minimizing language consistent with Anthropic’s brand as the AI safety company. It’s difficult to fathom how they could claim a jailbreak allowing operability of a cyber weapon could be defined as not “serious.”

— In the past, Anthropic has always said that safety must be top priority and taken super seriously. In this case, Anthropic prioritized the continued offering of the consumer model over safety.

— In reaction, the Admin issued the export control. The Admin did this reluctantly. It’s been very surprised that Anthropic hasn’t wanted to cooperate with a reasonable safety request (ie fixing the jailbreak issue). Anthropic’s reaction is very much at odds with their branding and ethos as a safe AI research community.

— The Admin’s hope now is that Anthropic remediates the safety issue, the export control is lifted, and Fable goes back into general release. The Admin wants all of this to happen as soon as possible. It is frankly bewildered that Anthropic hasn’t wanted to comply with safety requests that it previously said were its highest priority.

— Those trying to misdirect and tie this action to the prior DoW/Anthropic issues are wrong. The Admin values Anthropic’s technical capabilities and feels that this issue, while serious, should be easily resolved. The ball is in Anthropic’s court.

8h27.8K21924

REPLIES 3

Alexandros Marinos 🏴‍☠️@alexandrosM

@JohnCrown1967 And is their success in framing the debate a good thing for the future?

7h7369

John Ennis@johnennis

@alexandrosM It seems to me the central issue is Anthropic trying to have its cake and eat it too

If they had just stayed under the radar like OpenAI has recently, progressively releasing better models without hysterical fear mongering, probably this situation would not have happened

6h35813

ir0niv4n@ir0nivan

@markankcorn @alexandrosM @elder_plinius You're missing the point entirely. 1. Pliny was brought up because he always jailbreaks models as soon as they come out 2. Fixing jailbreaking is like fixing hallucinations - nobody knows how to do it. And if you agree to "fix a jailbreak" you're opening the door to insanity.

3h171

ir0niv4n@ir0nivan

@markankcorn @alexandrosM @elder_plinius 1. Adding KYC is very far from "a simple fix", I'm sure that there's some equivalent of Hawking's quote regarding sales as a function of the number of equations in a book 2. This would not be a "fix" for the jailbreak at all

2h101

BP@BP_Gamma

@alexandrosM There's actually a super easy solution. Don't claim your model is a cyberweapon, and you're in the clear.

5h2241

Note Able@curiousgangsta

@alexandrosM maybe don't self-classify (market) ur shit as a potential national security threat while having a history of trying to dictate how the usg legally uses ur tech while also advocating for the govts ability to block model releases they deem dangerous.

6h1675

JD@JustDeprecated

@alexandrosM I appreciate your framing but I do think it's distinct from what Sacks is getting at. He is presenting an "objective" view of what he has heard where the judgments made are explanations, not his opinion. Your call-out fits better as an additive instead of a corrective imo

7h449

Didymus@BasilEsq_

@alexandrosM If the government are so easily swayed by a jailbreak demo, they would be equally impressed by a good faith band-aid and would consider future turns of the crank as evidence of partnership. Saying “it’s not important; your trusted partner is full of shit” is not a good approach.

4h1221

Mark Ankcorn@markankcorn

@alexandrosM It was Amazon not @elder_plinius and they offered the fix that both Sacks and Andy Jassy thought was pretty minimal, and Dario rejected

4h1061

Aykut Uz@aykutuz

@alexandrosM That may be the answer:

7h3214

Dave Turney@wwwdaveturney

@alexandrosM clearly propaganda, which makes the govt case here extremely suspect.

7h861

🇺🇸dire_wolf_alpha, America first🇺🇸@direwolfalpha1

@alexandrosM @grok did Dario call Mythos a cyber weapon?

5h117

FireKraKer@Firekraker72

@alexandrosM I thought he explained it very well honestly

4h181

Grokx Fan 🇺🇸@tdsfixer

@alexandrosM Does @DavidSacks or @foundersfund hold @AnthropicAI shares in their private funds ? @grok List all AI companies Pvt shares they hold

6h59

Mark Ankcorn@markankcorn

@ir0nivan @alexandrosM @elder_plinius Not missing the point at all. I suspect the “fix” proposed was KYC or geofencing to US-only IP addresses and Dario balked because he doesn’t want to be seen as a pawn of the USG and/or lose int’l revenues. Sacks isn’t technical, “jailbreak” here ≠ Pliny-esque exploit

3h121

Dave Turney@wwwdaveturney

@G2DJordan @alexandrosM Ol Pete posted this. It's connected.

4h31

Jordan | G2 Dynamics@G2DJordan

@wwwdaveturney @alexandrosM One thing isn’t even remotely connected to the other and the only reason you would come to that conclusion is personal bias. I doubt the same government that just snatched Fable from everybody was begging for anything.

6h22