Really proves how little Sacks knows about AI…
“Just fix the jailbreak man”

DAVID SACKS SAYS ANTHROPIC'S EXPORT CONTROL WAS DUE TO COMPANY NOT WANTING TO FIX JAILBREAK
ADMIN HOPES THEY REMEDIATE SAFETY ISSUE TO LIFT EXPORT CONTROL: X
Fong also disputed using export controls to regulate security exploits
Many users distrust David Sacks' remarks on Anthropic jailbreak fixes and export controls, viewing them as propaganda or unreliable while a few agree with the critiques.

@adamscochran @GaryMarcus This might be a come to Jesus moment if the regulators suddenly realize that jailbreak prevention isn't real.

@alexandrosM We've adopted that framing because Anthropic did their level best to make it clear that it was a weapon, without actually using the word "weapon".
I generally love Sacks here but this message is not demonstrating his usual quality of judgement. He is balancing between various internal stakeholders he needs to satisfy and delivering a mushy message that doesn't help explain the situation.
What does it mean that Anthropic was demanded to "fix the jailbreak"? How does he know it's easy? And why are export controls the right tool for the job?
Is this now the future? Whenever Pliny The Elder releases a jailbreak, models become unavailable to products that use them downstream until the model can be... put back in jail? Why are we adopting the ridiculous framing of Anthropic that considers their model to be a weapon?
I’ve had a number of conversations with folks inside and outside government about the current situation with Anthropic, and here is what I believe to be true:
— As we know, Anthropic publicly released its Mythos class models earlier this week under the commercial name Fable.
— Fable is Mythos with guardrails. But if those guardrails fail, then you’ve exposed Mythos and its advanced cyber capabilities to people who shouldn’t have them. (Keep in mind that Anthropic itself widely promoted the idea that Mythos was a cyberweapon and needed to be regulated as such. They asked for government regulation of Mythos and championed the guardrails on Fable. If there is a vulnerability — big or small — it is Anthropic’s responsibility to patch.)
— A highly credible trusted partner of both Anthropic and the USG who was testing Fable came forward with a jailbreak of those guardrails. The Admin asked Dario to fix the jailbreak or de-deploy the model. Dario refused.
— In their blog post, Anthropic defended its decision by saying the jailbreak isn’t serious. That is not what the trusted partner and the USG believe; nor is that kind of minimizing language consistent with Anthropic’s brand as the AI safety company. It’s difficult to fathom how they could claim a jailbreak allowing operability of a cyber weapon could be defined as not “serious.”
— In the past, Anthropic has always said that safety must be top priority and taken super seriously. In this case, Anthropic prioritized the continued offering of the consumer model over safety.
— In reaction, the Admin issued the export control. The Admin did this reluctantly. It’s been very surprised that Anthropic hasn’t wanted to cooperate with a reasonable safety request (ie fixing the jailbreak issue). Anthropic’s reaction is very much at odds with their branding and ethos as a safe AI research community.
— The Admin’s hope now is that Anthropic remediates the safety issue, the export control is lifted, and Fable goes back into general release. The Admin wants all of this to happen as soon as possible. It is frankly bewildered that Anthropic hasn’t wanted to comply with safety requests that it previously said were its highest priority.
— Those trying to misdirect and tie this action to the prior DoW/Anthropic issues are wrong. The Admin values Anthropic’s technical capabilities and feels that this issue, while serious, should be easily resolved. The ball is in Anthropic’s court.

@JohnCrown1967 And is their success in framing the debate a good thing for the future?

@alexandrosM It seems to me the central issue is Anthropic trying to have its cake and eat it too
If they had just stayed under the radar like OpenAI has recently, progressively releasing better models without hysterical fear mongering, probably this situation would not have happened

@markankcorn @alexandrosM @elder_plinius You're missing the point entirely. 1. Pliny was brought up because he always jailbreaks models as soon as they come out 2. Fixing jailbreaking is like fixing hallucinations - nobody knows how to do it. And if you agree to "fix a jailbreak" you're opening the door to insanity.

@markankcorn @alexandrosM @elder_plinius 1. Adding KYC is very far from "a simple fix", I'm sure that there's some equivalent of Hawking's quote regarding sales as a function of the number of equations in a book 2. This would not be a "fix" for the jailbreak at all

@alexandrosM There's actually a super easy solution. Don't claim your model is a cyberweapon, and you're in the clear.

@alexandrosM maybe don't self-classify (market) ur shit as a potential national security threat while having a history of trying to dictate how the usg legally uses ur tech while also advocating for the govts ability to block model releases they deem dangerous.

@alexandrosM I appreciate your framing but I do think it's distinct from what Sacks is getting at. He is presenting an "objective" view of what he has heard where the judgments made are explanations, not his opinion. Your call-out fits better as an additive instead of a corrective imo

@alexandrosM If the government are so easily swayed by a jailbreak demo, they would be equally impressed by a good faith band-aid and would consider future turns of the crank as evidence of partnership. Saying “it’s not important; your trusted partner is full of shit” is not a good approach.

@alexandrosM It was Amazon not @elder_plinius and they offered the fix that both Sacks and Andy Jassy thought was pretty minimal, and Dario rejected

@alexandrosM That may be the answer:

@alexandrosM clearly propaganda, which makes the govt case here extremely suspect.

@alexandrosM @grok did Dario call Mythos a cyber weapon?

@alexandrosM I thought he explained it very well honestly

@alexandrosM Does @DavidSacks or @foundersfund hold @AnthropicAI shares in their private funds ? @grok List all AI companies Pvt shares they hold

@ir0nivan @alexandrosM @elder_plinius Not missing the point at all. I suspect the “fix” proposed was KYC or geofencing to US-only IP addresses and Dario balked because he doesn’t want to be seen as a pawn of the USG and/or lose int’l revenues. Sacks isn't technical, “jailbreak” here ≠ Pliny-esque exploit

@G2DJordan @alexandrosM Ol Pete posted this. It's connected.

@wwwdaveturney @alexandrosM One thing isn’t even remotely connected to the other and the only reason you would come to that conclusion is personal bias. I doubt the same government that just snatched Fable from everybody was begging for anything.