Calif researchers discover macOS kernel vulnerabilities using Mythos · Digg

/Tech · 46d ago

AI Judge changed title after evaluation, original title: "Calif researchers uncover macOS vulnerabilities in Mythos AI testing"

Calif researchers discovered two previously unknown macOS kernel vulnerabilities in April while testing an early build of Anthropic’s Mythos AI. They chained the flaws with additional techniques to corrupt memory and bypass Apple’s hardware-backed protections, reaching parts of the device that should be inaccessible. The resulting privilege escalation exploit was completed in five days of evaluation. The team submitted a 55-page report on the findings to Apple.

Original post

Zephyr@zephyr_z9 · #1695 in /Tech

Well, well, well... Good for hardening the system

Wall St Engine@wallstengine

ANTHROPIC’S MYTHOS HELPED RESEARCHERS FIND MACOS BUGS

WSJ reports Calif researchers used techniques from Anthropic’s Mythos AI testing to find two macOS bugs.

The bugs were linked into a privilege escalation exploit.

Apple is reviewing a 55-page report and details are expected

8:09 AM · May 14, 2026 · 25.5K Views

Sentiment

Positive users celebrate Mythos AI legitimately cracking macOS security in days as proof of real progress from Anthropic, while negative users dismiss the reports as exaggerated marketing theater or underhanded fear-mongering.

Positive: 45.5% | Negative: 54.5%

11 comments with sentiment.

Related links

Apple’s Security Has Been Tough to Crack. Mythos Helped Find a Way In. (via The Wall Street Journal)

How fast is autonomous AI cyber capability advancing? | AISI Work (via AI Security Institute)

XBOW - Mythos for Offensive Security: XBOW's Evaluation (via xbow.com)

First public macOS kernel memory corruption exploit on Apple M5 (via calif.io)

Posts from X

Most Activity

Views 567.3K | Bookmarks 789 | Likes 3.8K | Retweets 361 | Replies 94

Andrew Curran@AndrewCurran_

Mythos has cracked MacOS. It took five days.

46d|Views 567.3KLikes 3.8KBookmarks 789

Ethan Mollick@emollick

The Second Scaling Law remains undefeated. If you want better hacking (or math, or science, or crossword puzzle solving) out of an LLM, just add thinking tokens. There doesn't seem to be any plateau so far.

Natália 🔍@natalia__coelho

Very important update from UK AISI. This is a meaningful change from the previous report. Here’s what the new data would look like for “Mythos Preview (new)” with $ on the x-axis:

46d|Views 32.6KLikes 282Bookmarks 68

Natália 🔍@natalia__coelho

Very important update from UK AISI. This is a meaningful change from the previous report. Here’s what the new data would look like for “Mythos Preview (new)” with $ on the x-axis:

AI Security Institute@AISecurityInst

Our cyber range results illustrate this step-up. Since our first Mythos evaluation, we received access to a newer Mythos Preview checkpoint. On a 32-step corporate network attack we estimate takes a human expert ~20 hours, this checkpoint completes the full attack in 6/10 attempts.

46d|Views 60.2KLikes 159Bookmarks 43

Rohan Paul@rohanpaul_ai

WSJ: Anthropic’s Mythos helped researchers find 2 unknown macOS kernel bugs and turn them into a working privilege escalation exploit in 5 days.

The target was the macOS kernel, the deepest layer of Apple’s desktop operating system, where code controls memory, processes, permissions, and access to hardware.

Mythos helped connect 2 separate flaws with extra exploitation techniques, which means the attack did not rely on one bug but on a chain where each step made the next step possible.

The exploit allegedly corrupted memory, bypassed Apple’s memory integrity protections, and gained access to protected parts of the system that normal apps should never reach.

This is serious because modern macOS defenses are built to make memory bugs hard to convert into control of the machine, not just hard to find.

Mythos can become so powerful here because vulnerability research is a search problem with many dead ends, where the model can help form hypotheses, inspect code behavior, reason across low-level constraints, and suggest exploit paths faster than manual work alone.

---

wsj.com/tech/ai/anthropic-mythos-apple-macos-bug-339da403

46d|Views 15.5KLikes 159Bookmarks 34

Andrew Curran@AndrewCurran_

Head of Frontier Red-Team at Anthropic.

Logan Graham@logangraham

A lot of people have been wondering about Mythos, Glasswing, and the vulns we / our partners are fixing. Today, I’m excited for us to start sharing more. (For context, I lead Glasswing @AnthropicAI.)

Two independent evaluations this week—from XBOW and the UK AISI—confirm what we've been seeing internally: Claude Mythos Preview is a step change in autonomous cybersecurity capabilities. We need to start preparing fast for a world of models with this level of capabilities.

The UK AI Security Institute tested the model we shipped at the launch of Project Glasswing and found Mythos Preview is the first model to solve both of their end-to-end cyber ranges, including one (Cooling Tower) which no model had ever cleared. But attackers (and defenders) have sophistication & cost constraints – Mythos is also the only model that clears every one of their tasks estimated over 8 hours under their deliberately low 2.5M-token cap.

XBOW tested it on their offensive security benchmarks, finding "token-for-token, unprecedented precision." It's the only model to succeed at subtle V8 sandbox work.

Other Glasswing partners shared similar stories. In a few weeks of testing, Mythos Preview has helped them find many thousands of (estimated) high + critical severity vulnerabilities, sometimes double what they'd normally find in a year.

I don't share this to boost Mythos. In fact, this is not about Mythos. It’s about preparing for the coming world of models being better, faster, cheaper, and more creative than some of the best human experts at dual use capabilities. Clearly, we need them supporting defenders as widely as can be done safely – and especially the least resourced ones.

Within a year, Mythos will probably look quite dumb (relative to other new models). And others may release openly available or unguardrailed models of Mythos-level capabilities.

We started Project Glasswing because capabilities like Mythos Preview's won't stay rare, or stay in careful hands. We are bringing it to defenders as fast as we responsibly can, while working to figure out, for example, the right safeguards and patching & disclosure processes.

Also, to be clear, compute has never been a limiter in our rollout.

Expect a fuller update on our Glasswing work in the coming days.

XBOW report: https://xbow.com/blog/mythos-offensive-security-xbow-evaluation

UK AISI report: https://www.aisi.gov.uk/blog/how-fast-is-autonomous-ai-cyber-capability-advancing

46d|Views 20.1KLikes 180Bookmarks 33

Andrew Curran@AndrewCurran_

https://www.wsj.com/tech/ai/anthropic-mythos-apple-macos-bug-339da403

Andrew Curran@AndrewCurran_

Mythos has cracked MacOS. It took five days.

46d|Views 15.4KLikes 88Bookmarks 20

prinz@deredleritt3r

One note of caution about the UK AISI cybersecurity benchmarks (and in particular TLO) is that neither Mythos nor GPT-5.5 was given the chance to "plateau" on the benchmark to measure the model's ceiling.

Mythos performs better on this benchmark per token, but GPT-5.5 outperforms Mythos per dollar spent.

Natália 🔍@natalia__coelho

Very important update from UK AISI. This is a meaningful change from the previous report. Here’s what the new data would look like for “Mythos Preview (new)” with $ on the x-axis:

46d|Views 10.1KLikes 99Bookmarks 13

Andrew Curran@AndrewCurran_

From Calif:

Ollin Boer Bohan@madebyollin

@AndrewCurran_ Direct link to the blog from @calif_io https://blog.calif.io/p/first-public-kernel-memory-corruption

46d|Views 13.1KLikes 56Bookmarks 12

Andrew Curran@AndrewCurran_

'Mythos Preview is powerful: once it has learned how to attack a class of problems, it generalizes to nearly any problem in that class.'

Andrew Curran@AndrewCurran_

https://blog.calif.io/p/first-public-kernel-memory-corruption

46d|Views 7.4KLikes 47Bookmarks 5

Andrew Curran@AndrewCurran_

https://blog.calif.io/p/first-public-kernel-memory-corruption

46d|Views 7.3KLikes 35Bookmarks 7

Alexander Doria@Dorialexander

narrator voice: it was not gpt-2

Andrew Curran@AndrewCurran_

Mythos has cracked MacOS. It took five days.

46d|Views 4.6KLikes 42Bookmarks 4

Natália 🔍@natalia__coelho

This means that the choice of x-axis is meaningful. Performance rankings can change based on whether you choose tokens, $, seconds, or something else on the x-axis. Each one, on its own, is incomplete. So it’s helpful to have the full picture.

46d|Views 1.5KLikes 25Bookmarks 3
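The point about the x-axis can be illustrated with a toy calculation. This is only a sketch: the model names, token counts, and prices below are invented for illustration and are not figures from UK AISI, XBOW, or any vendor. It shows how a model that needs fewer tokens to finish a task can still be the more expensive one once a per-token price is applied.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Run:
    """One hypothetical benchmark run that reached the same success level."""
    model: str
    tokens_used: int               # tokens spent to finish the task (made up)
    usd_per_million_tokens: float  # blended token price (made up)

    @property
    def cost_usd(self) -> float:
        return self.tokens_used / 1_000_000 * self.usd_per_million_tokens


def rank(runs, key):
    """Return model names ordered from best (lowest) to worst on the given axis."""
    return [r.model for r in sorted(runs, key=key)]


# Purely illustrative numbers, not measurements.
runs = [
    Run("model_a", tokens_used=2_000_000, usd_per_million_tokens=50.0),
    Run("model_b", tokens_used=3_000_000, usd_per_million_tokens=20.0),
]

by_tokens = rank(runs, key=lambda r: r.tokens_used)
by_dollars = rank(runs, key=lambda r: r.cost_usd)

print("ranked by tokens: ", by_tokens)
print("ranked by dollars:", by_dollars)
```

With these invented inputs, `model_a` leads when the axis is tokens and `model_b` leads when the axis is dollars, which is the kind of ranking flip the post describes.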

Andrew Curran@AndrewCurran_

@tymrtn Calif says they will publish a 55 page report after Apple ships a fix.

46d|Views 3.6KLikes 20Bookmarks 2

Natália 🔍@natalia__coelho

XBOW has a good article on what happens when you measure performance with: - a fixed “action” budget - a fixed token budget - a fixed $ budget

46d|Views 962Likes 9Bookmarks 4

rohan anil@_arohan_

Uh oh. Captain, it's just Thursday

Andrew Curran@AndrewCurran_

Mythos has cracked MacOS. It took five days.

46d|Views 3.9KLikes 26Bookmarks 2

Will@wryzx

Whaaaat. No way. That’s pretty cool if they’re talking about the Secure Enclave Processor, because then it means that Apple’s key invalidation per OS install is insufficient security. Old Intels used to write zeroes to make data “unrecoverable” between OS installs. That is cool it found a way through it, because it also means that Apple can better test future OS releases. I really hope they or Apple do a write up on it, once it’s patched. Because Apple has staked a lot on the robustness of the Secure Enclave.

46d|Views 2.6KLikes 5Bookmarks 3

Natália 🔍@natalia__coelho

The new (correct) Mythos Preview checkpoint outperforms other models on the TLO cyber range per token, though not per dollar.

UK AISI did not spend enough tokens to see performance plateau for either model. Both would have kept getting better if given more tokens.

46d|Views 1.7KLikes 25Bookmarks 1

Ollin Boer Bohan@madebyollin

@AndrewCurran_ Direct link to the blog from @calif_io https://blog.calif.io/p/first-public-kernel-memory-corruption

46d|Views 13.7KLikes 7Bookmarks 3

Joman 👺@jomangblandino

@AndrewCurran_ Now I understand why the Microsoft engineer decided to become a goose farmer

46d|Views 2.5KLikes 29

Natália 🔍@natalia__coelho

It’s a great read! But one caution when reading XBOW’s article: some charts have % on the y-axis, others have odds. Odds can make small differences close to 100% (or 0%) appear larger than a %-based chart. https://xbow.com/blog/mythos-offensive-security-xbow-evaluation

46d|Views 886Likes 7Bookmarks 2
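The caution about odds versus percentages can be checked with a few lines of arithmetic. The success rates below are hypothetical and are not taken from XBOW's charts; the sketch only uses the standard definition of odds, p / (1 - p), to compare the same one-point gap near the ceiling and in the middle of the range.

```python
def odds(p: float) -> float:
    """Convert a success probability in [0, 1) to odds in favour: p / (1 - p)."""
    if not 0.0 <= p < 1.0:
        raise ValueError("p must be in [0, 1)")
    return p / (1.0 - p)


def odds_ratio(p_a: float, p_b: float) -> float:
    """How many times larger the odds at p_a are than the odds at p_b."""
    return odds(p_a) / odds(p_b)


# Hypothetical success rates: the same one-percentage-point gap in two places.
pairs = [
    (0.98, 0.99),  # near the ceiling
    (0.50, 0.51),  # mid-range
]

for low, high in pairs:
    gap_points = (high - low) * 100
    print(
        f"{low:.0%} vs {high:.0%}: gap of {gap_points:.0f} point(s), "
        f"odds {odds(low):.2f} vs {odds(high):.2f}, "
        f"ratio {odds_ratio(high, low):.2f}x"
    )
```

Going from 98% to 99% roughly doubles the odds (49 to 99), while going from 50% to 51% barely moves them, so an odds axis stretches differences near 100% that a percentage axis would show as equal.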
