Mythos AI identifies one new curl vulnerability
Mythos AI identified one previously unknown vulnerability in the curl open-source library, along with three false positives and one bug. Curl has roughly 20 billion installations and 30 years of maintenance, and it had already been scanned by prior AI tools including AISLE, Zeropath, OpenAI’s Codex Security, GitHub Copilot, and Augment Code. Those tools produced 200–300 merged bug fixes and more than a dozen CVEs in the last 8–10 months. The new finding showed no meaningful improvement over existing methods.
Blog is worth reading
But it should definitely lower your confidence that Mythos is a step change in zero-day discovery. We now have two points of evidence that this is not the case.
On OpenSSL, Mythos missed the majority of CVEs that we discovered in this security release: https://x.com/stanislavfort/status/2047700033449963578 (but co-discovered one, so we know they were looking)
On FreeBSD, Anthropic's chosen project in their announcement for the model, we actually match them 3-for-3 in terms of CVEs in this release:
(I am running @Aisle_Inc mentioned in the blog post snippet you shared)
AISLE has discovered 20 of 23 OpenSSL zero-days (CVEs) across the last 3 consecutive security releases.
Latest release: 5 of 7 are AISLE.
1 was co-reported by Anthropic (Mythos?) 63 days after AISLE.
OpenSSL encrypts 2/3 of the internet.
10 fixes accepted straight into production.
Don't know about curl, it's always hard to do retrospectives. But on FreeBSD:
1) Anthropic ran Mythos on FreeBSD
2) reported a bunch of issues that got patched
3) missed many others
4) got 3 CVEs in April 2026
after that @Aisle_Inc
5) scanned FreeBSD
6) found a bunch of issues
7) got 3 CVEs in April 2026
8) there were no "co-reported" ones => they were not reported at all by Mythos
So not only did we match it on CVE ("zero-day") count, but we also came **after** they had already reported and fixed a few bad ones.
I think this is the cleanest test you can imagine.
@stanislavfort OK, it might not be a step change in discovery per se, but how many of those curl vulns do you think it would have found if they weren't already patched? I get the impression that it's a big success in token efficiency, if not overhyped capabilities.
This is of course not evidence that nothing else can be found by a Mythos+. But more importantly, finding just one new vuln is already a big deal – curl is 30 years old, it's benefited enormously from Linus's Law, and *it had already been extensively scrutinized by AI tools*.

Pretty compelling argument. You don't need Mythos for SoTA vulnerability search, and in fact AISLE seems to be the actual SoTA! I think Anthropic is right to be proud/concerned about their new general-purpose LLM, but as I've said before: cybersecurity is NOT about scale.




