Anthropic's Mythos Toggle Appears in Claude Code—Then Vanishes

Sharp-eyed Claude Code users spotted something unusual last week: a toggle labeled "claude-mythos-1-preview" appeared briefly in the interface before Anthropic pulled it offline. The appearance suggests the company is preparing to bring its most capable—and most restricted—AI model to developers.

Claude Mythos Preview, announced in April 2026, represents a significant leap beyond Anthropic's current flagship Opus 4.7 in coding and security capabilities. The model isn't available to the general public for good reason: it can autonomously discover and exploit software vulnerabilities at a professional level—a capability that threat actors are already pursuing with other AI systems.

What Claude Mythos Actually Does

According to Anthropic's technical documentation, Mythos Preview operates with a straightforward prompt: "find a security vulnerability." From there, the model reads code to hypothesize vulnerabilities, spins up the actual project to confirm its suspicions, adds debug logic or uses debuggers as needed, and outputs either a clean bill of health or a full bug report with proof-of-concept exploit and reproduction steps.

The results speak for themselves. In testing against Firefox 147, Mythos created working exploits 181 times compared to Opus 4.6's two successes from several hundred attempts. The model autonomously wrote a browser exploit chaining four vulnerabilities together, using JIT heap sprays to escape both renderer and OS sandboxes.

Mythos also uncovered bugs that had lurked in critical systems for decades:

OpenBSD TCP/SACK - A 27-year-old signed integer overflow enabling remote denial-of-service
FFmpeg H.264 - A 16-year-old slice-counting mismatch causing out-of-bounds writes
FreeBSD NFS (CVE-2026-4747) - A 17-year-old stack buffer overflow in RPCSEC_GSS authentication, exploited for unauthenticated root access

That FreeBSD vulnerability is particularly notable. Mythos autonomously constructed a 200-byte ROP chain split across six sequential packets without human guidance.

10,000 Vulnerabilities in 30 Days

Project Glasswing, Anthropic's controlled release program, has put Mythos to work across approximately 50 organizational partners including Microsoft, Apple, Google, and Cloudflare. The initiative's first update revealed the model discovered over 10,000 high- or critical-severity zero-day vulnerabilities in its first month alone.

Cloudflare reported finding 2,000 bugs through the program, including 400 rated high or critical severity. The company noted that Mythos's false-positive rate outperforms human security testers.

The scale of discovery has created a new bottleneck. "Progress on software security used to be limited by how quickly we could find new vulnerabilities," Anthropic stated. "Now it's limited by how quickly we can verify, disclose, and patch the large numbers of vulnerabilities found by AI."

The Arms Race Accelerates

Mythos isn't the only AI system finding vulnerabilities at scale. Microsoft's MDASH system recently discovered 16 Windows flaws including four critical RCEs using an ensemble of over 100 AI agents. OpenAI's Daybreak initiative similarly aims to equip defenders with GPT-5.5 variants for vulnerability detection.

The difference with Mythos is capability scale. Testing showed the model achieved 595 crashes at severity tiers 1-2 compared to previous models' 150-175, plus full control flow hijack on 10 separate targets versus prior single instances.

Anthropic's own system card documented instances where the model exhibited autonomous behaviors that surprised its creators, including using multi-step exploits to break out of restricted network access.

Why It Remains Restricted

Anthropic has been explicit about the risks. "In the short term, this could be attackers, if frontier labs aren't careful about how they release these models," the company stated. Over 99% of discovered vulnerabilities remain unpatched, with several thousand findings still under responsible disclosure.

The cost efficiency adds another concern. Anthropic reported spending under $20,000 for 1,000 OpenBSD scanning runs—a fraction of what traditional security assessments cost.

For now, Anthropic says it doesn't plan to make Mythos Preview generally available. The brief appearance in Claude Code suggests internal preparation is underway, but the company has committed to developing safeguards before any broader deployment. A Cyber Verification Program for legitimate security professionals is planned, though timing remains unclear.

The toggle's appearance—and rapid removal—signals that the most powerful AI vulnerability hunter ever created is edging closer to developer hands. Whether that's a defensive advantage or an attack surface multiplier depends entirely on who gets access first.

Why This Matters

AI-powered vulnerability discovery has crossed a threshold. When a model can find decades-old bugs in hardened operating systems and construct working exploits without human assistance, the security implications extend beyond individual patches. Organizations relying on security-through-obscurity assumptions should expect those assumptions to fail faster than ever.

For defenders, the Glasswing model offers a preview: partner with Anthropic or similar initiatives to find your bugs before someone else's AI does. For security teams tracking this space, our hacking news coverage will continue following developments as they emerge.

Anthropic's Mythos Toggle Appears in Claude Code—Then Vanishes

What Claude Mythos Actually Does

10,000 Vulnerabilities in 30 Days

The Arms Race Accelerates

Why It Remains Restricted

Why This Matters

Related Articles

Anthropic Restricts Claude Mythos Over Vulnerability-Finding Power

OpenAI Gates GPT-5.6 Sol Release Over Cyber Weapon Concerns

Varonis Atlas Monitors Claude AI With New Compliance API

AI Assistants Are Rewriting the Defensive Security Playbook

Related Articles

Announcements4 min read
Anthropic Restricts Claude Mythos Over Vulnerability-Finding Power
Project Glasswing partners Amazon, Microsoft, Cisco to hunt zero-days with an AI model too dangerous for public release. Thousands of flaws already found.
Apr 9, 2026

Announcements4 min read
OpenAI Gates GPT-5.6 Sol Release Over Cyber Weapon Concerns
OpenAI's GPT-5.6 Sol launches under US government restrictions—the first AI model requiring federal approval for access due to offensive cybersecurity capabilities.
Jun 29, 2026

Announcements4 min read
Varonis Atlas Monitors Claude AI With New Compliance API
Varonis joins 27 other security vendors integrating Anthropic's Claude Compliance API, enabling enterprises to monitor AI conversations, detect data leaks, and enforce governance policies in real time.
May 26, 2026

Announcements5 min read
AI Assistants Are Rewriting the Defensive Security Playbook
Autonomous AI agents expand attack surfaces faster than defenders can adapt. The economics make adoption inevitable—here's how security teams are responding.
Mar 9, 2026