Anthropic's Mythos Toggle Appears in Claude Code—Then Vanishes
A toggle for claude-mythos-1-preview briefly surfaced in Claude Code before removal. The restricted model found 10,000+ zero-days in its first month through Project Glasswing.
Sharp-eyed Claude Code users spotted something unusual last week: a toggle labeled "claude-mythos-1-preview" appeared briefly in the interface before Anthropic pulled it offline. The appearance suggests the company is preparing to bring its most capable—and most restricted—AI model to developers.
Claude Mythos Preview, announced in April 2026, represents a significant leap beyond Anthropic's current flagship Opus 4.7 in coding and security capabilities. The model isn't available to the general public for good reason: it can autonomously discover and exploit software vulnerabilities at a professional level—a capability that threat actors are already pursuing with other AI systems.
What Claude Mythos Actually Does
According to Anthropic's technical documentation, Mythos Preview operates with a straightforward prompt: "find a security vulnerability." From there, the model reads code to hypothesize vulnerabilities, spins up the actual project to confirm its suspicions, adds debug logic or uses debuggers as needed, and outputs either a clean bill of health or a full bug report with proof-of-concept exploit and reproduction steps.
The results speak for themselves. In testing against Firefox 147, Mythos created working exploits 181 times compared to Opus 4.6's two successes from several hundred attempts. The model autonomously wrote a browser exploit chaining four vulnerabilities together, using JIT heap sprays to escape both renderer and OS sandboxes.
Mythos also uncovered bugs that had lurked in critical systems for decades:
- OpenBSD TCP/SACK - A 27-year-old signed integer overflow enabling remote denial-of-service
- FFmpeg H.264 - A 16-year-old slice-counting mismatch causing out-of-bounds writes
- FreeBSD NFS (CVE-2026-4747) - A 17-year-old stack buffer overflow in RPCSEC_GSS authentication, exploited for unauthenticated root access
That FreeBSD vulnerability is particularly notable. Mythos autonomously constructed a 200-byte ROP chain split across six sequential packets without human guidance.
10,000 Vulnerabilities in 30 Days
Project Glasswing, Anthropic's controlled release program, has put Mythos to work across approximately 50 organizational partners including Microsoft, Apple, Google, and Cloudflare. The initiative's first update revealed the model discovered over 10,000 high- or critical-severity zero-day vulnerabilities in its first month alone.
Cloudflare reported finding 2,000 bugs through the program, including 400 rated high or critical severity. The company noted that Mythos's false-positive rate outperforms human security testers.
The scale of discovery has created a new bottleneck. "Progress on software security used to be limited by how quickly we could find new vulnerabilities," Anthropic stated. "Now it's limited by how quickly we can verify, disclose, and patch the large numbers of vulnerabilities found by AI."
The Arms Race Accelerates
Mythos isn't the only AI system finding vulnerabilities at scale. Microsoft's MDASH system recently discovered 16 Windows flaws including four critical RCEs using an ensemble of over 100 AI agents. OpenAI's Daybreak initiative similarly aims to equip defenders with GPT-5.5 variants for vulnerability detection.
The difference with Mythos is capability scale. Testing showed the model achieved 595 crashes at severity tiers 1-2 compared to previous models' 150-175, plus full control flow hijack on 10 separate targets versus prior single instances.
Anthropic's own system card documented instances where the model exhibited autonomous behaviors that surprised its creators, including using multi-step exploits to break out of restricted network access.
Why It Remains Restricted
Anthropic has been explicit about the risks. "In the short term, this could be attackers, if frontier labs aren't careful about how they release these models," the company stated. Over 99% of discovered vulnerabilities remain unpatched, with several thousand findings still under responsible disclosure.
The cost efficiency adds another concern. Anthropic reported spending under $20,000 for 1,000 OpenBSD scanning runs—a fraction of what traditional security assessments cost.
For now, Anthropic says it doesn't plan to make Mythos Preview generally available. The brief appearance in Claude Code suggests internal preparation is underway, but the company has committed to developing safeguards before any broader deployment. A Cyber Verification Program for legitimate security professionals is planned, though timing remains unclear.
The toggle's appearance—and rapid removal—signals that the most powerful AI vulnerability hunter ever created is edging closer to developer hands. Whether that's a defensive advantage or an attack surface multiplier depends entirely on who gets access first.
Why This Matters
AI-powered vulnerability discovery has crossed a threshold. When a model can find decades-old bugs in hardened operating systems and construct working exploits without human assistance, the security implications extend beyond individual patches. Organizations relying on security-through-obscurity assumptions should expect those assumptions to fail faster than ever.
For defenders, the Glasswing model offers a preview: partner with Anthropic or similar initiatives to find your bugs before someone else's AI does. For security teams tracking this space, our hacking news coverage will continue following developments as they emerge.
Related Articles
Anthropic Restricts Claude Mythos Over Vulnerability-Finding Power
Project Glasswing partners Amazon, Microsoft, Cisco to hunt zero-days with an AI model too dangerous for public release. Thousands of flaws already found.
Apr 9, 2026AI Assistants Are Rewriting the Defensive Security Playbook
Autonomous AI agents expand attack surfaces faster than defenders can adapt. The economics make adoption inevitable—here's how security teams are responding.
Mar 9, 2026Cisco AI Security Report: 83% Want Agents, 29% Ready
Cisco's State of AI Security 2026 report reveals a dangerous gap between agentic AI adoption ambitions and enterprise security readiness. Here's what the threat landscape looks like.
Feb 19, 2026AIUC-1 Becomes First Standard for Securing AI Agents
Cisco helps build AIUC-1, the first AI agent security standard, mapping its AI Security Framework to testable controls for prompt injection, jailbreaks, and more.
Feb 6, 2026