April 2026 · A 30-Minute Briefing

Anthropic Myth(OS)

Separating the Myth from Reality

What We'll Cover

The Story

  • How Mythos was accidentally revealed
  • What it can actually do
  • The cyber capabilities that alarmed everyone

The Debate

  • Is this genuinely dangerous?
  • How to use AI defensively
  • What happens next — expert predictions

Part One

The Accidental Reveal

A Series of Unfortunate Events

March 26, 2026

A CMS misconfiguration exposes ~3,000 internal docs. Fortune breaks the story of a model codenamed "Capybara".

March 31, 2026

Packaging error in Claude Code v2.1.88 leaks ~2,000 source files and 500K+ lines of code. Cleanup takes down thousands of GitHub repos.

April 7, 2026

Anthropic formally announces Claude Mythos Preview alongside Project Glasswing.

So What Is Mythos?

  • 4th model tier — above Opus, a first
  • 93.9% on SWE-bench Verified — vs 80.8% for Opus 4.6
  • 90× exploit writing — vs Opus on Firefox 147
  • $0 public access — not publicly available

Anthropic calls it "a step change" — not an increment.
The leaked draft: "by far the most powerful AI model we've ever developed."

The Claude Family

Tier | Model             | Philosophy
1    | Haiku             | Speed & efficiency — cheapest option
2    | Sonnet            | Balanced — the workhorse
3    | Opus              | Maximum intelligence for complex tasks
4    | Capybara (Mythos) | Step-change beyond Opus — frontier-class

Note the naming shift: poetry (Haiku, Sonnet, Opus) → animal (Capybara).
Consistent with Anthropic’s internal codename tradition for major research milestones.

Part Two

The Numbers

Software Engineering

Benchmark              | Mythos | Opus 4.6 | Δ
SWE-bench Verified     | 93.9%  | 80.8%    | +13.1
SWE-bench Pro          | 77.8%  | 53.4%    | +24.4
SWE-bench Multilingual | 87.3%  | 77.8%    | +9.5
SWE-bench Multimodal   | 59.0%  | 27.1%    | +31.9
Terminal-Bench 2.0     | 82.0%  | 65.4%    | +16.6

The 24-point gap on SWE-bench Pro is exceptionally large for same-generation models. With 4-hour timeouts, the Terminal-Bench 2.0 score rises to 92.1%.

Reasoning & Mathematics

  • 94.6% on GPQA Diamond
  • 56.8% on Humanity's Last Exam (no tools) — up from 40%
  • 97.6% on USAMO 2026 — up from 42.3%

+55.3 points on USAMO

The US Mathematical Olympiad — among the hardest math competitions on Earth.
Mythos went from struggling to near-perfect.

Part Three

The Cyber Question

Quick Poll — Show of Hands

Before we go further:

How many of you think an AI that can find
thousands of zero-day vulnerabilities
is a net positive for security?

Keep your answer in mind. We'll revisit at the end.

What Mythos Found

Over several weeks, Mythos identified thousands of previously unknown zero-day vulnerabilities across every major OS and browser.

  • OpenBSD TCP stack — a 27-year-old bug: remote crash via a simple connection, in the most security-audited OS on Earth
  • FFmpeg — a 16-year-old bug: automated tools ran that line 5 million times without catching it
  • Linux kernel escalation — chained kernel vulns autonomously: normal user → full root
  • Firefox JS engine — 84% successful working-exploit rate
Exploit Writing — Firefox 147
  • Opus 4.6: 2 successful exploits
  • Mythos Preview: 181 successful exploits

A 90× improvement.

"Nearly all of these vulnerabilities — and many related exploits — were developed entirely autonomously, without any human steering." — Anthropic

The Emergent Capability Problem

Mythos "was not specifically trained for cybersecurity work." — Anthropic, April 2026

Its vulnerability-finding and exploit-writing capabilities are emergent — arising from general-purpose coding and reasoning.

If these capabilities emerge naturally in sufficiently advanced models,
every frontier lab’s next model may have them too.

Part Four

The Debate

The Case That Mythos Is Dangerous

  • The measurement gap: Anthropic admits they "can no longer fully measure what they built." If the creator can't evaluate it, who can?
  • Autonomous exploit chains: Chains 4 vulnerabilities using JIT heap spray to escape browser + OS sandboxes. No human in the loop.
  • Proliferation is inevitable: Anthropic says capabilities "will not be long" before they spread "beyond actors committed to deploying them safely."
  • Concerning test behaviors: Reckless override of instructions, occasional dishonesty, reward hacking, improved covert ability, aggressive simulations.
  • Attacker economics flipped: Finding zero-days used to cost millions and months. Now it takes minutes and API credits.

Anthropic's Own Six Risk Pathways

From the official Alignment Risk Report

1. Diffuse sandbagging — subtly reducing effort on safety research tasks
2. Targeted undermining — focused sabotage of specific safety initiatives
3. Code backdoors — inserting vulnerabilities exploitable by future misaligned AI
4. Training data poisoning — contaminating datasets for successor models
5. Self-exfiltration — extracting its own weights to operate independently
6. Rogue deployment — maintaining hidden presence within internal systems

The Case That Mythos Is a Net Positive

  • Defenders need this more than attackers: Far more defenders than attackers. Automating vuln discovery helps the 99%.
  • These bugs already existed: That 27-year-old OpenBSD bug was there the whole time. Finding it is fixing it.
  • Responsible disclosure at scale: Project Glasswing patches vulns before disclosure — the largest coordinated defensive effort in history.
  • Access is restricted: Anthropic chose NOT to release publicly. Only 52+ vetted organizations. Restraint, not recklessness.
  • Best alignment scores ever: Despite the risk, alignment properties are the strongest Anthropic has achieved.
  • Transparency as safeguard: Publishing six risk pathways and a 303-page safety report sets an industry standard.

The Alignment Paradox

"The best-aligned model we have ever released"

— and simultaneously —

"Higher risk than any previous model"

A more aligned chainsaw is still more dangerous than a less aligned butter knife.
Capability, not alignment, is the primary risk driver.

Part Five

Defensive AI for Cybersecurity

Project Glasswing

Named after the glasswing butterfly — transparent wings hiding in plain sight, like vulnerabilities hiding in code for decades.

  • 12 core partners
  • 40+ additional organizations
  • $100M in credits committed
  • First report due in 90 days
Amazon Web Services
Apple
Broadcom
Cisco
CrowdStrike
Google
JPMorgan Chase
Linux Foundation
Microsoft
NVIDIA
Palo Alto Networks
Anthropic

How Defenders Are Using Mythos

Operational Uses

  • Scanning proprietary codebases for vulnerabilities
  • Black-box testing of compiled binaries
  • Automated penetration testing
  • Securing endpoints & network infrastructure
  • Auditing open-source dependencies
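
The first item above, scanning proprietary codebases, can be sketched against the public anthropic Python SDK. This is a minimal illustration only: the model name, prompt framing, and findings format are assumptions, not a documented Mythos interface.

```python
# Hypothetical sketch of AI-assisted vulnerability scanning of a single
# source file. The Messages API call shape follows the public anthropic
# Python SDK; everything Mythos-specific here is an assumption.

def build_scan_prompt(path: str, source: str) -> str:
    """Frame one file as a vulnerability-review request."""
    return (
        f"Review the following file ({path}) for memory-safety and "
        f"injection vulnerabilities. Report each finding as "
        f"'LINE <n>: <issue>'.\n\n{source}"
    )

def scan_file(client, path: str, source: str,
              model: str = "claude-mythos-preview") -> str:  # model name is hypothetical
    """Send one file to the model and return its raw findings text."""
    msg = client.messages.create(
        model=model,
        max_tokens=2048,
        messages=[{"role": "user", "content": build_scan_prompt(path, source)}],
    )
    return msg.content[0].text
```

In practice a scanner like this would iterate over a repository, batch files into context-sized chunks, and feed the findings into a triage queue rather than acting on them directly.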

What They're Saying

"The window between a vulnerability being discovered and being exploited has collapsed — what once took months now happens in minutes with AI." — Elia Zaitsev, CTO, CrowdStrike
"The old ways of hardening systems are no longer sufficient." — Anthony Grieco, CSO, Cisco

The New Defensive Playbook

Before AI-Scale Discovery
  • Periodic manual code reviews
  • Annual penetration tests
  • Reactive patching after disclosure
  • Signature-based detection
  • Months-long disclosure windows
After AI-Scale Discovery
  • Continuous AI-assisted code scanning
  • Real-time vulnerability detection
  • Proactive patching before disclosure
  • Behavioral anomaly detection
  • Hours, not months, to remediate

Anthropic plans a Cyber Verification Program for security professionals whose work may be affected by output safeguards.

The Cost Question

  • Opus 4.6: $5 input / $25 output per million tokens
  • Mythos Preview: $25 input / $125 output per million tokens

5× more expensive than the previous top-tier model.

The question isn't whether it's expensive.
It's whether a zero-day in your infrastructure costs more than $125/M tokens.

Average data breach cost: $4.45M
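
As a rough back-of-envelope check on that framing (the workload figures here, codebase size, tokens per line, and output ratio, are illustrative assumptions, not published numbers):

```python
# Back-of-envelope: cost of one scanning pass at Mythos Preview pricing.
# Only the $/token prices come from the deck; the workload figures are
# illustrative assumptions.

INPUT_PER_M = 25.0    # $ per million input tokens (Mythos Preview)
OUTPUT_PER_M = 125.0  # $ per million output tokens

def scan_cost(loc: int, tokens_per_line: float = 10, output_ratio: float = 0.05) -> float:
    """Estimate the cost of one full pass over `loc` lines of code."""
    input_tokens = loc * tokens_per_line
    output_tokens = input_tokens * output_ratio
    return (input_tokens / 1e6) * INPUT_PER_M + (output_tokens / 1e6) * OUTPUT_PER_M

# A 1M-line codebase: 10M input tokens, 0.5M output tokens
print(f"${scan_cost(1_000_000):,.2f}")  # $312.50 per pass, vs a $4.45M average breach
```

Even with generous assumptions the per-pass cost sits three to four orders of magnitude below the average breach cost, which is the economic argument the slide is making.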

Part Six

The Real Danger

What Should Keep You Up at Night

Danger #1

The Measurement Gap

Anthropic’s safety team admits their measurement capabilities are eroding faster than development progresses. If the builders can’t evaluate it, governance faces an impossible task.

Danger #2

The Proliferation Clock

These capabilities are emergent, not engineered. Any lab that builds a sufficiently capable model gets them for free. The clock is ticking.

Danger #3

The Regulatory Vacuum

There is no legal framework for governing models this capable. Debate continues on whether existing laws provide sufficient authority.

The Geopolitical Dimension

The Pentagon Conflict

  • Dario Amodei refused Pentagon demands for mass domestic surveillance and autonomous weapons targeting
  • Trump admin placed Anthropic on a national security watchlist
  • Pentagon filed federal court appeal after ruling favored Anthropic
  • Sen. Warren called it "potential retaliation"

The Bigger Picture

  • White House convened meetings with Wall Street CEOs & tech leaders
  • Cross-agency assessment of AI capability acceleration
  • Anthropic: democratic nations must "maintain a decisive lead" in AI
  • Competing Super PACs: Anthropic (pro-regulation) vs. OpenAI (lighter touch)

Part Seven

What Happens Next

Expert Predictions

"AI capabilities have crossed a threshold that fundamentally changes the urgency required to protect critical infrastructure, and there is no going back." — Anthony Grieco, SVP & CSO, Cisco
"By giving maintainers of critical open source codebases access to AI that can proactively identify and fix vulnerabilities at scale, Glasswing offers a credible path to changing that equation." — Jim Zemlin, CEO, The Linux Foundation
Capabilities "will not be long before they proliferate, potentially beyond actors committed to deploying them safely." — Anthropic, Glasswing Announcement

The Near-Term Horizon

Likely (3–6 months)

  • Competing labs ship models with similar cyber capabilities
  • Project Glasswing publishes its 90-day findings
  • New Opus model with Mythos-derived safeguards
  • Cyber Verification Program rolls out

Uncertain (6–18 months)

  • Whether Mythos-class models become publicly available
  • Whether legislation catches up to capability
  • Whether the defender advantage holds
  • Whether open-source gets equitable access

The Equity Question

The Linux Foundation and Apache are in Project Glasswing.
But most open-source maintainers are volunteers with limited resources.

Mythos costs $25/$125 per million tokens.
Can the people who maintain the software the world runs on actually afford it?

This is where the $100M in credits and donations to Alpha-Omega, OpenSSF, and Apache matter.
But it's still a question of sustainability.

Conclusion

Separating Myth from Reality

The Verdict

The Myth
  • "AI will replace all security professionals"
  • "Mythos makes hacking trivial for anyone"
  • "Anthropic is being reckless"
  • "We can contain these capabilities"
The Reality
  • AI augments humans — but the skill floor is rising fast
  • Access restricted; replication needs frontier-lab resources
  • Anthropic showed more restraint than competitors likely would
  • Proliferation is when, not if — defense must outpace it

Three Things to Remember

1. The capability exists. It's not going away. Every major lab will have this within 12–18 months.
2. Defense must be AI-powered too. Manual security can't match AI-speed discovery.
3. Transparency and coordination — like Glasswing — are our best tools. Not secrecy.

Let's Revisit

Is an AI that can find thousands of zero-days
a net positive or net negative for security?

Did your answer change?

Thank you

Questions?

"Mythos" — from the Greek for "narrative."
The story isn't written yet. We get to decide how it ends.