Mythos Inside: How a Discord Group Walked Into Anthropic’s ‘Too Dangerous to Release’ Model Through Mercor’s Backdoor

A viral tweet this week framed it as the nightmare scenario:

🚨 BREAKING: Anthropic’s Most Dangerous Model Ever Breached By Hackers — anthropic builds a cyberweapon — calls it mythos — “can hack every major OS and browser” — dario: “we’re the safe & responsible ai lab” — “can’t release it to the public” — Mercor (their training contractor) gets breached — leaks anthropic’s model naming conventions — hackers guess the URL pattern — contractor credentials still work — they’re inside

It’s the kind of summary that sounds too clean to be real. It’s also — minus the cyberweapon framing — almost exactly what happened.

What the tweet gets wrong is the headline. Mythos isn’t a cyberweapon Anthropic built and lost control of. It’s a general-purpose frontier model whose offensive cybersecurity capability turned out to be high enough that Anthropic decided not to release it commercially at all. Big difference, even if the operational result — unauthorized parties sitting at the wheel of one of the most capable vulnerability-discovery models ever shipped — is the same.

What it gets right is the chain. And the chain is the actual story, because every link in it was a known, documented, named-in-our-prior-coverage failure mode. None of this was novel. The entire breach reads like a checklist of supply chain attack patterns we’ve been writing about for months.

Let’s walk through it.

What Mythos actually is, and why Project Glasswing exists

In late March 2026, Anthropic accidentally exposed a draft blog post in a publicly accessible content store. The draft described an unreleased frontier model — internally codenamed Capybara, publicly Claude Mythos — and warned that it was, in the company’s own framing, far ahead of any other AI model on cyber capability. Cybersecurity stocks tanked on the news: CrowdStrike dropped roughly 7.5%, Palo Alto Networks over 6%, Zscaler and Okta in the 5–8% range. Wall Street priced in something serious before Anthropic had even formally announced the product.

On April 7, 2026, Anthropic made it official with Project Glasswing — a defensive-only coalition of 12 launch partners getting controlled access to Claude Mythos Preview: AWS, Apple, Broadcom, Cisco, CrowdStrike, Google, JPMorgan Chase, the Linux Foundation, Microsoft, NVIDIA, Palo Alto Networks, and Anthropic itself. An additional 40 critical-infrastructure organizations were granted access to use the model on their own first-party and open-source systems. Anthropic committed up to $100 million in usage credits and $4 million in direct donations to open-source security organizations.

The pitch: Mythos has already autonomously found thousands of zero-day vulnerabilities, including a 27-year-old integer overflow in OpenBSD and a 17-year-old remote code execution flaw in FreeBSD’s NFS implementation (CVE-2026-4747). Mozilla used a preview to identify and patch 271 vulnerabilities in Firefox. The company explicitly said it does not plan to make Mythos generally available because of dual-use risk. That position has been consistent across the technical write-up on red.anthropic.com, the Project Glasswing announcement, and Newton Cheng’s interviews with VentureBeat as Frontier Red Team Cyber Lead.

We’ve already covered the dilemma at length on the CISO Marketplace blog — the strategic question of how you govern a capability that’s too useful to lock away and too dangerous to ship. See Project Glasswing & Claude Mythos: The Cybersecurity Watershed and the follow-up on the market reaction in The Mythos Leak Cybersecurity Stocks Crash. The CISO Insights podcast unpacked the full arc — leaks, lawsuits, IPO pressure — in The Mythos Paradox.

That’s the model the Discord group walked into.

What happened: the unauthorized access, by the numbers

According to Bloomberg’s reporting, picked up by Fortune, Euronews, and others over the last several days:

  • A private Discord channel that monitors and discusses unreleased AI models accessed Mythos on launch day, April 10, 2026.
  • The group has been using the model continuously since then — not in a one-off probe, but as ongoing daily access.
  • One of the Discord participants is a third-party contractor for Anthropic, whose credentials still functioned.
  • The group’s entry path was guessing the deployment URL and naming pattern Anthropic used for the Mythos preview environment — guesses informed by prior knowledge of Anthropic’s file-system conventions and naming formats.
  • Anthropic’s official statement: the company is investigating a report claiming unauthorized access to Claude Mythos Preview through one of its third-party vendor environments, with no current evidence that Anthropic’s own systems were impacted or that activity extended beyond the vendor environment.

That last sentence is doing a lot of work. The vendor environment in question is the lever. The naming-format knowledge those URL guesses depended on came from somewhere specific.

It came from Mercor.

Mercor is an AI recruitment startup valued at $10 billion as of October 2025, contracting domain experts — doctors, lawyers, scientists — to train frontier AI models for OpenAI, Anthropic, and Meta, among others. The company facilitates over $2 million in daily contractor payouts and reportedly cleared $500 million in revenue.

On March 31, 2026, Mercor confirmed it was one of thousands of organizations impacted by a supply chain compromise of LiteLLM, an open-source AI gateway that routes API calls between models and is downloaded roughly 95 million times per month. The Lapsus$ extortion group claimed responsibility for stealing approximately 4 terabytes of Mercor data, listing it for auction on a dark web leak site.

The cache, per Lapsus$ and corroborating reporting from CybersecurityNews, The Register, SecurityWeek, and The CyberSec Guru, allegedly included:

  • 939 GB of Mercor platform source code
  • A 211 GB user database — candidate profiles, resumes, contact info, Social Security numbers from W-9 forms, ID documents
  • Roughly 3 TB of video interview recordings and identity verification documents — including passport and driver’s license scans, plus the facial biometric data Mercor used to match faces to IDs
  • Internal Slack communications and ticketing data
  • Tailscale VPN configurations
  • API keys, cloud credentials, SSH keys, database passwords, Kubernetes configs

That last category is what mattered for Anthropic. Mercor’s role as a training contractor for multiple frontier labs meant its environment held metadata, internal workflow patterns, and infrastructure conventions related to those labs. Per the reporting, the group that later compromised Mythos used naming-format knowledge — the predictable patterns Anthropic uses when standing up new model environments — that had been exposed in the Mercor leak. They didn’t need to brute-force the URL. They knew what shape it would take.

Combine that with a contractor credential that should have been rotated and wasn’t, and you have unauthorized access on day one of a model Anthropic publicly framed as the most capability-restricted release in its history.

We covered the upstream half of this story in detail when LiteLLM itself was first compromised: LiteLLM Compromised: How a Poisoned Security Scanner Backdoored 97 Million Downloads — the full reconstruction of how TeamPCP poisoned the Trivy security scanner via a misconfigured GitHub Actions workflow, used the stolen aqua-bot Personal Access Token to force-push malicious commits to 76 release tags, and rode that compromise downstream into LiteLLM’s PyPI publishing token. Versions 1.82.7 and 1.82.8 went up for roughly 40 minutes carrying a three-stage credential-harvesting backdoor. That window was enough.

The same campaign also produced Lapsus$ Claims 3GB AstraZeneca Hack and the source-code theft we covered in Cisco Source Code Stolen: How the Trivy Supply Chain Attack and ShinyHunters Cracked a Networking Giant. Mercor, AstraZeneca, Cisco, and now indirectly Anthropic — all of them downstream of the same poisoned scanner.
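
For teams checking their own exposure, the two LiteLLM version numbers named above are the concrete thing to look for. What follows is a minimal sketch in Python, assuming dependencies are pinned in pip's `name==version` requirements format. The function name and the sample input are illustrative, and a real check would also need to cover lockfiles, container images, and transitive dependencies.

```python
import re

# Versions named in the reporting above as carrying the backdoor.
COMPROMISED_LITELLM = {"1.82.7", "1.82.8"}

# Matches lines such as "litellm==1.82.7" or "litellm[proxy] == 1.82.8".
_PIN = re.compile(r"^\s*litellm(\[[^\]]*\])?\s*==\s*([0-9][^\s;#]*)", re.IGNORECASE)


def find_compromised_pins(requirements_text: str) -> list[str]:
    """Return the requirement lines that pin a known-bad LiteLLM release."""
    hits = []
    for line in requirements_text.splitlines():
        match = _PIN.match(line)
        if match and match.group(2) in COMPROMISED_LITELLM:
            hits.append(line.strip())
    return hits


if __name__ == "__main__":
    # Hypothetical requirements file contents, for illustration only.
    sample = "requests==2.32.3\nlitellm==1.82.7\nlitellm[proxy]==1.81.0\n"
    for hit in find_compromised_pins(sample):
        print("rotate credentials and rebuild:", hit)
```

A match only says a bad version was pinned, not that it was installed during the exposure window, so a hit is a prompt for credential rotation and log review rather than proof of compromise.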

The compounding ironies

If you read this carefully, there’s an irony stacked on every layer:

  1. Anthropic publicly markets Mythos as the model that finds vulnerabilities in everyone else’s software. Then it gets reached through a weakness in its own contractor management: naming conventions that should never have been guessable in the first place, plus credentials that should have been rotated the moment Mercor’s breach was confirmed on March 31, ten days before the access began.

  2. The compromise of Mercor came through LiteLLM, whose own compromise came through Trivy, a security scanner. Both LiteLLM and Mercor displayed SOC 2 Type I and ISO 27001 badges issued by Delve, a compliance-as-a-service startup that Strikegraph, TechCrunch, and a public whistleblower have since described as having sold what one source called fake compliance. The badges meant nothing. The audit had been mechanized into a rubber stamp.

  3. The model that found a 27-year-old OpenBSD bug in days was reached because a contractor’s credentials sat unrotated for at least 30 days after a public 4 TB breach disclosure. No model needed. Just credential hygiene that wasn’t done.

  4. Anthropic deliberately limited Mythos to 12 launch partners and roughly 40 additional organizations to control the blast radius. Cybersecurity analyst Ian Lindner told Fortune that even at that scale, with each partner organization granting access internally to its own teams, you’re looking at thousands of individual humans with credentials. The leak wasn’t a question of if. It was a question of which contractor, working at which downstream vendor, would have their identity used first.

This is the same pattern we’ve documented across the Scattered Lapsus$ Hunters campaigns of 2025 — the Salesforce-Gainsight breach, the Qantas breach, the Salesloft Drift OAuth attacks. Sophisticated targets defended at the perimeter, defeated at the third party. The ShinyHunters retrospective we ran in August 2025 traced exactly this evolution: from opportunistic data thieves to sophisticated supply chain operators.

Threat actor calibration: who actually has access?

Bloomberg’s source describes the group as a Discord channel that seeks information about unreleased AI models — leak-watchers, model-tasters, the sort of community that catalogs unreleased weights and probes API endpoints. Per the reporting, the group has not been using Mythos for cyberattacks. They’ve been using it.

That’s not as reassuring as it sounds. Three reasons:

  • The same access path that worked for them works for whoever else figures it out. The Mercor data is sitting in attacker hands and has been auctioned in fragments. Lapsus$ communications referenced other targets under the codenames Athena and Aphrodite. The naming-pattern intelligence is not contained to one Discord.

  • Bloomberg also reports the group has access to other unreleased Anthropic models. Not just Mythos. The whole pipeline, in the words of one summary. If true, this is a sustained presence in vendor environments, not a single misconfiguration.

  • The framing of “they’re not attacking, just using it” relies on the group’s stated behavior, not on Anthropic’s logs of that behavior. By Anthropic’s own statement, the activity stayed in the third-party vendor environment, meaning Anthropic’s primary visibility is into what the vendor logged, which is exactly the kind of telemetry that supply chain attacks are designed to operate beneath.

For context on how AI agent platforms in particular fail at this kind of containment, see our coverage of The Lethal Trifecta: Four Major AI Agent Vulnerabilities in Five Days. That January 2026 piece walked through how Anthropic’s Claude Cowork, IBM Bob, Notion AI, and Superhuman AI all fell to the same architectural pattern (private data access, untrusted content, external communication paths) within a single week. The Cowork vulnerability had been reported to Anthropic in October 2025 and remained unfixed at launch. The pattern of velocity over remediation isn’t new at Anthropic. It’s just at a higher-stakes layer now.

What this means for defenders

If you’re a CISO, a security architect, or anyone responsible for vendor risk management, the Mythos breach offers the cleanest case study in years for several lessons we keep writing about and that keep going unimplemented:

Naming conventions are intelligence. Predictable URL patterns, consistent environment names, and sequential staging-prod naming are a real gift to attackers when combined with credential reuse. Unpredictable names are not a security control on their own, and obscurity is no substitute for authentication. But if your naming is guessable from a sample of one or two leaked endpoints, you’re trusting authentication alone to hold. Mythos shows what happens when authentication doesn’t.
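
To make the guessability point concrete, here is a minimal sketch comparing a predictable naming scheme with one that carries a random component. The hostname format, the `example.internal` domain, and the model and stage names are all hypothetical. Nothing here reflects Anthropic's actual conventions, which the reporting does not describe in detail.

```python
import math
import secrets


def predictable_candidates(models: list[str], stages: list[str]) -> list[str]:
    """Enumerate every hostname an attacker could guess from a known pattern."""
    return [f"{m}-{s}.example.internal" for m in models for s in stages]


def random_env_name(model: str, nbytes: int = 16) -> str:
    """Build a hostname whose suffix is drawn from the OS CSPRNG."""
    return f"{model}-{secrets.token_hex(nbytes)}.example.internal"


def guess_space_bits(n_candidates: int) -> float:
    """Express the size of a guess space as bits of entropy."""
    return math.log2(n_candidates)


if __name__ == "__main__":
    # Hypothetical inputs: two model names, four deployment stages.
    guesses = predictable_candidates(
        ["model-a", "model-b"], ["dev", "staging", "preview", "prod"]
    )
    print(len(guesses), "guesses cover the predictable scheme")
    print(f"{guess_space_bits(len(guesses)):.0f} bits, versus {16 * 8} bits for a 16-byte random suffix")
    print(random_env_name("model-a"))
```

A random suffix only raises the cost of finding the endpoint. It does nothing once a valid credential and the real URL are both in the wrong hands, which is why it complements rotation and does not replace it.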

Vendor breaches are your breaches, with a 30-day fuse. Mercor’s breach was confirmed publicly on March 31. Mythos launched April 10. That ten-day window (let alone the four weeks between Lapsus$’s initial leak listing and the Mythos compromise) was the exact period in which every Mercor-touching organization should have rotated credentials, audited contractor access, and revoked dormant sessions. The fact that contractor credentials still worked when the Discord group walked in suggests this wasn’t done at the relevant vendor environment. The cost was access to Anthropic’s most controlled model.
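
The rotation check described here is simple enough to sketch. The following Python example assumes you keep an inventory of credentials tagged with the vendor they were issued through and the date they were last rotated. The `Credential` fields, the owner names, and the sample dates other than the March 31 disclosure are hypothetical.

```python
from dataclasses import dataclass
from datetime import date


@dataclass(frozen=True)
class Credential:
    owner: str          # hypothetical contractor or service identity
    vendor: str         # vendor environment the credential was issued through
    last_rotated: date  # most recent rotation (or issue) date


def stale_after_breach(
    creds: list[Credential], vendor: str, disclosed: date
) -> list[Credential]:
    """Return credentials tied to `vendor` that were last rotated before `disclosed`."""
    return [c for c in creds if c.vendor == vendor and c.last_rotated < disclosed]


if __name__ == "__main__":
    disclosed = date(2026, 3, 31)  # Mercor's public confirmation, per the timeline above
    inventory = [
        Credential("contractor-1", "mercor", date(2026, 1, 15)),
        Credential("contractor-2", "mercor", date(2026, 4, 2)),
        Credential("contractor-3", "other-vendor", date(2025, 11, 3)),
    ]
    for cred in stale_after_breach(inventory, "mercor", disclosed):
        print("revoke or rotate:", cred.owner)
```

The sketch treats a rotation on the disclosure date itself as fresh. A stricter policy might use the estimated date of compromise rather than the date of disclosure, since the attacker's window opens earlier than the public's.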

Compliance certificates issued by compromised auditors are worse than no certificates at all. If your procurement process treats SOC 2 Type I and ISO 27001 badges as load-bearing signals — and your auditor turns out to be an automated stamp factory — you’ve been doing two things wrong simultaneously: trusting compliance as a proxy for security, and not validating the auditor. The Delve story we’ll be covering in more depth shortly is the broader version of this. For background on how compliance frameworks ought to function, see ComplianceHub.wiki on supply chain security requirements.

The blast radius of frontier AI vendor compromise is now strategic, not just commercial. A model that can autonomously generate working exploits in 83% of vulnerability discovery tasks is — in the wrong hands — a force multiplier across every CVE pipeline, every internal audit, every product line a Mercor-adjacent organization touches. The economics of a single contractor credential just changed. The Center for AI Policy and Anthropic’s own Frontier Red Team have been making this argument for months. The Mythos breach is the empirical demonstration.

Discord and forum communities are now intelligence channels, not just chat. The group that accessed Mythos was specifically organized around tracking unreleased AI models. This is consistent with the broader trend we documented in The Self-Reinforcing Supply Chain Cybercrime Economy — informal, distributed, fast-moving collectives with deep technical capability, increasingly indistinguishable from coordinated threat actor groups in their effects.

Where this goes from here

Three open questions, in order of how much they should worry you:

One: how many other Mercor-downstream environments still have unrotated contractor credentials? OpenAI is also a Mercor customer. So is Meta, which has reportedly paused its Mercor contracts pending forensic review. Multiple class-action lawsuits — Gill v. Mercor.io, Deboni v. Mercor, Esson v. Mercor — are working through the Northern District of California, with at least one naming BerriAI (LiteLLM’s creator) and Delve Technologies as co-defendants. The contractor PII exposure alone is going to drive years of litigation. The model-environment exposure may drive more.

Two: what does Anthropic’s response to this look like? The company has the most committed AI safety messaging in the industry. Project Glasswing is a credible attempt to govern dual-use capability through structural firebreaks. But the Mythos breach happened in spite of all of that, through a vendor-management failure that has nothing to do with model architecture or alignment. The response — credential rotation policies, vendor environment hardening, blast-radius containment for partner-tier access — is going to have to be operational, not philosophical. And it’s going to have to be visible.

Three: what happens when the next Mythos-class model ships, and a dozen Mercors are upstream of it? Anthropic has explicitly said it will build new safeguards into an upcoming Claude Opus model before any broader deployment of Mythos-class capability. That’s the right order of operations. But the lesson of this incident isn’t model-side — it’s contractor-side. The same coalition structure, the same vendor sprawl, the same procurement-velocity-over-security-rigor pattern that made Mercor a soft target makes the next vendor a soft target too. Until the AI supply chain treats credential lifecycle and naming convention discipline as security-critical rather than ops-hygiene, every frontier model launch is going to ship with this same kind of latent exposure baked in.

The viral tweet ended with “they’re inside.” Three words.

It’s correct. The thing it leaves out is that “inside” was made of about six different organizations, none of which were named Anthropic, and the breach happened mostly through doors that other people forgot to close.

That’s the actual story. And it’s the one CISOs should be sitting their boards down to read.



Breached.Company covers data breaches, supply chain compromises, and incident response analysis for the security community. If you’re responsible for vendor risk management, AI supply chain security, or third-party access governance and need help building the controls this breach made obvious, CISO Marketplace connects you with vetted advisors and tooling.