Breach Exposes Anthropic's AI Model Vulnerability

"Unauthorized users were able to access Anthropic’s Mythos model, reportedly by just changing a model name," Shane Fry, Chief Technology Officer at RunSafe Security, told reporters.

Anthropic and the Claude Mythos Preview

Anthropic’s Claude Mythos Preview is described in recent reporting as a new AI model with an elevated capability to identify digital vulnerabilities. Because of that capability, access to Mythos was limited to a select group of partners in an initiative called Project Glasswing. New reports indicate an unauthorized group may have gained access to the model despite those limits.

Bloomberg report and private online forum activity

Bloomberg News reported that users in a private online forum accessed Mythos on the same day the company announced a limited release of the model. The public report raised concern among security experts because the model’s ability to surface digital weaknesses is the precise reason access was supposed to be restricted.

Investigation focuses on third-party vendor environments

Anthropic confirmed to CBS News that an investigation into the matter is underway. According to the company, the inquiry is examining whether the access originated in the environment of one of the third-party vendors it works with to develop its models. Anthropic has said that, so far, no breaches outside the vendor environment have been identified and no compromise of Anthropic systems has been detected.

RunSafe Security’s warning and the mechanics alleged

Shane Fry framed the reported access as evidence of how small changes can expose otherwise restricted systems. "Even if their intent is just to explore, it shows how easily these systems can be exposed," Fry said, after describing the reported access as achievable "by just changing a model name." He added a broader caution: "The reality is these AI capabilities are already out there, ‘hacked’ or not, and they’re going to accelerate how quickly vulnerabilities are found and exploited."

What this means for technologists, third‑party vendors, and end users

Technologists and security teams: Fry’s assessment directs them to reassess how models are exposed and to "look at how to harden their code so those vulnerabilities can’t be used in the first place." The reported method of access, a simple model-name change, is likely to be an immediate focus for defensive reviews.
Third‑party vendors and procurement leaders: Anthropic’s statement that the investigation is examining a vendor environment puts partner environments in the spotlight. Those vendors can expect scrutiny as investigators and clients seek to determine whether the access originated in a third‑party system.
End users and the general public: The claim that such AI capabilities are "already out there, ‘hacked’ or not" underscores a reality the public may face: models limited to partner programs can still surface beyond their intended boundaries, which changes how capabilities spread and how quickly vulnerabilities become known.
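
For defensive reviews of the kind described above, one common safeguard is to check on the server which models each caller is entitled to, so that editing the model name in a request cannot by itself unlock a restricted model. The following is a minimal sketch of that idea only. It does not describe Anthropic’s systems or the actual mechanism of the reported access, and every name in it (the keys, the model names, `ENTITLEMENTS`, `authorize`) is a hypothetical chosen for illustration.

```python
from dataclasses import dataclass

# Hypothetical entitlement table: which caller may use which model name.
# All keys and model names are illustrative, not taken from any real deployment.
ENTITLEMENTS: dict[str, frozenset[str]] = {
    "partner-key": frozenset({"general-model", "restricted-model"}),
    "public-key": frozenset({"general-model"}),
}


class ModelAccessDenied(Exception):
    """Raised when a caller requests a model it is not entitled to."""


@dataclass(frozen=True)
class Request:
    api_key: str
    model: str


def authorize(request: Request) -> str:
    """Return the requested model name only if this caller is entitled to it.

    Unknown keys map to an empty set, so the check is deny-by-default.
    """
    allowed = ENTITLEMENTS.get(request.api_key, frozenset())
    if request.model not in allowed:
        raise ModelAccessDenied(f"{request.model!r} is not enabled for this caller")
    return request.model
```

In this sketch, a request from `public-key` that names `restricted-model` raises `ModelAccessDenied`, while the same request from `partner-key` passes. The design choice is that the decision depends on the caller’s entitlements rather than on the model name being hard to guess.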

Conclusion

The reports leave two concrete threads: a Bloomberg account that users in a private forum accessed Mythos on the day the limited release was announced, and Anthropic’s confirmation to CBS News that an investigation is underway with a focus on a third‑party vendor environment. Anthropic says it has not identified breaches outside that vendor environment and has not detected a compromise of its own systems. Security leaders, partner vendors and the public will be watching the investigation’s findings closely, in particular whether the access stemmed from a simple model-name change, as Shane Fry described, and what corrective steps follow.

Original report: https://www.securitymagazine.com/articles/102251-unauthorized-users-accessed-claude-mythos-new-reports-suggest