Anthropic's AI Model Exposes New Vulnerability Risks

“Anthropic announced that its new model, Claude Mythos Preview, can autonomously find and weaponize software vulnerabilities, turning them into working exploits without expert guidance.” The company made that claim two weeks ago, and the security community is still sorting out what it means.

Anthropic’s claim and the limited release

Anthropic said Claude Mythos Preview can locate and weaponize vulnerabilities in major software, including operating systems and internet infrastructure, that thousands of developers working on those systems failed to find. The announcement, according to the authors of an essay on the release, “rocked the internet security community.” Anthropic is not releasing Mythos to the general public; instead, it is making the model available to a limited number of companies. The company provided few technical details, which angered many observers. Some speculate that the release limits are motivated by GPU constraints, while others view the restriction as consistent with an AI safety posture.

Why the authors call this an incremental but important step

The essay’s authors describe Mythos as “a real but incremental step, one in a long line of incremental steps.” They emphasize shifting baseline syndrome, the tendency to discount large changes that arrive through many small steps. Today’s large language models excel at finding vulnerabilities in source code, the authors note, while models from five years ago could not do so. Even if last year’s models might have found similar vulnerabilities, the broader point is that the baseline has shifted: capability that was once out of reach is now within view.

Which systems are most exposed: patchable, unpatchable, and hard-to-verify

The authors lay out a simple taxonomy to clarify risk. Some systems are patchable and easy to verify — for example, “generic cloud-hosted web applications built on standard software stacks, where updates can be deployed quickly.” Others are hard or impossible to patch: IoT appliances and industrial equipment that are rarely updated or cannot be easily modified fall into this category. A third class includes distributed cloud platforms and complex systems composed of thousands of interacting services; their complexity makes it difficult to separate real vulnerabilities from false positives and to reproduce exploits reliably.

The practical implication is that the balance between offense and defense is not uniform across systems: “Some vulnerabilities can be found, verified, and patched automatically,” the authors write, while others will remain stubbornly exposed because they are hard to patch or hard to verify.
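The three categories can be pictured as a small triage function. The sketch below is illustrative only: the `System` fields and the `classify` rule are assumptions made for this example, not something the essay specifies, and real systems will rarely sort this cleanly.

```python
from dataclasses import dataclass
from enum import Enum


class Exposure(Enum):
    PATCHABLE = "patchable and easy to verify"
    HARD_TO_PATCH = "hard or impossible to patch"
    HARD_TO_VERIFY = "hard to verify"


@dataclass(frozen=True)
class System:
    # Both boolean fields are hypothetical attributes chosen for this sketch.
    name: str
    can_deploy_updates_quickly: bool
    findings_reproducible: bool


def classify(system: System) -> Exposure:
    """Place a system into one of the three categories described above."""
    # A system that cannot be updated stays exposed however well a bug is verified.
    if not system.can_deploy_updates_quickly:
        return Exposure.HARD_TO_PATCH
    # Updates are possible, but real bugs are hard to separate from false positives.
    if not system.findings_reproducible:
        return Exposure.HARD_TO_VERIFY
    return Exposure.PATCHABLE
```

Checking patchability first is a design choice for this sketch: a device that cannot be updated remains a problem regardless of how easy its bugs are to confirm.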

Defensive practices the authors expect to matter

The essay recommends returning to, and extending, foundational security practices rather than abandoning them. For unpatchable or hard-to-verify devices, the authors argue these should be “protected by wrapping them in more restrictive, tightly controlled layers” — for example, placing fridges, thermostats, and industrial control systems behind restrictive and constantly updated firewalls. For complex distributed systems, the authors emphasize traceability and the principle of least privilege so each component has only the access it needs.
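As a rough illustration of the least-privilege idea, the following sketch compares what each component has been granted with what it needs and reports the excess. The component names, permission strings, and the `excess_privileges` helper are all hypothetical; the essay does not prescribe any particular tooling.

```python
from typing import Dict, Set


def excess_privileges(
    granted: Dict[str, Set[str]],
    needed: Dict[str, Set[str]],
) -> Dict[str, Set[str]]:
    """Return, per component, the permissions it holds but does not need."""
    report: Dict[str, Set[str]] = {}
    for component, perms in granted.items():
        # A component with no declared needs is treated as needing nothing.
        extra = perms - needed.get(component, set())
        if extra:
            report[component] = extra
    return report


if __name__ == "__main__":
    # Hypothetical inventory for demonstration purposes only.
    granted = {
        "billing": {"db:read", "db:write", "queue:publish"},
        "thumbnailer": {"storage:read", "storage:write", "db:write"},
    }
    needed = {
        "billing": {"db:read", "db:write", "queue:publish"},
        "thumbnailer": {"storage:read", "storage:write"},
    }
    print(excess_privileges(granted, needed))
```

In this made-up inventory, only the thumbnailer is flagged, because it holds a database write permission that it has no stated need for.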

They also foresee defensive AI agents becoming standard. In automated, continuous testing, which the essay labels “VulnOps,” defensive agents exercise exploits against a real stack repeatedly until false positives are eliminated and fixes are confirmed. Documentation gains value because it guides AI agents on bug-finding missions in the same way it guides developers. Standard practices, tools, and libraries help both humans and AI recognize patterns, even as “instant software” (code generated and deployed on demand) becomes more common.
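The retest loop at the heart of that idea can be sketched in a few lines. This is a minimal sketch under stated assumptions: `reproduce` is a hypothetical callable standing in for whatever actually runs an exploit attempt against a test stack, and the attempt count is arbitrary. It is not a description of any real VulnOps product.

```python
from typing import Callable, Iterable, List, Tuple


def triage(
    findings: Iterable[str],
    reproduce: Callable[[str], bool],
    attempts: int = 3,
) -> Tuple[List[str], List[str]]:
    """Split findings into confirmed issues and likely false positives.

    A finding counts as confirmed if any attempt reproduces it.
    """
    confirmed: List[str] = []
    false_positives: List[str] = []
    for finding in findings:
        if any(reproduce(finding) for _ in range(attempts)):
            confirmed.append(finding)
        else:
            false_positives.append(finding)
    return confirmed, false_positives


def fix_is_confirmed(
    finding: str,
    reproduce: Callable[[str], bool],
    attempts: int = 3,
) -> bool:
    """After a patch, treat the fix as confirmed only if no attempt reproduces it."""
    return not any(reproduce(finding) for _ in range(attempts))
```

Running `triage` before patching and `fix_is_confirmed` afterward mirrors the two goals named above: discarding false positives and confirming fixes. A real pipeline would also need isolation, logging, and handling for flaky exploits, none of which is shown here.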

What this means for technologists, enterprises, and everyday users

Technologists and security teams: Expect to adopt continuous, AI-driven vulnerability testing (VulnOps) and to enforce least-privilege architectures and traceability in distributed systems so verification is possible even when AI can find more bugs.
Enterprises and procurement leaders: The authors flag that many connected legacy systems (“cars, electrical transformers, fridges, and lampposts”), along with legacy banking and airline systems, are networked but will not be patched as quickly as easier-to-update systems. These organizations should anticipate a period of elevated exploitation until verification and continuous patching become routine.
End users and the public: Phones, web browsers, and major internet services are among the systems the authors believe will ultimately favor defense because they are easy to patch; but consumers should be aware that many everyday connected devices and infrastructure elements may lag and remain vulnerable for years.

The essay closes with a sober forecast: we may endure “a few years of constant hacks until we arrive at a new normal” in which verification is paramount and software is patched continuously. The essay, co-written with Barath Raghavan, originally appeared in IEEE Spectrum.

Read the original post: https://www.schneier.com/blog/archives/2026/04/what-anthropics-mythos-means-for-the-future-of-cybersecurity.html