OpenAI's GPT-5.5 Matches Mythos in Security Vulnerability Detection

"The UK’s AI Security Institute evaluated GPT-5.5’s ability to find security vulnerabilities, and found that it is comparable to Claude Mythos."

What the Institute evaluated

The UK’s AI Security Institute evaluated GPT-5.5’s ability to find security vulnerabilities and concluded that the OpenAI model is comparable to Claude Mythos at that task. The result is framed as a direct comparison between the two models; the source reports the conclusion without further technical detail.

GPT-5.5’s availability and why that matters

The source explicitly notes that the OpenAI model, GPT-5.5, is generally available. Coupled with the Institute’s finding of parity with Mythos, that availability means this level of vulnerability-discovery capability is not confined to a research preview or restricted beta: it is an accessible tool in the market today.

How GPT-5.5 compares to Claude Mythos

The Institute’s headline conclusion is parity: GPT-5.5 is "comparable to Claude Mythos" at finding security vulnerabilities. The source links to the Institute’s evaluation of Mythos for readers who want the underlying detail; in the short summary itself, the comparison is stated as the principal finding, without metrics or case examples.

Smaller, cheaper model analysis and prompt scaffolding

The source also references a separate analysis of a smaller, cheaper model. According to the source, that model "requires more scaffolding from the prompter, but it is also just as good." In other words, with additional prompting effort and guidance from a human operator, the smaller model reportedly achieved results comparable to those of the larger models.

What this means for technologists, procurement leaders, and adversaries

Technologists and security teams: The Institute’s evaluation places GPT-5.5 and Claude Mythos at a comparable level for automated vulnerability discovery, which gives teams a reference point when assessing such tools. Teams will likely evaluate access, workflow integration, and how much human prompt engineering is required, especially given the note that a smaller model can match performance when given more scaffolding.
Procurement leaders and buyers: That GPT-5.5 is "generally available" is a procurement detail that can materially affect purchasing choices and deployment timelines. Buyers can weigh licensing and operational costs against the Institute’s finding of comparable capability to Mythos.
Adversaries and threat actors: The source does not elaborate on intent, but the stated availability of an effective vulnerability-discovery model and the report that smaller models can reach similar performance with more human guidance highlight that capability is not confined to top-tier platforms alone.

The source’s short summary places four concrete points on the table: the UK’s AI Security Institute performed an evaluation; GPT-5.5 was judged comparable to Claude Mythos at finding security vulnerabilities; GPT-5.5 is generally available; and, according to a separate analysis, a smaller, cheaper model can match that performance with additional prompt scaffolding. Those elements (evaluation, parity, availability, and the role of prompt engineering) are the durable pieces that security teams, buyers, and analysts will have to reconcile as they decide how, or whether, to incorporate these tools into defensive processes.

Source: OpenAI’s GPT-5.5 is as Good as Mythos at Finding Security Vulnerabilities — Bruce Schneier’s blog