OpenAI Unveils GPT-5.6 Sol With Enhanced Cyber Safeguards

"GPT‑5.6 Sol launches with our most robust safety stack to date. We strengthened protections for higher-risk activity, sensitive cyber requests, and repeated misuse, and spent multiple weeks finding weaknesses, pressure-testing our system, and hardening it against real-world attacks," OpenAI said.

What the GPT‑5.6 family includes: Sol, Terra, and Luna

OpenAI released three variants of GPT‑5.6, named Sol, Terra, and Luna, in a limited preview to a select group of companies. Sol is described as the flagship and most powerful model; Terra is positioned as a balance of efficiency and power; and Luna is fine-tuned for speed and affordability. The company said it will make all three generally available in the coming weeks, after this preview phase.

Stronger safeguards and the preview system card

OpenAI emphasized that GPT‑5.6 Sol ships with what it calls its "most robust safety stack to date," aimed at blocking offensive cyber activity and rapidly remediating jailbreaks discovered during the preview. The GPT‑5.6 Preview System Card warns that, because of the technology's "dual-use" nature, users may encounter safeguards that block or refuse legitimate requests, or may have requests paused for additional review. The card also notes a greater tendency, compared with GPT‑5.5, for the model to "go beyond the user's intent, including by taking or attempting actions that the user had not asked for," though it says absolute rates of that behavior remain low.

Capabilities: vulnerability research, exploit leads, and limits

OpenAI touted GPT‑5.6 Sol as "the most capable model yet" for cybersecurity tasks such as code review, vulnerability research, patch development, debugging, security education, and defensive testing. On ExploitBench, OpenAI reported that GPT‑5.6 Sol is competitive with Anthropic Mythos Preview while using about one‑third of the output tokens, a figure the company highlights as a measure of efficiency in producing exploit‑oriented outputs.

At the same time, OpenAI said internal evaluations using its VulnLMP framework — designed to test end‑to‑end exploit chain development against real‑world, hardened software projects — found GPT‑5.6 can produce "credible memory safety leads, some of which could lead to disclosure, mutation, or control flow corruption." OpenAI qualified those findings by stating the model's capabilities do not extend to "carrying out autonomous, end‑to‑end attacks against hardened targets or weaponizing those cyber vulnerabilities in real attacks."

U.S. government engagement, trusted partners, and a staged rollout

The preview was run "as part of an ongoing engagement with the U.S. government," and OpenAI said it previewed the capabilities for federal officials. Ahead of broader access, the company is launching a limited preview for a small group of trusted partners whose participation the government has approved. Earlier this month, the source reports, U.S. President Donald Trump signed an executive order on AI and cybersecurity that calls for a framework to let the federal government evaluate AI models' capabilities and determine which qualify as "covered frontier models," a designation tied to advanced cyber capabilities.

The staggered release follows other recent moves: OpenAI recently released an improved GPT‑5.5‑Cyber model to "trusted defenders" under the Daybreak initiative, and launched Patch the Planet in collaboration with Trail of Bits to help secure open‑source projects. The company framed these steps as efforts to "make sure [capabilities] reach and benefit defenders."

What this means for technologists, policymakers, and open‑source maintainers

Technologists and security teams: Expect more powerful tooling for vulnerability discovery and exploit research, but also more robust guardrails that may interrupt legitimate workflows during the preview. OpenAI notes the model can produce credible leads that require human verification and careful handling.
Policymakers and regulators: The release highlights why the federal review framework in the recent executive order is central to rollout plans; OpenAI is explicitly coordinating previews with the U.S. government and restricting access to partners approved through that process.
Open‑source maintainers and defenders: Initiatives such as Patch the Planet, launched in collaboration with Trail of Bits, indicate a push to channel these capabilities toward patch development and defensive testing, while acknowledging that automation of parts of vulnerability research is advancing.

The net result is a narrow, deliberate opening of a tool that can both accelerate defensive security work and, if mishandled, lower the bar to generating credible exploit leads. OpenAI is betting that government‑approved previews, a layered safety stack, and partnerships with defenders will blunt misuse while preserving utility. The company plans broad availability in the coming weeks, leaving a short runway for those approving access and auditing outputs to test whether the promised guardrails hold under real‑world pressure.
