OpenAI Bolsters ChatGPT Security With New Controls

"This looks really good to me," wrote cybersecurity expert and open-source developer Simon Willison on his blog over the weekend.

Lockdown Mode: an optional, outbound-focused control

OpenAI has introduced Lockdown Mode as an optional setting for ChatGPT that limits how far the model can reach into the web and external services. First offered to enterprise plans in February, Lockdown Mode began reaching personal and self-serve business accounts in early June. OpenAI framed the feature as one for users and organizations that handle sensitive data, not for the general public.

Blocking the exfiltration channel, not the malicious prompt

Lockdown Mode does not prevent malicious or hidden instructions from reaching the model. Researchers, OpenAI noted, have repeatedly shown how a single hidden instruction can pull data from a linked inbox or leak a user's conversations. Instead, the setting targets the last step attackers rely on: outbound network requests. By choking off those requests, Lockdown Mode aims to sever an attacker's route for exfiltrating data, a defense Simon Willison described as a practical approach. OpenAI implements the protection through deterministic controls that a manipulated model cannot override. Willison added, however, that the feature's existence implies default ChatGPT cannot fully block a determined exfiltration attempt.
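To illustrate what a deterministic control that a manipulated model cannot override can look like in general, here is a minimal Python sketch. It is not OpenAI's implementation, which the company has not published in this form. The allowlist, the host names, and the function names are all assumptions made for illustration. The point is only that the decision is made by fixed code outside the model, so injected text has no way to change it.

```python
from urllib.parse import urlparse

# Hypothetical allowlist. The article does not say which destinations,
# if any, remain reachable under Lockdown Mode.
ALLOWED_HOSTS = frozenset({"docs.example.com"})


def egress_allowed(url: str, lockdown: bool) -> bool:
    """Decide whether an outbound request may leave, using fixed rules only."""
    if not lockdown:
        return True
    parsed = urlparse(url)
    if parsed.scheme != "https":
        return False
    return parsed.hostname in ALLOWED_HOSTS


def run_tool_call(url: str, lockdown: bool) -> str:
    """Gate a model-requested fetch. No network request is made in this sketch."""
    # The check runs in ordinary code, outside the model, so text in the
    # prompt cannot alter the outcome.
    if not egress_allowed(url, lockdown):
        return "blocked"
    return "would fetch"
```

In this sketch, a model that has been tricked into requesting an attacker's URL with stolen data in the query string still gets "blocked" back, because the gate never consults the model's output when deciding.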

Trade-offs: what features pause under Lockdown Mode

The protection carries an operational cost. Live connector access and write actions switch off when Lockdown Mode is enabled, sidelining features explicitly named by OpenAI such as the Finances tool and shopping agents. Lockdown Mode also cannot run alongside Developer Mode. OpenAI aimed the control at users and organizations handling sensitive data, who may be willing to accept restricted functionality in exchange for reduced exfiltration risk.

Active Sessions: session auditing and its limits

Alongside Lockdown Mode, OpenAI added Active Sessions to ChatGPT's security settings, giving users the ability to audit where their account is logged in. Each session entry can show device or browser details, an approximate location and sign-in time, which first-party app was used (for example, ChatGPT or Codex), and whether the device is trusted or represents the current session. Users can end a single session or sign out everywhere at once; OpenAI warns that a full sweep can take up to 30 minutes. If an entry looks unfamiliar, OpenAI advises changing the password, reviewing sign-in methods, and contacting support.
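For readers who think in data structures, the session details described above can be modeled roughly as follows. This is a sketch only: OpenAI exposes Active Sessions as a settings screen, and the article does not describe any API, so every field and function name here is hypothetical. The review rule (not current, not trusted, unfamiliar location) is likewise an assumption about how someone might triage the list by eye.

```python
from dataclasses import dataclass
from datetime import datetime


@dataclass(frozen=True)
class SessionEntry:
    # Hypothetical field names that mirror the details the article lists.
    device: str
    approx_location: str
    signed_in_at: datetime
    app: str  # first-party app, for example "ChatGPT" or "Codex"
    trusted: bool
    is_current: bool


def sessions_to_review(entries, known_locations):
    """Return entries that are not current, not trusted, and not from a known location."""
    return [
        e
        for e in entries
        if not e.is_current
        and not e.trusted
        and e.approx_location not in known_locations
    ]
```

Anything this filter returns would be a candidate for the steps OpenAI recommends: end the session, change the password, review sign-in methods, and contact support.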

Service constraints: SSO accounts and third-party sessions

Active Sessions has notable gaps for larger organizations. The feature is unavailable on accounts that use single sign-on (SSO), including SAML and OpenID Connect. It also does not track third-party app sessions or Codex CLI logins, leaving those access paths outside the new session audit capability.

What this means for technologists, procurement leaders, and end users

Technologists and security teams: Lockdown Mode offers a deterministic control that severs outbound requests, which Simon Willison assessed as a practical mitigation against prompt-injection exfiltration. Teams will have to weigh that protection against losing live connectors and write actions, including tools such as the Finances tool and shopping agents, and against the inability to combine Lockdown Mode with Developer Mode.
Procurement and organizational leaders: OpenAI pitched Lockdown Mode toward users and organizations that handle sensitive data. Procurement decisions will hinge on whether the loss of functionality is an acceptable price for reduced risk, and on whether organizations that rely on SSO need alternative session monitoring, since Active Sessions is unavailable on those accounts.
End users: Active Sessions gives individuals a way to audit sign-ins and to terminate sessions, but the feature does not cover SSO-based accounts, third-party app sessions, or Codex CLI logins. Users who see unfamiliar sessions are directed to change passwords, review sign-in methods, and contact support.

OpenAI's changes acknowledge a clear, demonstrated attack vector, hidden or injected instructions that can lead to data leakage, and respond by cutting the exfiltration route rather than attempting to parse or sanitize each malicious prompt. That design choice forces a familiar security calculus: reduce capability to reduce risk. How widely organizations accept those limits, and how they fill the visibility gaps left by SSO and third-party access, will determine whether the controls materially change the threat landscape for ChatGPT users who handle sensitive data.
