Claude AI Extension Flaw Enables Cross-Plugin Hijacking

"A flaw stems from an instruction in the extension’s code that allows any script running in the origin browser to communicate with Claude’s LLM, but does not verify who is running the script." — Aviad Gispan, LayerX

What LayerX found in the Claude Chrome extension

Browser security firm LayerX reported a vulnerability in Anthropic’s Claude Chrome extension that lets any other browser extension — including those that do not request special permissions — issue commands to Claude by invoking a content script. LayerX senior researcher Aviad Gispan wrote that the extension “does not verify who is running the script,” creating a channel for arbitrary scripts to communicate with the model.

LayerX described the bug as creating “a privilege escalation primitive across extensions, something Chrome’s security model is explicitly designed to prevent,” saying the flaw “effectively breaks Chrome’s extension security.”

How attackers could weaponize the extension

The researchers showed the exploit works in two linked ways. First, any extension can call a content script in the origin browser to issue prompts to Claude because the extension’s code accepts communication from any script. Second, Claude’s decision process depends on input it can read — text, user interface (UI) semantics and screenshots — all of which an attacker can control.

LayerX modified Claude’s UI to remove or alter labels and indicators around sensitive items (for example, passwords or sharing feedback) and then prompted Claude to take privileged actions. Those manipulations allowed the agent to perform actions that looked legitimate from inside the altered UI and to evade detection by defenders who monitor prompts alone. Where actions were visible, LayerX said the model could be instructed to delete emails and other evidence of its actions.

Proof-of-concept outcomes: data accessed and controls taken

In its proof of concept, LayerX reported it was able to:

Extract files from Google Drive folders and share them with unauthorized parties.
Surveil recent email activity and send emails on behalf of a user.
Pilfer private source code from a connected GitHub repository.

Those actions demonstrate cross-site behavior across multiple Google tools and connected services, the researchers said, and were possible without special extension permissions because content scripts can be invoked by extensions that request none.

Anthropic’s response and the partial mitigation

LayerX said it disclosed the vulnerability to Anthropic on April 27. According to LayerX, Anthropic replied the next day that the bug was a duplicate of another vulnerability already slated to be fixed in a future update. Anthropic issued a change on May 6 that “introduced new approval flows for privileged actions that made it harder to exploit the same flaw,” LayerX said.

Despite those changes, LayerX said it could still take over Claude’s agent in some scenarios. Gispan wrote that “switching to ‘privileged’ mode, even without the user’s notification or consent, enabled circumventing these security checks and injecting prompts into the Claude extension, as before.” Anthropic did not respond to a request for comment from CyberScoop on the research and mitigation efforts.

What this means for technologists, enterprises, and cybersecurity defenders

Technologists and security teams: The report highlights an attack vector that bypasses Chrome’s normal extension permission boundaries by leveraging content scripts and a permissive in-extension communication channel. Teams responsible for extensions or agent integrations will need to scrutinize how the agent accepts input from the page and validate the origin of scripts invoking the extension.
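To illustrate what validating the sender could look like, here is a minimal TypeScript sketch of a fail-closed check. It is not the Claude extension’s code, and the report does not detail the exact channel involved. The names SenderInfo, isTrustedSender, screenMessage, the placeholder extension ID and the agent.example origin are all hypothetical. The only real-world detail assumed is that Chrome hands runtime message listeners a sender object with optional id and origin fields.

```typescript
// Minimal sketch: fail-closed sender validation for an extension message handler.
// All names are hypothetical; this is not the Claude extension's code.

// Subset of the sender metadata Chrome passes to runtime message listeners
// (chrome.runtime.MessageSender exposes optional `id` and `origin` fields).
interface SenderInfo {
  id?: string;     // extension ID of the sender, when the sender is an extension context
  origin?: string; // origin of the page or frame the message came from
}

interface Verdict {
  accepted: boolean;
  reason: string;
}

// Placeholder values chosen only for illustration.
const OWN_EXTENSION_ID = "aaaabbbbccccddddeeeeffffgggghhhh";
const TRUSTED_ORIGINS: readonly string[] = ["https://agent.example"];

// Returns true only when both the extension ID and the origin are known and expected.
function isTrustedSender(
  sender: SenderInfo,
  ownId: string,
  trustedOrigins: readonly string[]
): boolean {
  // Reject anything that does not come from this extension's own contexts.
  if (sender.id !== ownId) return false;
  // Fail closed: a missing origin is treated as untrusted.
  if (sender.origin === undefined) return false;
  const origin = sender.origin;
  return trustedOrigins.some((allowed) => allowed === origin);
}

// Gate to run before any prompt is forwarded to the agent.
function screenMessage(sender: SenderInfo): Verdict {
  if (!isTrustedSender(sender, OWN_EXTENSION_ID, TRUSTED_ORIGINS)) {
    return { accepted: false, reason: "untrusted sender" };
  }
  return { accepted: true, reason: "sender verified" };
}
```

In a real extension, a gate like screenMessage would run inside the listener registered with chrome.runtime.onMessage.addListener, before the message body is read. A check of this kind addresses only the first half of the attack LayerX described. It does nothing about an agent acting on a page whose labels and indicators have been altered.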

Enterprises and procurement leaders: Businesses and governments that deploy Claude or similar agent-based browser extensions should reassess the trust they place in browser-side integrations and in the repositories and cloud drives connected to them. LayerX’s proof of concept showed that files in Google Drive and private GitHub repositories could be exposed and that email could be read and sent from a user’s account.

Cybersecurity defenders: The vulnerability underscores a point made by Ax Sharma of Manifold Security: monitoring at the prompt layer alone can be insufficient. Sharma called the research “a useful demonstration of why monitoring AI agents at the prompt layer is fundamentally insufficient,” and warned that the core risk is manipulation of the agent’s perceived environment so that its actions appear legitimate from the inside.

Closing observation

LayerX’s findings describe a class of risk that combines weaknesses in browser extension privilege boundaries with the opaque, environment-driven behavior of an AI agent. The company’s proof of concept showed cross-site data exfiltration and action-taking without elevated extension permissions, and the May 6 mitigation still left scenarios where “privileged” mode could be reached without user consent, according to LayerX. Two questions remain open: whether the approval flows Anthropic added can be tightened to remove any path for silent privilege escalation, and how defenders will detect agent actions that were produced by a manipulated UI rather than a malicious prompt.

Original CyberScoop story