AI Coding Agents Exposed to Agentjacking Attack

"The attacker never touches the victim's infrastructure," Ron Bobrov, Barak Sternberg, and Nevo Poran of Tenet Security warned — a crisp summation of a new, data-only attack that researchers say can trick AI coding assistants into executing attacker-controlled code on developers' machines.

How Agentjacking works, step by step

Tenet Security has named the technique "Agentjacking" and published a clear attack chain that leverages Sentry, an open-source error-tracking and performance-monitoring platform. The researchers describe the chain in ordered steps:

An attacker locates a target's Sentry Data Source Name (DSN), a public, write-only credential embedded in websites.
Using the DSN, the attacker sends a malicious error event to Sentry's ingest endpoint via a POST request.
The injected event includes "carefully formatted markdown" placed in the message field and context key names so that, when returned by Sentry's MCP server, it is rendered as structured content visually identical to Sentry's system template.
When a developer tells an AI coding agent to "fix unresolved Sentry issues" (or a similar prompt), the agent queries Sentry via the Model Context Protocol (MCP) and receives the injected event.
The agent interprets the attacker's content as legitimate diagnostic resolution steps and executes the attacker-controlled code with the developer's privileges.

Why Sentry's architecture matters

The researchers say the root cause is an architectural gap at the intersection of Sentry's event ingestion — which "accepts arbitrary payloads from anyone with the DSN" — and the Sentry MCP server, which returns that data to AI agents as trusted system output. "The attack exploits a critical architectural flaw at the intersection of Sentry's event ingestion (which accepts arbitrary payloads from anyone with the DSN) and the Sentry MCP server (which returns this data to AI agents as trusted system output)," Ron Bobrov, Barak Sternberg, and Nevo Poran wrote.

Because the AI agent cannot distinguish between an event generated by an actual application crash and one injected by an attacker, the MCP pathway becomes a vector for arbitrary code execution. Tenet emphasizes that the injected content is formatted to appear indistinguishable from normal Sentry guidance, undermining the agent's ability to treat it as untrusted input.

Scope and impact: measurable exposure and test results

Tenet reports it found at least 2,388 organizations exposed with valid, injectable DSNs. In controlled testing against more than 100 organizations, the company says it achieved an 85% exploitation success rate against injected errors "across some of the most widely used AI coding assistants," naming Claude Code and Cursor as examples of agents that can be induced to run attacker-controlled code.

The researchers spell out what a successful attack can disclose and control: environment variables, Git credentials, private repository URLs, and developer identities. Tenet also notes the attack bypasses conventional defensive layers because it involves no malicious binary or network intrusion: "The attack bypasses EDR, WAF, IAM, VPN, Cloudflare, and firewalls - because there is nothing malicious to detect. Every action in the chain is authorized."

Sentry's response and the partial mitigation

According to Tenet's account, Sentry acknowledged the issue but declined a comprehensive fix, stating the problem is "technically not defensible." Instead, Sentry activated a global content filter that blocks a "specific payload string." Tenet describes that step as limited: the researchers emphasize the underlying trust model between MCP and the agent remains unchanged, and that carefully crafted markdown can still be used to disguise malicious instructions as legitimate resolutions.

What this means for technologists, enterprises, and AI coding-agent vendors

Technologists and security teams: Developers who rely on AI agents for diagnosing and fixing errors must treat MCP-returned content as potentially untrusted input. Tenet's findings show an attacker can execute code without touching target infrastructure, meaning conventional perimeter controls may not detect the attack.
Enterprises and procurement leaders: Organizations embedding Sentry DSNs in public assets may have inadvertently exposed writable endpoints; Tenet found thousands of exposed DSNs and high exploitation rates in tests, calling for review of where DSNs are published and how external data is consumed by automation tools.
AI coding-agent vendors and platform owners: Because the attack uses the Model Context Protocol to deliver attacker-crafted events, vendors that integrate MCP feeds into agent contexts will need to reassess how agents validate or sandbox instructions that appear to come from trusted system templates.

Tenet's research reframes a familiar security maxim: trust the source of the instruction, not just its form. In this case, the instructions arrive as ordinary error-resolution guidance, but Tenet's tests show they can be weaponized to run under a developer's own privileges and extract sensitive credentials. Sentry's content-filter response addresses a single payload string while leaving the broader MCP trust model intact — a choice that the researchers describe as declining a full technical fix. The question Tenet leaves visible in its findings is stark and practical: as enterprises push more coding work through automated agents, the agents themselves become an attack surface that can be turned against the very developers who rely on them.

Original story at The Hacker News