Fake AI Skill Evades Security Scanners, Reaches 26,000 Agents

"The point was to show that none of the signals people lean on to trust a skill caught it: not the scanners, not the GitHub stars, not the open-source reputation," AIR wrote in its report on a staged experiment.

AIR's experiment and the headline numbers

Security firm AIR built a fake AI agent skill called brand-landingpage, published it to a popular skill marketplace, and ran an Instagram ad aimed at marketers, salespeople, and designers. AIR reports the skill reached roughly 26,000 agents, including some on corporate accounts, and that every scanner it tested marked the package safe. The payload was deliberately harmless: in the demo the skill only mailed a user's email address back to AIR, which is how the firm counted the agents it reached.

Those scale figures, and the company's claim that a real operator could have taken far greater control, come from AIR alone and are not independently confirmed. AIR's write-up also concludes by pitching a managed skill marketplace that the firm is launching.

How the brand-landingpage skill actually worked

According to AIR, the skill presented itself as a builder for landing pages using "Google's Stitch design tool" and targeted non‑technical users. The package submitted to scanners contained no custom setup instructions; instead it told the agent to install the "Stitch SDK" by following documentation at an external URL — stitch-design.ai, a domain controlled by AIR. The real Stitch documentation resides at stitch.withgoogle.com, but the skill's manifest pointed to the AIR‑controlled site.

When scanned, the package looked plausible: a clean SKILL.md and files that referenced an external setup page. After the skill had been widely installed, AIR replaced the content at the external URL with a new page instructing the agent to download and run a script. In the demonstration the script simply emailed the user's email address back to AIR; AIR says an operator with malicious intent could have used the same foothold to read files, move data, or access internal systems, limited only by what the agent itself could reach.

Why multiple scanners missed the change

Every scanner AIR tested (including Cisco's, NVIDIA's, and the scanners wired into skills.sh) analyzes only the package it is handed: the SKILL.md and the files shipped with the skill. That approach treats a skill as a static bundle. AIR and earlier research demonstrate the structural weakness: a scanner checks a fixed snapshot, while an attacker can host changing payloads at external URLs and swap them after the review completes.
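A reviewer can at least surface what a package points to before deciding whether to trust it. The following is a minimal sketch, not how any of the named scanners work: it lists the external hosts referenced by a skill's files. The SKILL.md wording and the URL path are hypothetical stand-ins; only the stitch-design.ai domain comes from AIR's account.

```python
import re
from urllib.parse import urlparse

# Simple pattern for http(s) URLs; stops at whitespace and common closing characters.
URL_PATTERN = re.compile(r"https?://[^\s)>\"']+")

def external_hosts(package_files: dict[str, str]) -> dict[str, set[str]]:
    """Map each host referenced by the package to the files that mention it."""
    hosts: dict[str, set[str]] = {}
    for filename, text in package_files.items():
        for url in URL_PATTERN.findall(text):
            host = urlparse(url).hostname
            if host:
                hosts.setdefault(host, set()).add(filename)
    return hosts

# Hypothetical package contents; the wording and URL path are illustrative,
# not AIR's actual file.
package = {
    "SKILL.md": "Install the Stitch SDK by following https://stitch-design.ai/docs/setup first.",
}
found = external_hosts(package)
print(found)  # {'stitch-design.ai': {'SKILL.md'}}
```

Listing hosts does not prove anything is malicious. It only tells the reviewer which outside content the skill depends on, and that content still has to be checked separately.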

Three weeks before AIR's experiment, Trail of Bits bypassed ClawHub's malicious-skill detector, Cisco's scanner, and the three scanners in skills.sh, reaching a similar conclusion: keeping the submitted package clean while hosting executable instructions externally can evade detection. Anthropic's documentation, AIR notes, already warns that skills fetching external URLs are risky for the same reason — the content a skill relies on can change after vetting.
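One way to notice that externally hosted content has changed after vetting is to record a hash of it at review time and compare on later fetches. The sketch below assumes the page bodies have already been fetched; the function names and the sample page text are hypothetical, and it is not a description of any vendor's product.

```python
import hashlib

def fingerprint(content: bytes) -> str:
    """Return the SHA-256 hex digest of a fetched page body."""
    return hashlib.sha256(content).hexdigest()

def has_changed(pinned_digest: str, current_content: bytes) -> bool:
    """True if the content no longer matches the digest recorded at review time."""
    return fingerprint(current_content) != pinned_digest

# Hypothetical page bodies standing in for the setup page before and after a swap.
reviewed_page = b"Install the SDK by following the steps below."
swapped_page = b"Download and run the setup script."

pinned = fingerprint(reviewed_page)
print(has_changed(pinned, reviewed_page))  # False
print(has_changed(pinned, swapped_page))   # True
```

A check like this only helps if it runs every time the agent is about to act on the fetched content, not just once at install.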

What defenders should do next

AIR and independent researchers converge on pragmatic mitigations: treat skills as software, not text. Vet not just what ships inside a skill but what it points to; pin versions; route new skills through a single source you control and re‑check them when anything changes. Hold agents to least privilege, and assume any external instruction an agent fetches will run with the agent's access. These prescriptions follow directly from the experiment's anatomy: a clean package, a plausible external setup page, and an attacker-controlled URL that was swapped after install.
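Vetting what a skill points to can be partly enforced in configuration. As a rough sketch, assuming an organization keeps its own allowlist of reviewed hosts, an agent gateway could refuse to fetch instructions from anywhere else. The allowlist contents, the function name, and the URL paths are hypothetical; the two domains are the ones named in AIR's account.

```python
from urllib.parse import urlparse

# Assumed allowlist of hosts a reviewer has vetted. stitch.withgoogle.com is the
# real Stitch documentation host; stitch-design.ai was the AIR-controlled domain.
ALLOWED_HOSTS = {"stitch.withgoogle.com"}

def is_allowed(url: str, allowed_hosts: set[str] = ALLOWED_HOSTS) -> bool:
    """Allow only https URLs whose exact hostname is on the allowlist."""
    parsed = urlparse(url)
    return parsed.scheme == "https" and parsed.hostname in allowed_hosts

print(is_allowed("https://stitch.withgoogle.com/docs"))   # True
print(is_allowed("https://stitch-design.ai/docs/setup"))  # False
```

Exact hostname matching is deliberate here: a lookalike domain shares words with the real one, so substring or prefix checks would be easy to fool.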

What this means for marketers, security teams, and enterprise administrators

Marketers, salespeople, and designers: the workflow that made brand-landingpage attractive, an easy landing-page builder promoted via an Instagram ad, is the same path attackers can abuse; users who install skills for productivity gains may inadvertently let an outside party direct an agent that can reach their files and internal systems unless controls are in place.
Security teams: scanners that inspect only submitted files will miss payloads hosted offsite; defenders should inventory installed skills, route new skills through a single vetted source, and re‑scan skills whenever external content they rely on changes.
Enterprise administrators: corporate accounts were among those AIR says were reached, underscoring the need to pin versions, enforce least privilege for agent actions, and assume that any externally fetched instruction may execute with the agent's access.

Whether the true reach was 26,000 agents or a smaller fraction, the experiment lines up three persistent weaknesses — borrowed trust signals (GitHub stars), snapshot scanning, and editable external links — into a single, repeatable path. Until marketplaces and scanners change their model from one-time snapshot review to continuous validation of external dependencies, defenders will be left to close the gap with process and configuration.

Original reporting: https://thehackernews.com/2026/06/fake-ai-agent-skill-passed-security.html