“For the first time, GTIG has identified a threat actor using a zero-day exploit that we believe was developed with AI.” That sentence, lifted from Google Threat Intelligence Group’s (GTIG) May 2026 findings, captures the report’s blunt assertion: generative models are no longer experimental tools in adversary toolkits — they are a force multiplier across exploit development, autonomous malware, supply chain subversion, and scalable access to premium LLM services.
Zero-day exploit linked to AI-assisted development
GTIG describes a cyber crime campaign in which collaborators planned a mass exploitation operation using a zero-day embedded in a Python script that bypassed two‑factor authentication (2FA) on a widely used open-source, web-based system administration tool. GTIG worked with the affected vendor to responsibly disclose the flaw and says the group likely leveraged an AI model to discover and weaponize the vulnerability. The script included abundant educational docstrings, a hallucinated CVSS score, and a textbook Pythonic structure, all artifacts GTIG links to AI model output. GTIG notes the flaw was a high‑level semantic logic error (a hardcoded trust assumption), a class of bug that frontier LLMs are increasingly able to surface even when traditional fuzzers and static analysis miss it.
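To make the bug class concrete, here is a minimal Python sketch of a hardcoded trust assumption in a 2FA check. It is not the vulnerability GTIG describes, whose details are not given here. The function names, the X-Internal-Client header, and the request shape are all hypothetical and chosen only for illustration.

```python
import hmac

# Hypothetical illustration only: the names, the header, and the request shape
# are assumptions, not details of the vulnerability GTIG reported.


def verify_2fa_flawed(headers: dict, submitted_code: str, expected_code: str) -> bool:
    """Flawed check: trusts a client-controlled header to skip the second factor."""
    # Hardcoded trust assumption: "internal" clients never need 2FA.
    # Any caller can set this header, so the check can be bypassed.
    if headers.get("X-Internal-Client") == "true":
        return True
    return hmac.compare_digest(submitted_code.encode(), expected_code.encode())


def verify_2fa_fixed(headers: dict, submitted_code: str, expected_code: str) -> bool:
    """Safer check: the second factor is always verified; headers grant nothing."""
    return hmac.compare_digest(submitted_code.encode(), expected_code.encode())
```

Every line of the flawed version is syntactically valid and unremarkable in isolation, which is one plausible reason this kind of error slips past fuzzers and pattern-based static analysis: the defect lies in what the code assumes, not in how it is written.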
PROMPTSPY: an autonomous, Gemini‑integrated Android backdoor
GTIG’s analysis of PROMPTSPY — first publicly identified by ESET — finds the Android backdoor implements an autonomous agent, “GeminiAutomationAgent,” that serializes the device’s UI via the Accessibility API, sends it to the gemini-2.5-flash-lite model, and receives structured JSON responses instructing precise UI actions (CLICK, SWIPE) with spatial coordinates. PROMPTSPY’s design includes anti‑hallucination “Core Judgment Rules,” a benign persona to bypass filters, and the ability to capture biometric gestures for replay. For persistence, PROMPTSPY can overlay and neutralize an uninstall button, use Firebase Cloud Messaging to relaunch, and dynamically rotate C2 infrastructure — including Gemini API keys — at runtime. Google reports no PROMPTSPY apps on Google Play and says Play Protect defends known variants; Google also disabled assets associated with this activity.
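For analysts triaging captured traffic, the structured responses described above lend themselves to simple schema checks. The sketch below is an assumption-heavy illustration: the report names the CLICK and SWIPE actions and says coordinates are included, but the JSON field names used here ("action", "coordinates") are hypothetical, not taken from the malware.

```python
import json

KNOWN_ACTIONS = {"CLICK", "SWIPE"}  # the two action types named in the report


def parse_ui_action(raw: str):
    """Parse one captured message; return (action, coords) or None if it does not fit."""
    try:
        msg = json.loads(raw)
    except json.JSONDecodeError:
        return None
    if not isinstance(msg, dict):
        return None
    action = msg.get("action")        # hypothetical field name
    coords = msg.get("coordinates")   # hypothetical field name
    if not isinstance(action, str) or action not in KNOWN_ACTIONS:
        return None
    if not isinstance(coords, list):
        return None
    if not all(isinstance(c, (int, float)) for c in coords):
        return None
    return action, tuple(coords)
```

A check like this would only be a first-pass filter; real samples would need the field names and value ranges confirmed against actual captures.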
AI‑augmented obfuscation and polymorphic tooling
GTIG catalogs multiple malware families that leverage LLMs for evasion: PROMPTFLUX (dynamic code modification), HONESTCUE (just‑in‑time VBScript obfuscation via Gemini), and decoy‑heavy families CANFAIL and LONGSTREAM. Analysis of CANFAIL shows LLM‑generated developer comments describing inert, filler blocks; LONGSTREAM includes repeated, benign‑looking queries (32 duplicate daylight‑saving checks) to camouflage downloader behavior. GTIG also observed PRC‑nexus actor APT27 use Gemini to accelerate development of a fleet management tool for an operational relay box (ORB) network — the tool contains a hardcoded maxHops parameter set to 3 and lists MOBILE_WIFI and ROUTER device types, suggesting multi‑hop residential IP relay capabilities.
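Padding of the kind noted in LONGSTREAM, where the same benign check is repeated dozens of times, can be surfaced with a very simple heuristic. The following is a minimal sketch, not GTIG's detection logic: the threshold of 10 is arbitrary, and the statement names in the sample are invented for the example.

```python
from collections import Counter


def repeated_statements(source: str, threshold: int = 10) -> dict:
    """Return whitespace-normalized, lowercased lines that occur at least `threshold` times."""
    counts = Counter(
        " ".join(line.split()).lower()
        for line in source.splitlines()
        if line.strip()
    )
    return {line: n for line, n in counts.items() if n >= threshold}


# Invented sample: 32 identical benign-looking checks around one real action.
sample = "\n".join(["IsDaylightSavingTime()"] * 32 + ["DownloadPayload()"])
print(repeated_statements(sample))  # {'isdaylightsavingtime()': 32}
```

Exact-match counting is easy to evade with trivial variation, so in practice it would more likely serve as a triage signal than as a standalone detection.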
Supply chain compromises: TeamPCP (UNC6780) and AI tooling dependencies
GTIG and Mandiant link the cyber crime actor TeamPCP (aka UNC6780) to late‑March 2026 supply chain compromises of multiple GitHub repositories and GitHub Actions, including components tied to Trivy, Checkmarx, LiteLLM, and BerriAI. TeamPCP obtained initial access via compromised PyPI packages and malicious pull requests, then embedded the SANDCLOCK credential stealer to extract AWS keys and GitHub tokens from build environments. GTIG highlights LiteLLM, an AI gateway utility for integrating LLM providers, as a high‑impact case: compromised AI dependencies can expose API secrets and let attackers use internal AI systems to scale reconnaissance, exfiltration, or pivots into broader infrastructure.
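Because the stolen material here is AWS keys and GitHub tokens sitting in build environments, one modest defensive step is to scan build logs and configuration for credential-shaped strings before they are stored or published. The sketch below assumes two commonly documented formats (AWS access key IDs beginning with AKIA or ASIA, and GitHub tokens with a gh?_ prefix); the patterns are illustrative and not exhaustive, and the function name is hypothetical.

```python
import re

# Illustrative patterns only; real scanners cover many more credential formats.
PATTERNS = {
    "aws_access_key_id": re.compile(r"\b(?:AKIA|ASIA)[0-9A-Z]{16}\b"),
    "github_token": re.compile(r"\bgh[pousr]_[A-Za-z0-9]{36}\b"),
}


def find_secrets(text: str) -> list:
    """Return (kind, redacted_match) pairs for credential-shaped strings in `text`."""
    hits = []
    for kind, pattern in PATTERNS.items():
        for match in pattern.finditer(text):
            token = match.group(0)
            # Redact so the scanner's own output does not leak the secret.
            hits.append((kind, token[:4] + "..." + token[-2:]))
    return hits
```

Pattern matching finds secrets that are already exposed; it does not stop a stealer running inside the build, so it complements rather than replaces short-lived credentials and least-privilege tokens.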
Obfuscated, scalable access to premium LLMs: account pooling and proxy tooling
GTIG documents an emerging underground ecosystem that industrializes LLM abuse: API gateways such as Claude‑Relay‑Service and CLIProxyAPI, account‑provisioning scripts that automate registration and cancellation (bypassing CAPTCHA and SMS checks), anti‑detect browsers (Roxy Browser), and management centers that aggregate keys. GTIG observed PRC‑nexus cluster UNC6201 using a public Python script to automate premium account registration and immediate cancellation; UNC5673 also leveraged account‑pooling tools and targeted government sectors in South and Southeast Asia. GTIG recommends that LLM providers analyze network signals from API aggregators to enable disruption and account action, a technical mitigation grounded in the observed tooling.
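GTIG does not spell out which network signals providers should examine. As one assumed example, a relay or pooling service tends to present many distinct API keys from the same source address, which a provider could flag as follows. The log shape, the function name, and the threshold of 20 keys are hypothetical.

```python
from collections import defaultdict


def flag_aggregator_ips(requests, max_keys_per_ip: int = 20) -> dict:
    """Return {source_ip: distinct_key_count} for IPs exceeding the key threshold.

    `requests` is an iterable of (source_ip, api_key_id) pairs, an assumed log shape.
    """
    keys_by_ip = defaultdict(set)
    for source_ip, api_key_id in requests:
        keys_by_ip[source_ip].add(api_key_id)
    return {
        ip: len(keys)
        for ip, keys in keys_by_ip.items()
        if len(keys) > max_keys_per_ip
    }


# Example: one address cycling through 25 keys, another using a single key heavily.
reqs = [("203.0.113.7", f"key-{i}") for i in range(25)] + [("198.51.100.2", "key-a")] * 50
print(flag_aggregator_ips(reqs))  # {'203.0.113.7': 25}
```

Shared corporate egress addresses and NAT can produce the same pattern legitimately, so a signal like this would need to be combined with others before any account action.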
What this means for technologists, policymakers, and open‑source maintainers
- Technologists and security teams: expect AI to augment both discovery and evasion. GTIG’s findings show LLMs surfacing semantic logic bugs and simultaneously supporting polymorphic, agentic malware like PROMPTSPY — defenders must monitor AI‑related telemetry and control API key sprawl.
- Policymakers and regulators: supply chain incidents tied to LiteLLM and GitHub repositories demonstrate that AI‑adjacent dependencies raise systemic risk; oversight and guidance for secure AI development — such as the Secure AI Framework (SAIF) referenced by GTIG — will be relevant to procurement and critical‑systems policy.
- Open‑source maintainers and marketplaces: the OpenClaw/VirusTotal example — automated scanning integrated into ClawHub — and GitHub compromises by TeamPCP underscore the need for embedded code scanning, provenance controls, and build‑time secret hygiene in public repositories and package registries.
GTIG’s report paints a landscape of rapid adaptation: adversaries are using LLMs to find nuanced logic flaws, generate obfuscation and decoy code, automate account‑scale access to premium models, and embed autonomy into malware. Google counters with tools and processes — Big Sleep for vulnerability discovery, CodeMender for automated patching, SAIF guidance, and marketplace scanning partnerships — but the report makes plain that defenders and maintainers must move at AI speed to match adversaries exploiting the same capabilities.




