"AI has changed the economics of exploit development," said Nicole Carignan, senior vice president of security and AI strategy at Darktrace.
Google Threat Intelligence Group: AI-assisted exploit development
Google's Threat Intelligence Group (GTIG) disclosed forensic evidence that a criminal campaign earlier this year deployed a working zero-day whose exploit code was generated with high confidence by an artificial intelligence model. GTIG said the actor "likely leveraged an AI model to support the discovery and weaponization," meaning the model appears to have performed much of the technical work of finding and coding the exploit while humans directed the broader campaign. Google also said it does not believe its own Gemini model was involved in building that particular exploit.
The zero-day: a Python two-factor bypass in an open-source web admin tool
The uncovered vulnerability was a semantic logic flaw: a bypass of two-factor authentication embedded in a Python script that targeted a popular open-source web administration tool. Google worked with the affected vendor to patch the flaw before the mass attack could be launched, though the vendor and tool were not named.
Researchers emphasized why this case was notable: the bug did not stem from memory corruption or improper input handling — classes of error that conventional scanners are built to detect. Instead, the vulnerability arose from a high-level hardcoded trust assumption in application logic. GTIG said the exploit script contained abundant educational comments, a fabricated severity score and a structured, textbook-style Python format that GTIG assessed as highly characteristic of training data used to build large language models.
North Korea's APT45 and China-linked UNC2814: automated probing and coaxing models
The report documents systematic use of AI across multiple offensive operations. Google attributed thousands of automated, repetitive prompts to North Korea's APT45; those prompts were used to systematically analyze known software flaws, validate working exploits and assemble an arsenal of capabilities that would have been impractical to compile without AI. Separately, a China-linked group tracked as UNC2814 attempted to manipulate Gemini by instructing it to "assume the role of a network security expert specializing in embedded devices" — a technique intended to coax the model into providing vulnerability research it would otherwise decline to assist with.
Attackers also experimented with a GitHub repository called "wooyun-legacy" built as an AI code skill plugin that integrated more than 85,000 historical vulnerability cases collected by the Chinese bug bounty platform WooYun (2010–2016). By feeding that distilled knowledge base into models within a single session, the operators effectively trained the model to prioritize the kinds of logic flaws an otherwise general-purpose model might overlook.
Russia-linked CanFail and LongStream: AI used to hide malware
Google identified two Russia-linked malware families deployed against Ukrainian targets that used AI-generated filler code to mask malicious intent. In LongStream, researchers found 32 separate instances of code that repetitively checked the system's daylight saving status — inert queries with no functional purpose but useful as noise to make the malicious code harder to spot. The campaigns show a divergent pattern of AI use: where some actors use models to discover and weaponize vulnerabilities, others use them to obfuscate and evade detection.
PromptSpy, Gemini, and autonomous control of infected devices
Google expanded the technical picture of an Android backdoor known as PromptSpy, first identified by cybersecurity firm Eset. PromptSpy uses Google's Gemini API to control infected devices without human direction. According to Google's analysis, the malware contains an autonomous module that maps the visible layout of a device's screen, sends that layout to Gemini, and receives back precise coordinates and gesture instructions — clicks and swipes — which it then executes to navigate the phone on the attacker's behalf.
PromptSpy can capture biometric login data, such as fingerprint patterns or PIN sequences, and can place an invisible overlay over an uninstall button so a victim's tap is intercepted and the uninstall appears not to work. Its command infrastructure, including Gemini API keys and relay servers, can be updated remotely without redeploying the malware. Google said it disabled the assets associated with PromptSpy and that no apps containing it are on the Google Play Store.
What this means for technologists, policymakers, and end users
- Technologists and security teams: the report spotlights logic- and trust-assumption flaws that conventional scanners miss, and shows actors training or priming models with large vulnerability corpora (for example, the "wooyun-legacy" repository) to find and weaponize such flaws at scale.
- Policymakers and regulators: the ecosystem of proxy relay services, pooled accounts and automated registration pipelines—documented in a March 2026 CISPA Helmholtz study that identified 17 shadow API services—gives attackers persistent, inexpensive access to powerful models and raises data-exfiltration and accountability concerns for model providers and intermediaries.
- End users and enterprise defenders: mobile threats like PromptSpy demonstrate how AI-enabled automation can convert a single compromised device into an autonomously controlled asset capable of bypassing uninstall attempts and harvesting biometric data, underscoring practical escalation risks even when code-signals or crash-based detectors show nothing.
Google's findings present a blunt contrast: attackers are using AI both to lower the technical barriers to discovering exploitable logic flaws and to hide malicious code inside benign-appearing routines, while defenders still rely heavily on tools tuned to traditional errors. As Darktrace's Nicole Carignan put it, AI "industrializes what was previously a high-skill, time-intensive process," making exploit development faster, more repeatable and accessible to a broader range of actors. The central question left by this disclosure is whether defenders and policy frameworks can adapt fast enough to the new economics of exploit development and the shadowed infrastructure that supplies it.




