North Korea-linked Backdoor Exploits AI Triage Tools

"Anyone building such tooling should treat the contents of the samples they triage as adversarial input, never as instructions, and be prepared to keep hostile content out of the model entirely," SentinelLabs wrote.

The prompt injection that targeted AI-assisted triage

Researchers at SentinelLabs — the research arm of SentinelOne — found a macOS backdoor that hides a cascade of fabricated system messages designed not to fool a sandbox but to manipulate the analyst's AI tools. The implant, tracked as macOS.Gaslight and attributed with high confidence to North Korean activity, carries a Markdown-fenced block containing 38 fake system messages. Those messages mimic the internal scaffolding of an AI triage tool and warn of token expiry, memory and disk errors, repeated failures and bogus injection flaws. SentinelLabs says the clear aim is to push an AI agent toward aborting or refusing its analysis.

macOS.Gaslight: a backdoor and stealer built in Rust

Behind the injection is a full-featured infostealer and backdoor implemented in Rust. SentinelLabs reports the implant provides an operator with an interactive shell and a broad data-collection capability: it can extract browser data from Chrome, Brave, Firefox and Safari, pull terminal histories, enumerate installed applications and retrieve a copy of the macOS login keychain. Much of the collection flow can be run through a Python module the malware stages on demand.

Evasion techniques: Telegram Bot API, certificate pinning, and token scrubbing

The malware's command channel uses Telegram's Bot API, with traffic encrypted and protected by certificate pinning to frustrate network inspection. SentinelLabs highlighted two novel touches: the implant can fetch a standalone Python interpreter from a public open-source project at runtime, and it is built to scrub its Telegram bot token from logs or crash output, removing a common detection clue defenders use to trace and block malicious bots. Apple’s XProtect also flagged the file under a signature family SentinelLabs has tied to North Korean operators — a link the researchers used in their attribution analysis.

Stacked injections and the evolution of adversarial samples

Prompt-injection tricks against analyst tooling are not entirely new, the researchers note. Earlier versions relied on a single injected block; SentinelLabs cites prior work by Check Point and others dating back to 2025. What sets macOS.Gaslight apart is the stacking of 38 fabricated messages into a cascade — an escalation intended to increase the chance that an AI-assisted pipeline will misinterpret or reject the sample. SentinelLabs flagged this escalation as the part of the implant's tradecraft that “stood out,” even as most other techniques were familiar.

What this means for malware analysts, enterprise defenders, and open-source maintainers

Malware analysts and AI-assisted triage tool builders: SentinelLabs' central admonition applies directly: treat the contents of samples as adversarial input and never as instructions. The macOS.Gaslight sample demonstrates that hostile actors will target the tooling — not only the environment — and that analysis pipelines should be designed to strip or isolate model-facing content.
Enterprise defenders and incident responders: The combination of Telegram Bot API usage, certificate pinning, and token-scrubbing removes common forensic breadcrumbs. Defenders will need to watch for behavioral indicators — the data-collection patterns for browsers, terminal histories and the macOS keychain — rather than relying solely on network signatures or exposed bot tokens.
Open-source maintainers: The implant's ability to pull a standalone Python interpreter from a public project at runtime raises a risk that legitimate distributions can be requisitioned by malware as runtime dependencies. Maintainers and downstream users should be aware of how publicly hosted artifacts might be used by threat actors.

SentinelLabs closed with a pointed forecast: "As LLM-assisted analysis becomes routine, defenders should expect more samples built to exploit it." The macOS.Gaslight case is an early, explicit reminder that attackers will adapt their tradecraft to the tools defenders adopt — and that defenses must treat analysis inputs as potentially hostile rather than merely informational.

Original story