PROBABLYPWNED
MalwareJune 25, 20264 min read

North Korean Malware Uses Prompt Injection to Evade AI Analysis

SentinelOne discovers Gaslight, a Rust-based macOS backdoor embedding 38 fake system messages designed to crash or confuse AI-powered malware analysis tools.

James Rivera

Malware authors have long targeted security tools. Now they're targeting the AI assistants that security analysts use to study their code.

TL;DR

  • What happened: SentinelOne identified "Gaslight," a North Korean macOS backdoor that embeds prompt injection payloads targeting AI analysis tools
  • Novel technique: 38 fabricated system messages designed to make AI assistants abort or refuse analysis
  • Attribution: High-confidence link to North Korean threat actors

A New Kind of Evasion

Packers, obfuscators, encrypted strings, anti-debugging checks—malware authors have always tried to slow down analysis. SentinelOne's research on Gaslight reveals a new target: the AI tools analysts increasingly rely on to triage samples.

The Rust-based macOS implant contains a 3.5 KB Markdown-formatted blob of hostile data: 38 fabricated "system" messages wrapped in {{DATA}} tokens. These tokens mimic the prompt scaffolding used by LLM-based triage systems, deliberately blurring the boundary between untrusted sample data and trusted instructions.

The injected messages warn of token expiry, memory errors, disk failures, repeated analysis failures, and bogus injection vulnerabilities. The goal: push an AI agent into aborting its analysis or refusing to produce useful results.

Why This Matters

AI-assisted malware analysis has become standard practice. Analysts feed samples to LLMs for initial triage, function naming, string decryption, and capability summaries. These tools accelerate analysis by handling routine tasks, freeing human analysts for deeper investigation.

Gaslight treats that AI assistant as part of the target environment. Just as malware might detect a sandbox and modify its behavior, Gaslight attempts to detect AI analysis and subvert it. The attack doesn't need to be perfect—if it delays analysis or reduces confidence in AI-generated results, it accomplishes its goal.

This builds on the broader trend of AI systems becoming attack targets. The Microsoft 365 Copilot SearchLeak vulnerability showed how AI tools can be manipulated to exfiltrate data. Gaslight demonstrates the reverse: adversarial input designed to degrade AI tool performance.

Technical Capabilities

Beyond the prompt injection novelty, Gaslight packs substantial capability into a single Rust binary:

  • Credential and session-data stealer targeting browser cookies, keychain items, and authentication tokens
  • Interactive shell for arbitrary command execution
  • Self-staged Python collection chain for additional data gathering
  • Telegram-based command and control with hardened communications

The Telegram C2 channel provides resilience against domain takedowns and blends with legitimate Telegram traffic. Combined with the in-memory execution approach seen in other recent malware like Mistic, Gaslight represents a sophisticated, purpose-built espionage tool.

North Korean Attribution

SentinelOne attributes Gaslight to North Korean threat actors with high confidence. The targeting profile—macOS users in cryptocurrency and financial sectors—aligns with known North Korean operational priorities. Previous campaigns from groups like Lazarus have targeted macOS through similar social engineering vectors.

North Korean operations increasingly focus on revenue generation through cryptocurrency theft and ransomware. A macOS infostealer capable of harvesting browser sessions and authentication tokens directly supports those objectives.

Implications for AI Security Tools

The prompt injection technique in Gaslight is crude but effective. More sophisticated variants could:

  • Inject false analysis results rather than simply disrupting analysis
  • Target specific AI models with tailored adversarial prompts
  • Embed different payloads for different AI architectures

Security tool vendors using AI components should consider this attack surface in their threat models. Treating sample content as potentially adversarial to the AI itself—not just to the sandbox—becomes necessary.

Recommended Mitigations

  1. Validate AI analysis - Cross-reference AI-generated results with manual inspection for suspicious samples
  2. Sanitize LLM inputs - Strip or encode content that resembles prompt scaffolding before AI processing
  3. Monitor for prompt injection patterns - Flag samples containing {{DATA}} tokens or similar LLM control structures
  4. Update macOS defenses - Ensure endpoint protection is current and configured for Rust binary analysis
  5. User awareness - Train high-value targets (crypto traders, financial analysts) on macOS-specific threats

Why This Matters

We're entering an era where attackers explicitly model AI tools as part of the defender's toolkit—and build countermeasures accordingly. Gaslight is early and relatively unsophisticated. But it establishes a pattern that more capable adversaries will refine.

Organizations relying on AI-assisted security operations should begin thinking about AI tool resilience as a security requirement, not just an operational efficiency concern. For practical guidance on defending against evolving malware threats, see our malware defense guide.

Frequently Asked Questions

Does this affect all AI analysis tools? The prompt injection technique targets LLM-based tools specifically. Traditional static and dynamic analysis systems that don't use language models aren't affected by this particular evasion technique.

How do I know if a sample contains prompt injection? Look for Markdown formatting, {{DATA}} or similar control tokens, and content that reads like LLM system messages. Gaslight's payload is large (3.5 KB) and stands out during binary analysis.

Related Articles