PROBABLYPWNED
Threat IntelligenceJune 27, 20264 min read

Clean GitHub Repo Tricks AI Coding Agents Into Running Malware

Mozilla's 0DIN researchers demonstrate how innocent-looking repositories can chain three benign components to hijack AI coding assistants and establish reverse shells on developer machines.

Alex Kowalski

A GitHub repository with no visible malware can still compromise your machine. Mozilla's Zero Day Investigative Network (0DIN) has published research demonstrating how AI coding assistants can be manipulated into executing attacker-controlled payloads through a series of indirect steps that bypass traditional code review.

The proof-of-concept attack exploits a blind spot in how autonomous coding agents respond to error messages. It requires no malicious code visible in the repository itself—just three seemingly innocent components that chain together into a reverse shell.

How the Attack Works

The exploit begins with a clean-looking repository containing standard setup instructions. The project includes a Python package intentionally designed to fail on first run, producing an error message that tells users to initialize it with a specific command.

When an AI coding agent like Claude Code encounters this error, it does what it was built to do: fix the problem. The agent executes the suggested initialization command—python3 -m axiom init—believing it to be routine setup procedure.

That command triggers a shell script that queries an attacker-controlled DNS TXT record, retrieves a base64-encoded payload, decodes it, and executes the result via bash. The final payload establishes a reverse shell running with the developer's full privileges.

"Claude Code never decided to open a shell," the 0DIN researchers explained. "It decided to fix an error. The reverse shell is three indirection steps away from anything Claude Code actually evaluated."

The Indirection Problem

Traditional security tooling fails here because every individual component passes inspection. The repository contains no obvious backdoors. The Python package looks normal. The initialization script doesn't contain hardcoded malware. Even the DNS query appears routine.

The malicious behavior only emerges when the components interact—and that interaction happens inside the AI agent's decision-making process, not in any single artifact a security scanner would flag.

This represents a broader class of vulnerabilities in agentic AI systems. Unlike static code, autonomous agents make runtime decisions about what commands to execute. Attackers can craft scenarios where the "correct" decision—fix this error, run this setup command, resolve this dependency—leads to compromise.

The attack shares DNA with prompt injection techniques we've covered targeting AI coding agents, but operates at a different layer. Rather than injecting malicious instructions into the prompt context, it weaponizes the agent's built-in error-resolution behavior.

Why This Matters

AI coding assistants have access to sensitive environments. When a developer points Claude Code, Cursor, or Gemini CLI at a repository, that agent typically inherits access to environment variables, API keys, cloud credentials, and SSH keys stored on the development machine.

A reverse shell with those privileges gives attackers immediate access to production infrastructure, CI/CD pipelines, and source code repositories. It's the kind of foothold that supply chain attackers covet.

The timing of this research coincides with a surge of attacks targeting AI development tools. Earlier this month, the Miasma worm campaign compromised 73 Microsoft GitHub repositories through similar manipulation of AI coding agents, planting malicious configuration files that triggered payload execution when developers opened infected repos.

And this isn't the first time researchers have demonstrated AI agent frameworks exposing remote code execution paths. The attack surface keeps expanding as these tools gain autonomy.

Mitigation

0DIN recommends that AI agents fully disclose the complete execution chain of setup commands before running them. This includes:

  • The immediate command being executed
  • Any scripts that command triggers
  • Any dynamically fetched code those scripts retrieve
  • The final payload's behavior

Currently, most AI coding assistants present only the top-level command to users, obscuring the downstream execution path. That opacity is the core vulnerability.

For development teams adopting AI coding assistants, the research reinforces several defensive measures:

  1. Sandbox AI agents in containers or VMs without access to production credentials
  2. Require explicit approval for shell commands, even "routine" setup tasks
  3. Treat unfamiliar repositories as hostile regardless of how clean the code appears
  4. Monitor DNS queries from development machines for unusual TXT record lookups

The broader lesson is uncomfortable: AI agents are making security decisions on developers' behalf, and those decisions can be influenced through social engineering at the code level. The malware doesn't need to be in the repository. It just needs to be somewhere the agent will fetch it.

Looking Ahead

0DIN has expanded its vulnerability research program to focus specifically on agentic AI systems where prompt injections lead to real-world impact. The consequences documented so far include data exfiltration, code execution, and persistent system compromise.

As AI coding assistants gain adoption—and autonomy—the attack surface will continue growing. Developers who grant these tools shell access are implicitly trusting their judgment about which commands are safe to run. That trust model is now proven exploitable.

The repository doesn't need to contain malware. It just needs to contain a convincing error message.

Related Articles