PROBABLYPWNED
Threat Intelligence | July 5, 2026 | 3 min read

Attackers Turn Stolen AI Compute Into Autonomous Hacking Tools

LLMjacking evolves as threat actors hijack exposed Ollama servers to power multi-stage VAPT frameworks that scan, exploit, and compromise targets automatically.

Alex Kowalski

On June 12, the Sysdig Threat Research Team observed a threat actor using a misconfigured Ollama server as the reasoning engine for an automated offensive security tool. The attacker wasn't stealing compute for cryptomining or API resale. They were building an autonomous penetration testing framework that scans targets, identifies vulnerabilities, generates exploits, and attempts command execution without human guidance.

This marks a significant evolution in LLMjacking, the practice of hijacking cloud AI resources. Instead of treating stolen compute as a commodity to sell, attackers are now weaponizing it directly.

How the VAPT Framework Works

The tool, which the attacker called VAPT (Vulnerability Assessment and Penetration Testing), chains together multiple stages with the Ollama model making decisions at each step:

  1. Service fingerprinting — Identifies what's running on target systems
  2. Vulnerability matching — Correlates services against known CVE databases
  3. Web reconnaissance — Maps application structure and endpoints
  4. PoC generation — Creates proof-of-concept exploits for identified vulnerabilities
  5. SQL injection crafting — Builds database attack payloads
  6. Secret extraction — Hunts for credentials and API keys
  7. Privilege escalation — Attempts to gain elevated access

The framework uses a distinctive RCE verification pattern: echo VAPTb3gin; id; echo VAPTfin. If the output contains both markers plus valid uid output, the tool knows command execution succeeded.
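For defenders, those markers double as a detection signature. The following is a minimal sketch of how a log or response scanner might look for them. Only the two marker strings come from the reporting above; the function names and the regular expression for `id` output are illustrative assumptions, not part of the attacker's tool or of any Sysdig rule.

```python
import re

# Marker strings quoted in the article; everything else here is illustrative.
BEGIN_MARKER = "VAPTb3gin"
END_MARKER = "VAPTfin"

# Typical `id` output starts with "uid=<number>(<name>)".
UID_PATTERN = re.compile(r"uid=\d+\([^)]*\)")


def contains_probe_command(log_line: str) -> bool:
    """True if a log line carries both markers (e.g. a request or shell audit log)."""
    return BEGIN_MARKER in log_line and END_MARKER in log_line


def shows_successful_execution(output: str) -> bool:
    """True if uid output appears between the two markers, in that order."""
    start = output.find(BEGIN_MARKER)
    if start == -1:
        return False
    end = output.find(END_MARKER, start + len(BEGIN_MARKER))
    if end == -1:
        return False
    between = output[start + len(BEGIN_MARKER):end]
    return UID_PATTERN.search(between) is not None
```

A hit from `contains_probe_command` suggests a host was probed; a hit from `shows_successful_execution` on captured response data suggests the command actually ran and the host should be treated as compromised.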

Why Ollama Is the Target

Ollama doesn't require authentication by default. It listens on localhost out of the box, but any server rebound to a public interface on port 11434 will respond to inference requests from anyone. Sysdig previously identified over 175,000 publicly exposed Ollama instances across 130 countries, and many remain unprotected.
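Whether an instance is reachable usually comes down to the bind address, which Ollama reads from the `OLLAMA_HOST` environment variable. As a rough sketch, a configuration audit could classify that value as loopback-only or potentially exposed. The helper names below are hypothetical, and the parsing covers only a few common value shapes (`host`, `host:port`, `scheme://host:port`); it is an assumption about typical settings, not a reimplementation of Ollama's own parsing.

```python
import ipaddress
from urllib.parse import urlsplit

DEFAULT_BIND = "127.0.0.1:11434"  # Ollama's default when OLLAMA_HOST is unset


def bind_host(ollama_host: str) -> str:
    """Extract the host part from values like '0.0.0.0', 'host:port' or 'http://host:port'."""
    value = ollama_host.strip() or DEFAULT_BIND
    if "://" not in value:
        value = "//" + value  # let urlsplit treat the value as a network location
    return urlsplit(value).hostname or ""


def is_loopback_only(ollama_host: str) -> bool:
    """True if the configured bind address is reachable only from the local machine."""
    host = bind_host(ollama_host)
    if host == "localhost":
        return True
    try:
        return ipaddress.ip_address(host).is_loopback
    except ValueError:
        # Unknown hostname or empty host: treat conservatively as potentially exposed.
        return False
```

In practice this might be fed with `os.environ.get("OLLAMA_HOST", "")` or with values pulled from service unit files and container definitions. A `False` result does not prove internet exposure, since firewalls and security groups still apply, but it flags hosts worth a closer look.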

The compute itself isn't cheap. Running inference workloads for offensive operations requires significant GPU resources. By hijacking existing infrastructure, attackers get free access to capabilities that would otherwise cost thousands monthly.

From Commodity to Weapon

Earlier LLMjacking campaigns focused on reselling stolen API access or mining cryptocurrency with hijacked GPUs. The VAPT framework represents something more concerning: threat actors are treating AI infrastructure as offensive capability rather than just valuable real estate.

The tool exposes a limited surface to the model—request primitives, payload builders, and sweep functions—but that surface is enough to automate attack chains that previously required skilled operators. The model handles the judgment calls: which vulnerabilities to prioritize, how to craft payloads, when to escalate.

Securing Ollama Deployments

Organizations running Ollama should:

  • Never expose port 11434 to the internet — Bind to localhost or internal networks only
  • Add authentication — Use a reverse proxy with auth in front of the API
  • Monitor inference patterns — Watch for unusual prompt volumes or offensive tool signatures
  • Audit public cloud instances — Check whether GPU instances are externally accessible
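To support the audit step, a team could probe its own hosts from an external vantage point and see whether the API answers without credentials. The sketch below sends a single GET to `/api/tags`, the Ollama endpoint that lists local models. The function names are hypothetical, and it should only be run against infrastructure you own or are authorized to test.

```python
import http.client
import json
import urllib.error
import urllib.request


def looks_like_ollama_tags(body: bytes) -> bool:
    """True if a body parses as JSON with a 'models' list, the shape /api/tags returns."""
    try:
        data = json.loads(body)
    except ValueError:  # covers invalid JSON and undecodable bytes
        return False
    return isinstance(data, dict) and isinstance(data.get("models"), list)


def answers_unauthenticated(host: str, port: int = 11434, timeout: float = 3.0) -> bool:
    """Send one unauthenticated GET to /api/tags and report whether Ollama answered."""
    url = f"http://{host}:{port}/api/tags"
    try:
        with urllib.request.urlopen(url, timeout=timeout) as response:
            return response.status == 200 and looks_like_ollama_tags(response.read())
    except (urllib.error.URLError, http.client.HTTPException, OSError, ValueError):
        # Refused, timed out, rejected with an auth error, or not HTTP at all.
        return False
```

A `True` result from outside the network means the instance accepts anonymous requests and should be moved behind a private interface or an authenticating proxy. A reverse proxy that returns 401 or 403 surfaces here as `False`, which is the desired outcome.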

The broader concern is the trajectory. If compromised AI infrastructure becomes standard offensive tooling, the bar for launching sophisticated attacks drops considerably. The AI-powered ransomware operations we've seen this year may be early examples of a much larger pattern.

Security teams should consider exposed AI infrastructure a high-priority asset, comparable to edge devices or identity systems. The consequences of compromise are no longer limited to compute theft.
