vLLM CVSS 9.8 Flaw Lets Attackers Own AI Servers via Video
CVE-2026-22778 chains a heap leak and buffer overflow in vLLM's video processing to achieve full RCE on AI inference servers. Patch to 0.14.1 now.
A critical remote code execution vulnerability in vLLM, the open-source inference engine powering millions of AI deployments, gives attackers a direct path from a malicious video URL to full server compromise. CVE-2026-22778 carries a CVSS score of 9.8 and affects versions 0.8.3 through 0.14.0—a span covering roughly two years of releases.
The flaw is especially dangerous because out-of-the-box vLLM installations don't require authentication on API endpoints. Any attacker with network access can send a single crafted request and take over the underlying server.
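To see why that matters, here is a sketch of the shape such a request could take. The host, model name, and URL are placeholders, and the exact multimodal content schema can vary across vLLM versions; the point is that an OpenAI-compatible chat request can direct the server to fetch and decode attacker-controlled media, with no credentials required in a default deployment.

```python
import json

# Hypothetical payload for vLLM's OpenAI-compatible chat endpoint
# (/v1/chat/completions). Model name and URL are illustrative only.
payload = {
    "model": "example-video-capable-model",
    "messages": [{
        "role": "user",
        "content": [
            {"type": "text", "text": "Describe this clip."},
            # Remote media URL: the server fetches and decodes it itself,
            # which is what puts the vulnerable decoder code in reach.
            {"type": "video_url",
             "video_url": {"url": "http://attacker.example/clip.mp4"}},
        ],
    }],
}

body = json.dumps(payload)
```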
How the Exploit Chain Works
This isn't a single bug. Researchers at OX Security documented a two-stage exploit chain that defeats modern memory protections:
Stage 1 — Information Leak. When vLLM processes an invalid image, the Pillow (PIL) imaging library raises an error whose message includes a raw heap memory address. That leak reduces the effective ASLR search space from roughly 4 billion possible address combinations to about 8 guesses, few enough to brute-force reliably.
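The mechanics of the leak are easy to reproduce with the standard library alone: a Python object's repr embeds its heap address, and Pillow's "cannot identify image file" error interpolates the repr of the file object it was handed. The entropy figures below mirror the roughly 4-billion-to-8 reduction described above and are illustrative, not exact ASLR internals:

```python
import io
import re

# A Python repr embeds the object's heap address, e.g.
# "<_io.BytesIO object at 0x7f3a2c1b4d60>". Pillow's error message for an
# unreadable image includes exactly this repr, handing the address back
# to whoever sent the request.
buf = io.BytesIO(b"\x00definitely-not-an-image")
leaked = re.search(r"0x[0-9a-f]+", repr(buf)).group(0)
heap_addr = int(leaked, 16)

# Illustrative arithmetic for the entropy reduction: ~32 bits of
# uncertainty (about 4.3 billion candidates) collapses to ~3 bits
# (8 guesses) once a nearby heap pointer is known. Exact bit counts
# depend on the platform's ASLR implementation.
guesses_without_leak = 2 ** 32   # 4294967296
guesses_with_leak = 2 ** 3       # 8
```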
Stage 2 — Heap Buffer Overflow. vLLM bundles OpenCV, which in turn ships FFmpeg 5.1.x. The JPEG2000 decoder in that FFmpeg version trusts the image's cdef (channel definition) box without validating buffer sizes. An attacker constructs a video from JPEG2000 frames where the Y-channel data is significantly larger than the U/V buffers. The decoder writes Y data into the smaller U buffer, overflowing into adjacent heap memory.
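A minimal sketch of that size mismatch, using hypothetical plane dimensions (the real decoder operates on C buffers, but the arithmetic is the same): with 4:2:0 chroma subsampling, the U and V planes are a quarter the size of the Y plane, so routing Y-sized data into a chroma-sized buffer without a capacity check writes far past the end of the allocation.

```python
# Hypothetical frame dimensions; 4:2:0 subsampling halves chroma
# width and height, so each chroma plane is 1/4 the luma size.
width, height = 64, 64
y_plane_size = width * height                    # 4096-byte luma plane
u_buffer_size = (width // 2) * (height // 2)     # 1024-byte chroma buffer

# A crafted cdef box labels Y-sized data as belonging to the U channel.
crafted_channel_data = b"\x41" * y_plane_size

# Vulnerable pattern: copy into the U buffer with no bounds check,
# spilling the excess into adjacent heap allocations.
overflow = len(crafted_channel_data) - u_buffer_size   # bytes past the end

# Patched pattern: validate the claimed channel size against the
# destination's capacity before copying anything.
def safe_to_copy(data: bytes, capacity: int) -> bool:
    return len(data) <= capacity
```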
From there, it's textbook heap exploitation—overwrite function pointers, redirect execution to system(), and run arbitrary commands as the vLLM process user.
Who's at Risk
vLLM reportedly exceeds three million downloads per month. Organizations running multimodal models with video support enabled are the primary targets, but the vulnerability exists in any vLLM installation that processes image or video inputs.
Internet-facing deployments face the highest risk. And in clustered GPU environments—common for large-scale inference—compromising one node can open lateral movement paths across the entire cluster.
The vulnerability sits in a dependency chain (vLLM → OpenCV → FFmpeg), which means organizations that don't track transitive dependencies may not realize they're exposed. This kind of supply chain depth is a recurring pattern in AI framework vulnerabilities.
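As a starting point for auditing that chain, Python's importlib.metadata can report what an installed package declares as direct dependencies. Note the limitation this sketch itself demonstrates: statically bundled libraries, like the FFmpeg inside OpenCV wheels, never appear in Python package metadata at all, which is exactly how they get missed; full transitive visibility needs a tool like pipdeptree or an SBOM.

```python
from importlib import metadata

def declared_requirements(package: str) -> list[str]:
    """Return a package's declared direct dependencies.

    Only surfaces the first hop of the chain; binaries bundled inside a
    wheel (e.g. FFmpeg inside OpenCV) are invisible to this metadata.
    """
    try:
        return metadata.requires(package) or []
    except metadata.PackageNotFoundError:
        return [f"{package}: not installed"]
```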
What to Do Now
- Upgrade to vLLM 0.14.1, which bundles a patched OpenCV release addressing the JPEG2000 decoder flaw
- Disable video model functionality in production if upgrading isn't immediately feasible
- Put authentication in front of vLLM API endpoints — this should have been table stakes from day one, but many deployments skip it
- Audit network exposure — vLLM endpoints should never be directly internet-facing without a reverse proxy and access controls
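As a quick triage helper for the first step, a version check against the affected 0.8.3–0.14.0 range can be sketched as below. This assumes plain three-part version strings; pre-release suffixes like rc tags would need a real parser such as packaging.version.

```python
AFFECTED_MIN = (0, 8, 3)
PATCHED = (0, 14, 1)

def parse(version: str) -> tuple[int, ...]:
    # Assumes plain "X.Y.Z" strings; use packaging.version for anything fancier.
    return tuple(int(part) for part in version.split(".")[:3])

def is_affected(version: str) -> bool:
    """True if this vLLM version falls in the vulnerable range."""
    return AFFECTED_MIN <= parse(version) < PATCHED
```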
Why This Matters
AI infrastructure is becoming the new attack surface, and the security posture of most ML serving frameworks hasn't caught up. vLLM is used everywhere from startup prototypes to enterprise production systems, and its default "open to the network" configuration means the blast radius of a single CVE can be enormous.
This vulnerability also highlights a growing concern: the dependency chains in AI frameworks are deep and complex. vLLM itself wasn't the buggy code—OpenCV's bundled FFmpeg was. But vLLM users bear the consequences. As organizations race to deploy AI systems at scale, the security of inference infrastructure deserves the same scrutiny we give to web application servers and databases.
We've seen similar issues with n8n's sandbox escape flaws and other workflow automation tools. The pattern is the same: powerful execution environments with insufficient isolation, exposed to the network by default.
For teams running AI workloads, the message is clear: treat your inference servers like the critical infrastructure they are.