vLLM CVSS 9.8 Flaw Lets Attackers Own AI Servers via Video
CVE-2026-22778 chains a heap leak and buffer overflow in vLLM's video processing to achieve full RCE on AI inference servers. Patch to 0.14.1 now.
A critical remote code execution vulnerability in vLLM, the open-source inference engine powering millions of AI deployments, gives attackers a direct path from a malicious video URL to full server compromise. CVE-2026-22778 carries a CVSS score of 9.8 and affects versions 0.8.3 through 0.14.0—a span covering roughly two years of releases.
The flaw is especially dangerous because out-of-the-box vLLM installations don't require authentication on API endpoints. Any attacker with network access can send a single crafted request and take over the underlying server.
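To see why that matters, here is a sketch of the shape such a request could take. The host, model name, and URL are placeholders, and the exact multimodal content schema can vary across vLLM versions; the point is that an OpenAI-compatible chat request can direct the server to fetch and decode attacker-controlled media, with no credentials required in a default deployment.

```python
import json

# Hypothetical payload for vLLM's OpenAI-compatible chat endpoint
# (/v1/chat/completions). Model name and URL are illustrative only.
payload = {
    "model": "example-video-capable-model",
    "messages": [{
        "role": "user",
        "content": [
            {"type": "text", "text": "Describe this clip."},
            # Remote media URL: the server fetches and decodes it itself,
            # which is what puts the vulnerable decoder code in reach.
            {"type": "video_url",
             "video_url": {"url": "http://attacker.example/clip.mp4"}},
        ],
    }],
}

body = json.dumps(payload)
```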
How the Exploit Chain Works
This isn't a single bug. Researchers at OX Security documented a two-stage exploit chain that defeats modern memory protections:
Stage 1 — Information Leak. When vLLM processes an invalid image, the Pillow (PIL) imaging library raises an error whose message includes a raw heap memory address. That leak reduces the effective ASLR search space from roughly 4 billion possible address combinations to about 8 guesses, few enough to brute-force reliably.
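The mechanics of the leak are easy to reproduce with the standard library alone: a Python object's repr embeds its heap address, and Pillow's "cannot identify image file" error interpolates the repr of the file object it was handed. The entropy figures below mirror the roughly 4-billion-to-8 reduction described above and are illustrative, not exact ASLR internals:

```python
import io
import re

# A Python repr embeds the object's heap address, e.g.
# "<_io.BytesIO object at 0x7f3a2c1b4d60>". Pillow's error message for an
# unreadable image includes exactly this repr, handing the address back
# to whoever sent the request.
buf = io.BytesIO(b"\x00definitely-not-an-image")
leaked = re.search(r"0x[0-9a-f]+", repr(buf)).group(0)
heap_addr = int(leaked, 16)

# Illustrative arithmetic for the entropy reduction: ~32 bits of
# uncertainty (about 4.3 billion candidates) collapses to ~3 bits
# (8 guesses) once a nearby heap pointer is known. Exact bit counts
# depend on the platform's ASLR implementation.
guesses_without_leak = 2 ** 32   # 4294967296
guesses_with_leak = 2 ** 3       # 8
```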
Stage 2 — Heap Buffer Overflow. vLLM bundles OpenCV, which in turn ships FFmpeg 5.1.x. The JPEG2000 decoder in that FFmpeg version trusts the image's cdef (channel definition) box without validating buffer sizes. An attacker constructs a video from JPEG2000 frames where the Y-channel data is significantly larger than the U/V buffers. The decoder writes Y data into the smaller U buffer, overflowing into adjacent heap memory.
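A minimal sketch of that size mismatch, using hypothetical plane dimensions (the real decoder operates on C buffers, but the arithmetic is the same): with 4:2:0 chroma subsampling, the U and V planes are a quarter the size of the Y plane, so routing Y-sized data into a chroma-sized buffer without a capacity check writes far past the end of the allocation.

```python
# Hypothetical frame dimensions; 4:2:0 subsampling halves chroma
# width and height, so each chroma plane is 1/4 the luma size.
width, height = 64, 64
y_plane_size = width * height                    # 4096-byte luma plane
u_buffer_size = (width // 2) * (height // 2)     # 1024-byte chroma buffer

# A crafted cdef box labels Y-sized data as belonging to the U channel.
crafted_channel_data = b"\x41" * y_plane_size

# Vulnerable pattern: copy into the U buffer with no bounds check,
# spilling the excess into adjacent heap allocations.
overflow = len(crafted_channel_data) - u_buffer_size   # bytes past the end

# Patched pattern: validate the claimed channel size against the
# destination's capacity before copying anything.
def safe_to_copy(data: bytes, capacity: int) -> bool:
    return len(data) <= capacity
```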
From there, it's textbook heap exploitation—overwrite function pointers, redirect execution to system(), and run arbitrary commands as the vLLM process user.
Who's at Risk
vLLM reportedly exceeds three million downloads per month. Organizations running multimodal models with video support enabled are the primary targets, but the vulnerability exists in any vLLM installation that processes image or video inputs.
Internet-facing deployments face the highest risk. And in clustered GPU environments—common for large-scale inference—compromising one node can open lateral movement paths across the entire cluster.
The vulnerability sits in a dependency chain (vLLM → OpenCV → FFmpeg), which means organizations that don't track transitive dependencies may not realize they're exposed. This kind of supply chain depth is a recurring pattern in AI framework vulnerabilities.
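As a starting point for auditing that chain, Python's importlib.metadata can report what an installed package declares as direct dependencies. Note the limitation this sketch itself demonstrates: statically bundled libraries, like the FFmpeg inside OpenCV wheels, never appear in Python package metadata at all, which is exactly how they get missed; full transitive visibility needs a tool like pipdeptree or an SBOM.

```python
from importlib import metadata

def declared_requirements(package: str) -> list[str]:
    """Return a package's declared direct dependencies.

    Only surfaces the first hop of the chain; binaries bundled inside a
    wheel (e.g. FFmpeg inside OpenCV) are invisible to this metadata.
    """
    try:
        return metadata.requires(package) or []
    except metadata.PackageNotFoundError:
        return [f"{package}: not installed"]
```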
What to Do Now
- Upgrade to vLLM 0.14.1, which bundles a patched OpenCV release addressing the JPEG2000 decoder flaw
- Disable video model functionality in production if upgrading isn't immediately feasible
- Put authentication in front of vLLM API endpoints — this should have been table stakes from day one, but many deployments skip it
- Audit network exposure — vLLM endpoints should never be directly internet-facing without a reverse proxy and access controls
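As a quick triage helper for the first step, a version check against the affected 0.8.3–0.14.0 range can be sketched as below. This assumes plain three-part version strings; pre-release suffixes like rc tags would need a real parser such as packaging.version.

```python
AFFECTED_MIN = (0, 8, 3)
PATCHED = (0, 14, 1)

def parse(version: str) -> tuple[int, ...]:
    # Assumes plain "X.Y.Z" strings; use packaging.version for anything fancier.
    return tuple(int(part) for part in version.split(".")[:3])

def is_affected(version: str) -> bool:
    """True if this vLLM version falls in the vulnerable range."""
    return AFFECTED_MIN <= parse(version) < PATCHED
```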
Why This Matters
AI infrastructure is becoming the new attack surface, and the security posture of most ML serving frameworks hasn't caught up. vLLM is used everywhere from startup prototypes to enterprise production systems, and its default "open to the network" configuration means the blast radius of a single CVE can be enormous.
This vulnerability also highlights a growing concern: the dependency chains in AI frameworks are deep and complex. vLLM itself wasn't the buggy code—OpenCV's bundled FFmpeg was. But vLLM users bear the consequences. As organizations race to deploy AI systems at scale, the security of inference infrastructure deserves the same scrutiny we give to web application servers and databases.
We've seen similar issues with n8n's sandbox escape flaws and other workflow automation tools. The pattern is the same: powerful execution environments with insufficient isolation, exposed to the network by default.
For teams running AI workloads, the message is clear: treat your inference servers like the critical infrastructure they are.