PROBABLYPWNED
Vulnerabilities · April 26, 2026 · 3 min read

SGLang CVSS 9.8 Flaw Allows RCE via Malicious AI Model Files

Critical CVE-2026-5760 in SGLang enables unauthenticated RCE through poisoned GGUF model files. Attackers can weaponize Hugging Face models to compromise inference servers.

Marcus Chen

A critical vulnerability in SGLang, a popular framework for running large language model inference, allows attackers to execute arbitrary code on servers through maliciously crafted AI model files. Tracked as CVE-2026-5760 with a CVSS score of 9.8, the flaw turns trusted model distribution platforms into potential attack vectors.

The Vulnerability

The issue stems from a server-side template injection (SSTI) vulnerability in how SGLang processes GGUF (GPT-Generated Unified Format) model files. When a model's tokenizer.chat_template metadata field contains malicious Jinja2 code, SGLang renders it without proper sandboxing and executes the embedded code.

According to CERT/CC's advisory, the developers rendered these templates with jinja2.Environment() instead of Jinja2's ImmutableSandboxedEnvironment, which restricts access to Python internals; the unsandboxed environment permits arbitrary Python code execution.
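
The difference is straightforward to demonstrate. The sketch below uses a classic, publicly documented Jinja2 SSTI probe (illustrative only, not the exact payload from the advisory) to show a plain Environment executing attacker-controlled Python while ImmutableSandboxedEnvironment rejects the same template:

```python
from jinja2 import Environment
from jinja2.sandbox import ImmutableSandboxedEnvironment, SecurityError

# A classic, publicly documented Jinja2 SSTI probe (illustrative, not the
# exact payload from the advisory): it walks from the template's `self`
# object up to Python builtins and runs a shell command.
payload = (
    "{{ self.__init__.__globals__.__builtins__"
    ".__import__('os').popen('echo template code executed').read() }}"
)

# Vulnerable pattern: a plain Environment renders attacker-controlled
# templates with full access to Python internals.
print(Environment().from_string(payload).render())  # command output appears

# Hardened pattern: the sandbox forbids underscore-prefixed attribute
# access and raises SecurityError instead of executing the payload.
try:
    ImmutableSandboxedEnvironment().from_string(payload).render()
except SecurityError as exc:
    print(f"blocked by sandbox: {exc}")
```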

Attack Sequence

The exploitation flow works as follows:

  1. Attacker creates a GGUF model file with a crafted tokenizer.chat_template containing a Jinja2 SSTI payload
  2. Template includes a Qwen3 reranker trigger phrase that activates the vulnerable code path in entrypoints/openai/serving_rerank.py
  3. Victim downloads the model from sources like Hugging Face
  4. When a request hits the /v1/rerank endpoint, SGLang renders the malicious template
  5. Attacker's payload executes with server privileges

The attack requires no authentication. Any SGLang server configured to load external models is potentially vulnerable.
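
Defenders can check their own exposure quickly. The sketch below probes the reranking route on a hypothetical local server; the request body follows a Cohere/Jina-style rerank schema, which is an assumption here rather than something taken from the advisory. Any 2xx response means the vulnerable code path is reachable:

```python
import requests

# Hypothetical address; point this at your own inference host.
BASE_URL = "http://localhost:30000"

# Minimal, benign rerank request. The field names are an assumed
# Cohere/Jina-style schema, not taken from the advisory.
try:
    resp = requests.post(
        f"{BASE_URL}/v1/rerank",
        json={"query": "ping", "documents": ["doc one", "doc two"]},
        timeout=5,
    )
    # 2xx: the endpoint is live, and whoever can reach it can trigger
    # template rendering. A 404 suggests the route is not being served.
    print(resp.status_code, resp.text[:200])
except requests.ConnectionError:
    print("no listener: endpoint not reachable")
```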

Supply Chain Implications

This vulnerability is particularly concerning because it weaponizes the AI model supply chain. Organizations downloading models from Hugging Face, which hosts thousands of community-contributed models, could inadvertently compromise their inference infrastructure.

We've seen similar AI supply chain risks in the rapid exploitation of LMDeploy and in the n8n workflow vulnerabilities that targeted automation infrastructure. As AI adoption accelerates, these systems become attractive targets.

The flaw follows a pattern of Jinja2 template injection issues in AI frameworks, including CVE-2024-34359 (Llama Drama, CVSS 9.7) and CVE-2025-61620 affecting vLLM.

No Patch Available

CERT/CC noted that "no response or patch was obtained during the coordination process." Organizations running SGLang must implement mitigations independently until developers address the vulnerability.

Who's Affected

Organizations running SGLang for LLM inference are at risk if they:

  • Load models from external sources
  • Accept model uploads from users
  • Run the reranking endpoint publicly

This includes AI startups, enterprises building internal LLM applications, and cloud providers offering model serving infrastructure.

Mitigation Steps

  1. Audit model sources - Only load models from verified publishers, and inspect chat templates before serving them (a sketch of such a check follows this list)
  2. Network isolation - Keep inference servers on isolated network segments
  3. Disable the reranking endpoint - If reranking is not needed, turn off the /v1/rerank endpoint
  4. Implement application-level sandboxing - Use containerization to limit blast radius
  5. Monitor for exploitation - Watch for unusual subprocess spawning or unexpected outbound connections (see the monitoring sketch below)
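
For step 1, a coarse pre-load check can flag the constructs SSTI payloads rely on. The sketch below scans the chat_template field of a Hugging Face-style tokenizer_config.json; the heuristics are illustrative, and a clean result does not prove a template is safe:

```python
import json
import re
import sys

# Heuristic patterns: dunder attribute walking and process-spawning names
# common to public Jinja2 SSTI payloads. Extend after reviewing your models.
SUSPICIOUS = re.compile(
    r"__\w+__"                                  # __globals__, __import__, ...
    r"|\bself\s*\."                             # pivot object in classic payloads
    r"|\bos\b|\bsubprocess\b|\bpopen\b|\bimport\b",
    re.IGNORECASE,
)

def audit_chat_template(config_path: str) -> bool:
    """Return True if the chat template looks benign, False if flagged."""
    with open(config_path) as f:
        template = json.load(f).get("chat_template") or ""
    if not isinstance(template, str):           # some configs store a list
        template = json.dumps(template)
    hits = sorted(set(SUSPICIOUS.findall(template)))
    if hits:
        print(f"FLAGGED {config_path}: suspicious tokens {hits}")
        return False
    print(f"OK {config_path}")
    return True

if __name__ == "__main__":
    results = [audit_chat_template(path) for path in sys.argv[1:]]
    sys.exit(0 if all(results) else 1)
```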
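
For step 5, one rough starting point is watching the server's process tree for shells and download tools. This sketch uses psutil with a hypothetical server PID; inference servers legitimately fork worker processes, so baseline normal children before treating any alert as exploitation:

```python
import time
import psutil

SERVER_PID = 12345        # hypothetical: your SGLang server's PID
POLL_SECONDS = 2.0

# Shells and download tools rarely belong under an inference server;
# tune this set after baselining your deployment's normal workers.
SUSPECT_NAMES = {"sh", "bash", "dash", "curl", "wget", "nc", "ncat"}

def watch(pid: int) -> None:
    """Alert whenever the server spawns a suspicious child process."""
    server = psutil.Process(pid)
    seen: set[int] = set()
    while True:
        for child in server.children(recursive=True):
            if child.pid in seen:
                continue
            seen.add(child.pid)
            if child.name() in SUSPECT_NAMES:
                print(f"ALERT: pid {child.pid} ({child.name()}) "
                      f"spawned under the inference server")
        time.sleep(POLL_SECONDS)

watch(SERVER_PID)
```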

For organizations that must continue using SGLang, consider deploying it behind an API gateway that filters requests to the reranking endpoint.
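
That filtering can live in a thin reverse proxy. The sketch below is a minimal FastAPI/httpx gateway, with a hypothetical internal upstream address, that returns 403 for the reranking route and forwards everything else; it illustrates the pattern rather than serving as a production gateway:

```python
from fastapi import FastAPI, Request, Response
import httpx

app = FastAPI()

UPSTREAM = "http://127.0.0.1:30000"   # hypothetical internal SGLang address
BLOCKED_PREFIXES = ("/v1/rerank",)    # routes refused at the edge

@app.api_route("/{path:path}", methods=["GET", "POST"])
async def proxy(path: str, request: Request) -> Response:
    # Refuse the vulnerable route before it reaches SGLang.
    if any(f"/{path}".startswith(p) for p in BLOCKED_PREFIXES):
        return Response(status_code=403, content="rerank endpoint disabled")

    # Forward everything else to the internal server unchanged.
    async with httpx.AsyncClient() as client:
        upstream = await client.request(
            request.method,
            f"{UPSTREAM}/{path}",
            content=await request.body(),
            headers={k: v for k, v in request.headers.items()
                     if k.lower() not in ("host", "content-length")},
            timeout=60,
        )
    return Response(
        content=upstream.content,
        status_code=upstream.status_code,
        media_type=upstream.headers.get("content-type"),
    )
```

Serve it with uvicorn (for example, uvicorn gateway:app) in front of the internal server; blocking at the edge buys time but is not a substitute for a fixed template engine.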

The Bigger Picture

AI infrastructure security is emerging as a critical gap. Frameworks for model serving, fine-tuning, and inference are being deployed rapidly without the same security scrutiny applied to traditional web applications. Our resources on cybersecurity tools include guidance on securing AI deployment pipelines.

CVE-2026-5760 demonstrates that AI model files themselves can become attack vectors—a threat model many organizations haven't fully considered. As NIST updates vulnerability prioritization guidance, AI-specific risks will likely require new categorization frameworks.

Organizations building on open-source AI frameworks should establish security review processes for dependencies and treat model files with the same caution as executable code.
