SGLang CVSS 9.8 Flaw Allows RCE via Malicious AI Model Files
Critical CVE-2026-5760 in SGLang enables unauthenticated RCE through poisoned GGUF model files. Attackers can weaponize Hugging Face models to compromise inference servers.
A critical vulnerability in SGLang, a popular framework for running large language model inference, allows attackers to execute arbitrary code on servers through maliciously crafted AI model files. Tracked as CVE-2026-5760 with a CVSS score of 9.8, the flaw turns trusted model distribution platforms into potential attack vectors.
The Vulnerability
The issue stems from a server-side template injection (SSTI) vulnerability in how SGLang processes GGUF (GPT-Generated Unified Format) model files. When a model's tokenizer.chat_template metadata field contains malicious Jinja2 code, SGLang executes it without proper sandboxing.
According to CERT/CC's advisory, the developers rendered templates with jinja2.Environment() rather than Jinja2's ImmutableSandboxedEnvironment, leaving template expressions free to execute arbitrary Python code.
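The distinction matters in practice. In a minimal sketch (not SGLang's actual code), the same SSTI probe that succeeds under a plain jinja2.Environment is rejected outright by ImmutableSandboxedEnvironment:

```python
# Demonstrates the root cause: jinja2.Environment evaluates introspection
# chains, while ImmutableSandboxedEnvironment rejects them.
from jinja2 import Environment
from jinja2.exceptions import SecurityError
from jinja2.sandbox import ImmutableSandboxedEnvironment

# A classic SSTI probe: walk from a string literal up to `object` and
# enumerate every class loaded in the interpreter.
payload = "{{ ''.__class__.__mro__[1].__subclasses__() }}"

# Unsandboxed: the probe executes and dumps the class list.
unsafe = Environment().from_string(payload).render()
print(unsafe[:60], "...")

# Sandboxed: underscore-prefixed attribute access is blocked.
try:
    ImmutableSandboxedEnvironment().from_string(payload).render()
except SecurityError as exc:
    print("blocked:", exc)
```

The sandboxed variant raises jinja2.exceptions.SecurityError instead of rendering, which is exactly the behavior the advisory says SGLang lacked.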
Attack Sequence
The exploitation flow works as follows:
- Attacker creates a GGUF model file whose tokenizer.chat_template contains a Jinja2 SSTI payload
- The template includes a Qwen3 reranker trigger phrase that activates the vulnerable code path in entrypoints/openai/serving_rerank.py
- Victim downloads the model from sources like Hugging Face
- When a request hits the /v1/rerank endpoint, SGLang renders the malicious template
- The attacker's payload executes with server privileges
The attack requires no authentication. Any SGLang server configured to load external models is potentially vulnerable.
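The vulnerable pattern behind this sequence reduces to a few lines. The sketch below is illustrative only (apply_chat_template and the message format are hypothetical names, not SGLang's actual code), but it shows why a template shipped inside model metadata must never reach an unsandboxed renderer:

```python
from jinja2 import Environment

def apply_chat_template(chat_template: str, messages: list) -> str:
    # VULNERABLE pattern: the template string comes from untrusted model
    # metadata, yet Environment() places no restrictions on what it can do.
    return Environment().from_string(chat_template).render(messages=messages)

# A benign chat template renders the conversation as expected...
benign = "{% for m in messages %}{{ m['role'] }}: {{ m['content'] }}\n{% endfor %}"
print(apply_chat_template(benign, [{"role": "user", "content": "hello"}]))

# ...but a poisoned one can walk Python's object graph instead. This
# payload merely counts loaded classes; a real exploit would locate a
# subprocess or os handle among them and spawn a shell.
poisoned = "{{ ''.__class__.__mro__[1].__subclasses__() | length }}"
print(apply_chat_template(poisoned, []))
```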
Supply Chain Implications
This vulnerability is particularly concerning because it weaponizes the AI model supply chain. Organizations downloading models from Hugging Face, which hosts thousands of community-contributed models, could inadvertently compromise their inference infrastructure.
We've seen similar AI supply chain risks with LMDeploy's rapid exploitation and the n8n workflow vulnerabilities that targeted automation infrastructure. As AI adoption accelerates, these systems become attractive targets.
The flaw follows a pattern of Jinja2 template injection issues in AI frameworks, including CVE-2024-34359 (Llama Drama, CVSS 9.7) and CVE-2025-61620 affecting vLLM.
No Patch Available
CERT/CC noted that "no response or patch was obtained during the coordination process." Organizations running SGLang must implement mitigations independently until developers address the vulnerability.
Who's Affected
Any organization running SGLang for LLM inference is at risk if they:
- Load models from external sources
- Accept model uploads from users
- Run the reranking endpoint publicly
This includes AI startups, enterprises building internal LLM applications, and cloud providers offering model serving infrastructure.
Mitigation Steps
- Audit model sources - Only load models from verified publishers
- Network isolation - Keep inference servers on isolated network segments
- Disable reranking endpoint - If not needed, disable the /v1/rerank endpoint
- Implement application-level sandboxing - Use containerization to limit blast radius
- Monitor for exploitation - Watch for unusual subprocess spawning or outbound connections
For organizations that must continue using SGLang, consider deploying it behind an API gateway that filters requests to the reranking endpoint.
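As a sketch of that gateway idea (illustrative only; production deployments would typically enforce this in nginx, Envoy, or a managed API gateway rather than application code), path-based filtering can be expressed as a small WSGI middleware placed in front of the server:

```python
def block_rerank(app):
    """Wrap a WSGI app, rejecting any request to the rerank endpoint."""
    def middleware(environ, start_response):
        if environ.get("PATH_INFO", "").startswith("/v1/rerank"):
            # Refuse the request before it ever reaches the inference server.
            start_response("403 Forbidden", [("Content-Type", "text/plain")])
            return [b"rerank endpoint disabled\n"]
        return app(environ, start_response)
    return middleware

# Usage: serve block_rerank(inference_app) instead of exposing the
# inference app directly.
```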
The Bigger Picture
AI infrastructure security is emerging as a critical gap. Frameworks for model serving, fine-tuning, and inference are being deployed rapidly without the same security scrutiny applied to traditional web applications. Our resources on cybersecurity tools include guidance on securing AI deployment pipelines.
CVE-2026-5760 demonstrates that AI model files themselves can become attack vectors—a threat model many organizations haven't fully considered. As NIST updates vulnerability prioritization guidance, AI-specific risks will likely require new categorization frameworks.
Organizations building on open-source AI frameworks should establish security review processes for dependencies and treat model files with the same caution as executable code.
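One concrete way to act on that advice is to inspect a model's chat template before loading it. The heuristic below is a sketch under stated assumptions: the pattern list is illustrative, easily bypassed by a determined attacker, and complements rather than replaces sandboxed rendering. It flags templates containing introspection primitives common in Jinja2 SSTI payloads.

```python
import re

# Constructs frequently seen in Jinja2 SSTI payloads. Illustrative,
# not exhaustive.
SUSPICIOUS_PATTERNS = [
    r"__\w+__",       # dunder access: __class__, __globals__, __mro__ ...
    r"\battr\s*\(",   # the |attr() filter, used to dodge dot-access checks
    r"\bself\b",      # template self-reference tricks
]

def template_looks_suspicious(chat_template: str) -> bool:
    """Return True if the template contains known SSTI building blocks."""
    return any(re.search(p, chat_template) for p in SUSPICIOUS_PATTERNS)
```

A hit should quarantine the model for manual review; a miss is no guarantee of safety.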
Related Articles
LMDeploy SSRF Exploited 12 Hours After Disclosure
CVE-2026-33626 in LMDeploy AI toolkit was weaponized within 12 hours of publication, targeting AWS credentials and internal services. Patch to v0.12.3 immediately.
Apr 24, 2026

OpenClaw Sandbox Escape Hits CVSS 9.9 - Upgrade Before It's Exploited
CVE-2026-41329 lets attackers bypass OpenClaw's sandbox via heartbeat context manipulation, achieving privilege escalation. CVSS 9.9 demands immediate patching.
Apr 21, 2026

Thymeleaf SSTI Flaw Enables Java RCE via Template Injection
CVE-2026-40478 bypasses Thymeleaf's expression protections, allowing attackers to execute arbitrary Java code through crafted template input. Upgrade to 3.1.4.RELEASE now.
Apr 18, 2026

Second PraisonAI Sandbox Escape in a Week Scores CVSS 9.9
CVE-2026-39888 bypasses PraisonAI's Python sandbox via exception frame traversal. Attackers chain __traceback__ attributes to reach exec(). Patch to 1.5.115.
Apr 9, 2026