PROBABLYPWNED
Vulnerabilities · April 26, 2026 · 3 min read

SGLang CVSS 9.8 Flaw Allows RCE via Malicious AI Model Files

Critical CVE-2026-5760 in SGLang enables unauthenticated RCE through poisoned GGUF model files. Attackers can weaponize Hugging Face models to compromise inference servers.

Marcus Chen

A critical vulnerability in SGLang, a popular framework for running large language model inference, allows attackers to execute arbitrary code on servers through maliciously crafted AI model files. Tracked as CVE-2026-5760 with a CVSS score of 9.8, the flaw turns trusted model distribution platforms into potential attack vectors.

The Vulnerability

The issue stems from a server-side template injection (SSTI) vulnerability in how SGLang processes GGUF (GPT-Generated Unified Format) model files. When a model's tokenizer.chat_template metadata field contains malicious Jinja2 code, SGLang renders it without proper sandboxing and executes the embedded code.

According to CERT/CC's advisory, the developers rendered these templates with jinja2.Environment() instead of Jinja2's ImmutableSandboxedEnvironment, which restricts access to Python internals; the unsandboxed environment permits arbitrary Python code execution.
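
The difference is straightforward to demonstrate. The sketch below uses a classic, publicly documented Jinja2 SSTI probe (illustrative only, not the exact payload from the advisory) to show a plain Environment executing attacker-controlled Python while ImmutableSandboxedEnvironment rejects the same template:

```python
from jinja2 import Environment
from jinja2.sandbox import ImmutableSandboxedEnvironment, SecurityError

# A classic, publicly documented Jinja2 SSTI probe (illustrative, not the
# exact payload from the advisory): it walks from the template's `self`
# object up to Python builtins and runs a shell command.
payload = (
    "{{ self.__init__.__globals__.__builtins__"
    ".__import__('os').popen('echo template code executed').read() }}"
)

# Vulnerable pattern: a plain Environment renders attacker-controlled
# templates with full access to Python internals.
print(Environment().from_string(payload).render())  # command output appears

# Hardened pattern: the sandbox forbids underscore-prefixed attribute
# access and raises SecurityError instead of executing the payload.
try:
    ImmutableSandboxedEnvironment().from_string(payload).render()
except SecurityError as exc:
    print(f"blocked by sandbox: {exc}")
```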

Attack Sequence

The exploitation flow works as follows:

  1. Attacker creates a GGUF model file with a crafted tokenizer.chat_template containing a Jinja2 SSTI payload
  2. Template includes a Qwen3 reranker trigger phrase that activates the vulnerable code path in entrypoints/openai/serving_rerank.py
  3. Victim downloads the model from sources like Hugging Face
  4. When a request hits the /v1/rerank endpoint, SGLang renders the malicious template
  5. Attacker's payload executes with server privileges

The attack requires no authentication. Any SGLang server configured to load external models is potentially vulnerable.
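
Defenders can check their own exposure quickly. The sketch below probes the reranking route on a hypothetical local server; the request body follows a Cohere/Jina-style rerank schema, which is an assumption here rather than something taken from the advisory. Any 2xx response means the vulnerable code path is reachable:

```python
import requests

# Hypothetical address; point this at your own inference host.
BASE_URL = "http://localhost:30000"

# Minimal, benign rerank request. The field names are an assumed
# Cohere/Jina-style schema, not taken from the advisory.
try:
    resp = requests.post(
        f"{BASE_URL}/v1/rerank",
        json={"query": "ping", "documents": ["doc one", "doc two"]},
        timeout=5,
    )
    # 2xx: the endpoint is live, and whoever can reach it can trigger
    # template rendering. A 404 suggests the route is not being served.
    print(resp.status_code, resp.text[:200])
except requests.ConnectionError:
    print("no listener: endpoint not reachable")
```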

Supply Chain Implications

This vulnerability is particularly concerning because it weaponizes the AI model supply chain. Organizations downloading models from Hugging Face, which hosts thousands of community-contributed models, could inadvertently compromise their inference infrastructure.

We've seen similar AI supply chain risks in the rapid exploitation of LMDeploy and in the n8n workflow vulnerabilities that targeted automation infrastructure. As AI adoption accelerates, these systems become attractive targets.

The flaw follows a pattern of Jinja2 template injection issues in AI frameworks, including CVE-2024-34359 (Llama Drama, CVSS 9.7) and CVE-2025-61620 affecting vLLM.

No Patch Available

CERT/CC noted that "no response or patch was obtained during the coordination process." Organizations running SGLang must implement mitigations independently until developers address the vulnerability.

Who's Affected

Organizations running SGLang for LLM inference are at risk if they:

  • Load models from external sources
  • Accept model uploads from users
  • Run the reranking endpoint publicly

This includes AI startups, enterprises building internal LLM applications, and cloud providers offering model serving infrastructure.

Mitigation Steps

  1. Audit model sources - Only load models from verified publishers, and inspect chat templates before serving them (a sketch of such a check follows this list)
  2. Network isolation - Keep inference servers on isolated network segments
  3. Disable the reranking endpoint - If reranking is not needed, turn off the /v1/rerank endpoint
  4. Implement application-level sandboxing - Use containerization to limit blast radius
  5. Monitor for exploitation - Watch for unusual subprocess spawning or unexpected outbound connections (see the monitoring sketch below)
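
For step 1, a coarse pre-load check can flag the constructs SSTI payloads rely on. The sketch below scans the chat_template field of a Hugging Face-style tokenizer_config.json; the heuristics are illustrative, and a clean result does not prove a template is safe:

```python
import json
import re
import sys

# Heuristic patterns: dunder attribute walking and process-spawning names
# common to public Jinja2 SSTI payloads. Extend after reviewing your models.
SUSPICIOUS = re.compile(
    r"__\w+__"                                  # __globals__, __import__, ...
    r"|\bself\s*\."                             # pivot object in classic payloads
    r"|\bos\b|\bsubprocess\b|\bpopen\b|\bimport\b",
    re.IGNORECASE,
)

def audit_chat_template(config_path: str) -> bool:
    """Return True if the chat template looks benign, False if flagged."""
    with open(config_path) as f:
        template = json.load(f).get("chat_template") or ""
    if not isinstance(template, str):           # some configs store a list
        template = json.dumps(template)
    hits = sorted(set(SUSPICIOUS.findall(template)))
    if hits:
        print(f"FLAGGED {config_path}: suspicious tokens {hits}")
        return False
    print(f"OK {config_path}")
    return True

if __name__ == "__main__":
    results = [audit_chat_template(path) for path in sys.argv[1:]]
    sys.exit(0 if all(results) else 1)
```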
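
For step 5, one rough starting point is watching the server's process tree for shells and download tools. This sketch uses psutil with a hypothetical server PID; inference servers legitimately fork worker processes, so baseline normal children before treating any alert as exploitation:

```python
import time
import psutil

SERVER_PID = 12345        # hypothetical: your SGLang server's PID
POLL_SECONDS = 2.0

# Shells and download tools rarely belong under an inference server;
# tune this set after baselining your deployment's normal workers.
SUSPECT_NAMES = {"sh", "bash", "dash", "curl", "wget", "nc", "ncat"}

def watch(pid: int) -> None:
    """Alert whenever the server spawns a suspicious child process."""
    server = psutil.Process(pid)
    seen: set[int] = set()
    while True:
        for child in server.children(recursive=True):
            if child.pid in seen:
                continue
            seen.add(child.pid)
            if child.name() in SUSPECT_NAMES:
                print(f"ALERT: pid {child.pid} ({child.name()}) "
                      f"spawned under the inference server")
        time.sleep(POLL_SECONDS)

watch(SERVER_PID)
```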

For organizations that must continue using SGLang, consider deploying it behind an API gateway that filters requests to the reranking endpoint.
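
That filtering can live in a thin reverse proxy. The sketch below is a minimal FastAPI/httpx gateway, with a hypothetical internal upstream address, that returns 403 for the reranking route and forwards everything else; it illustrates the pattern rather than serving as a production gateway:

```python
from fastapi import FastAPI, Request, Response
import httpx

app = FastAPI()

UPSTREAM = "http://127.0.0.1:30000"   # hypothetical internal SGLang address
BLOCKED_PREFIXES = ("/v1/rerank",)    # routes refused at the edge

@app.api_route("/{path:path}", methods=["GET", "POST"])
async def proxy(path: str, request: Request) -> Response:
    # Refuse the vulnerable route before it reaches SGLang.
    if any(f"/{path}".startswith(p) for p in BLOCKED_PREFIXES):
        return Response(status_code=403, content="rerank endpoint disabled")

    # Forward everything else to the internal server unchanged.
    async with httpx.AsyncClient() as client:
        upstream = await client.request(
            request.method,
            f"{UPSTREAM}/{path}",
            content=await request.body(),
            headers={k: v for k, v in request.headers.items()
                     if k.lower() not in ("host", "content-length")},
            timeout=60,
        )
    return Response(
        content=upstream.content,
        status_code=upstream.status_code,
        media_type=upstream.headers.get("content-type"),
    )
```

Serve it with uvicorn (for example, uvicorn gateway:app) in front of the internal server; blocking at the edge buys time but is not a substitute for a fixed template engine.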

The Bigger Picture

AI infrastructure security is emerging as a critical gap. Frameworks for model serving, fine-tuning, and inference are being deployed rapidly without the same security scrutiny applied to traditional web applications. Our resources on cybersecurity tools include guidance on securing AI deployment pipelines.

CVE-2026-5760 demonstrates that AI model files themselves can become attack vectors—a threat model many organizations haven't fully considered. As NIST updates vulnerability prioritization guidance, AI-specific risks will likely require new categorization frameworks.

Organizations building on open-source AI frameworks should establish security review processes for dependencies and treat model files with the same caution as executable code.
