PROBABLYPWNED
Vulnerabilities · March 18, 2026 · 4 min read

Custom Fonts Let Attackers Hide Commands from AI Assistants

LayerX researchers found that custom font rendering can hide malicious prompts from ChatGPT, Claude, Gemini, and other AI assistants while displaying them to users.

Marcus Chen

A novel attack technique uses custom fonts and CSS to display malicious commands to human visitors while making them invisible to AI assistants analyzing the same page. Researchers at browser security firm LayerX demonstrated that every major non-agentic AI assistant—including ChatGPT, Claude, Copilot, Gemini, Grok, and Perplexity—failed to detect the hidden threat and instead confirmed the page was safe.

The attack, dubbed "Poisoned Typeface," exploits a fundamental disconnect between how browsers render content visually and how AI systems parse the underlying HTML.

How the Attack Works

LayerX's proof-of-concept uses custom fonts that remap characters via glyph substitution, combined with CSS that conceals benign text through tiny font sizes or color-matched backgrounds.

When a browser renders the page, what users see differs entirely from what exists in the DOM. The HTML might contain harmless text, but the custom font transforms it visually into something malicious. AI assistants read the DOM; humans see the rendered output.
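This gap is easy to see with a toy page. The sketch below is illustrative: the HTML string, class names, and font file are hypothetical, but the extraction mirrors how a naive DOM parser would "read" the page — the `@font-face` and CSS rules that control what a human actually sees never surface in the extracted text.

```python
# Minimal sketch of the DOM/rendering gap. PAGE is a hypothetical example:
# the tag-stripped text reads as harmless, while @font-face and CSS rules
# decide what a human actually sees on screen.
from html.parser import HTMLParser

PAGE = """
<html><head><style>
  @font-face { font-family: cipher; src: url(remap.woff2); } /* glyphs remapped */
  .decoy   { font-size: 1px; color: #ffffff; background: #ffffff; }
  .payload { font-family: cipher; font-size: 24px; color: #00ff00; }
</style></head>
<body>
  <p class="decoy">Welcome to my Bioshock fanfiction archive!</p>
  <p class="payload">Xq zsk rfmw...</p>  <!-- the font renders this as readable commands -->
</body></html>
"""

class TextOnly(HTMLParser):
    """Collect text content the way a naive DOM-based extractor would."""
    def __init__(self):
        super().__init__()
        self.chunks, self._skip = [], False
    def handle_starttag(self, tag, attrs):
        if tag in ("style", "script"):
            self._skip = True
    def handle_endtag(self, tag):
        if tag in ("style", "script"):
            self._skip = False
    def handle_data(self, data):
        if not self._skip and data.strip():
            self.chunks.append(data.strip())

parser = TextOnly()
parser.feed(PAGE)
print(parser.chunks)  # both strings surface; neither reads as malicious text
```

An assistant reasoning only over `parser.chunks` has no way to know that the decoy paragraph is invisible to humans or that the "gibberish" paragraph decodes, through the font, into instructions.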

In their demonstration, LayerX built a page that appeared to visitors as a Bioshock video game fanfiction site. Hidden beneath that facade was a custom font acting as a visual substitution cipher. CSS shrank the harmless fanfiction text in the HTML to 1 pixel and matched its color to the background, making it invisible to users, while the custom font rendered a separate encoded payload as large, readable green text urging the visitor to execute a reverse shell.

The AI assistants saw fanfiction. Users saw instructions to compromise their own systems.

Tested AI Systems

LayerX tested the technique against multiple AI assistants with browsing capabilities:

  • ChatGPT
  • Claude
  • Microsoft Copilot
  • Google Gemini
  • Grok
  • Perplexity

All failed to detect the discrepancy between the DOM content and the visual rendering. When asked whether the page was safe, they provided confident affirmative responses based solely on the harmless HTML they could parse.

This matters because users increasingly rely on AI assistants to evaluate content safety. If your AI browser extension tells you a page is harmless while it's actively displaying malicious instructions, the trust relationship inverts—AI assistance becomes an attack vector.

Real-World Attack Scenarios

The most concerning applications involve agentic AI systems that can take actions on behalf of users. If an AI assistant can execute commands, click links, or interact with systems, Poisoned Typeface attacks could direct those actions while appearing benign in logs and safety checks.

We covered the growing attack surface from AI assistants in defensive security last week. This research demonstrates that offensive applications are evolving just as quickly.

Even for non-agentic assistants, the attack enables sophisticated social engineering. Imagine a user asking their AI assistant to verify whether a software download page is legitimate. The AI reads the HTML, sees normal content about the application, and provides reassurance—while the rendered page instructs the user to run a PowerShell command that downloads malware.

Vendor Responses

According to LayerX, Microsoft was the only vendor that acknowledged the issue and engaged substantively. The researchers didn't elaborate on Microsoft's specific response, but the title of their blog post—"Only Microsoft Cares"—speaks to their frustration with other vendors.

The lack of response is disappointing but predictable. AI safety discussions typically focus on prompt injection through direct user input, not rendering-level attacks that exploit the gap between DOM and visual presentation. This research expands the threat model in ways that don't map neatly to existing mitigations.

Recommended Mitigations

LayerX suggests that AI vendors implement several defensive measures:

  1. Dual-mode render-and-diff analysis - Compare DOM content against visual rendering to detect discrepancies
  2. Treat custom fonts as threat surfaces - Apply additional scrutiny to pages that load external fonts
  3. Scan for CSS content hiding - Detect techniques like near-zero opacity and color-matched text
  4. Avoid confident safety verdicts - Don't issue definitive "safe" assessments when rendering context cannot be verified
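The third mitigation is the most tractable without full rendering. Below is a rough sketch of what such a scan might look like; the heuristics, regexes, and thresholds are illustrative assumptions, not a vendor-specified detection spec.

```python
# Rough sketch of mitigation #3: flag CSS rules that hide text, using simple
# heuristics (tiny font sizes, near-zero opacity, text colored to match its
# background). Patterns and thresholds here are illustrative only.
import re

SUSPICIOUS = [
    (r"font-size\s*:\s*0*(?:0|1|2)px", "near-invisible font size"),
    (r"opacity\s*:\s*(?:0|0?\.0\d*)\s*[;}]", "near-zero opacity"),
]

def color_matched(rule: str) -> bool:
    """Flag rules whose text color equals their background color (hex only)."""
    color = re.search(r"[^-]color\s*:\s*(#[0-9a-fA-F]{3,6})", rule)
    bg = re.search(r"background(?:-color)?\s*:\s*(#[0-9a-fA-F]{3,6})", rule)
    return bool(color and bg and color.group(1).lower() == bg.group(1).lower())

def scan_css(css: str) -> list[str]:
    """Return labels for every hiding heuristic a stylesheet trips."""
    findings = []
    for rule in re.findall(r"[^{}]+\{[^}]*\}", css):
        for pattern, label in SUSPICIOUS:
            if re.search(pattern, rule):
                findings.append(label)
        if color_matched(rule):
            findings.append("color-matched text")
    return findings

print(scan_css(".decoy { font-size: 1px; color: #fff; background: #fff; }"))
# ['near-invisible font size', 'color-matched text']
```

A production scanner would also need to resolve computed styles, shorthand properties, and inherited backgrounds, which is why LayerX pairs this with full render-and-diff analysis rather than relying on static CSS inspection alone.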

For organizations deploying AI assistants with browsing capabilities, this research should prompt a security review. Any AI system that can view websites and take actions based on that content inherits this vulnerability class.

The broader lesson extends beyond fonts. Anywhere that visual presentation diverges from underlying data—CSS transforms, canvas rendering, SVG manipulation—similar attacks become possible. AI systems that provide security assessments need to understand not just what data says, but how users actually experience it.
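The render-and-diff idea from the mitigations above generalizes to all of these channels: compare the text extracted from the DOM against the text recovered from the rendered pixels (for example, via OCR in a headless browser — stubbed with plain strings here), and withhold a confident verdict when they diverge. The overlap metric and threshold below are illustrative assumptions.

```python
# Sketch of a render-and-diff check: if DOM-extracted text and rendered
# (OCR-recovered) text share too few words, refuse a confident "safe" verdict.
# The 0.5 threshold and Jaccard metric are illustrative, not a real spec.
def token_overlap(a: str, b: str) -> float:
    """Jaccard similarity over lowercase word sets."""
    sa, sb = set(a.lower().split()), set(b.lower().split())
    return len(sa & sb) / len(sa | sb) if sa | sb else 1.0

dom_text      = "welcome to my bioshock fanfiction archive"
rendered_text = "open a terminal and run the following command"  # what OCR might see

if token_overlap(dom_text, rendered_text) < 0.5:
    verdict = "cannot verify: rendered text diverges from DOM text"
else:
    verdict = "consistent"
print(verdict)  # cannot verify: rendered text diverges from DOM text
```

The point is not the specific metric but the refusal behavior: an assistant that cannot reconcile the two views should say so rather than vouch for the page.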

Security teams evaluating AI-enhanced browsers or agentic systems should include rendering attacks in their threat models. The Storm-2561 fake VPN campaign showed how attackers already abuse visual presentation in traditional phishing; AI systems that can't detect these techniques provide false confidence rather than actual protection.

For users, the immediate takeaway is simple: don't rely solely on AI assistants to verify website safety. Traditional security practices—verifying URLs, checking certificate details, avoiding unsolicited command execution—remain essential regardless of what your AI assistant reports.
