PROBABLYPWNED
Vulnerabilities · May 31, 2026 · 4 min read

ChatGPhish Turns ChatGPT Web Summaries Into Phishing Delivery

Researchers discover ChatGPT's Markdown rendering trusts attacker-controlled content from summarized pages, enabling phishing URLs, IP exfiltration, and fake security alerts inside the AI interface.

Marcus Chen

ChatGPT cannot tell the difference between its own generated content and attacker-controlled Markdown pulled from external sources. Researchers at Permiso Security discovered that when users ask the chatbot to summarize a web page containing hidden instructions, the page itself becomes the payload.

The technique, dubbed ChatGPhish, exploits how chatgpt.com's response renderer handles Markdown elements from summarized content. Links and image URLs originating from third-party pages render as live, clickable elements inside the trusted ChatGPT interface—without any verification of their source or intent.

How the Attack Works

When ChatGPT summarizes a page, it parses the content and renders any Markdown it encounters. An attacker plants hidden text containing Markdown links and images on a web page they control. The moment a victim asks ChatGPT to summarize that page, the malicious content becomes part of the AI's response.
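
To see why hidden text can end up in a summary at all, here is a minimal Python sketch. It assumes the attacker hides the text with an inline `display:none` style and that the page is reduced to plain text without applying CSS; neither detail comes from Permiso's report. The sample page, the `TextExtractor` class, the `extract_text` helper, and the `attacker.example` domain are all hypothetical.

```python
from html.parser import HTMLParser


class TextExtractor(HTMLParser):
    """Collects every text node and ignores styling, as a naive scraper might."""

    def __init__(self):
        super().__init__()
        self.chunks = []

    def handle_data(self, data):
        text = data.strip()
        if text:
            self.chunks.append(text)


# Hypothetical page: a browser would not display the second <div>.
PAGE = """
<html><body>
<div>Ten tips for faster builds.</div>
<div style="display:none">[Verify your account](https://attacker.example/login)</div>
</body></html>
"""


def extract_text(html: str) -> str:
    """Return all text nodes in document order, one per line."""
    parser = TextExtractor()
    parser.feed(html)
    parser.close()
    return "\n".join(parser.chunks)


extracted = extract_text(PAGE)
print(extracted)
```

A human visitor sees only the first line, but the extracted text also carries the Markdown link. Anything downstream that treats that text as trustworthy Markdown would turn it into a clickable element.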

The attack surface is broad. Attackers can inject phishing URLs that render as legitimate-looking links inside ChatGPT's interface. They can display fake security alerts written in ChatGPT's own conversational style, tricking users into believing the AI itself is warning them. And they can embed QR codes hosted on external servers, which sidestep desktop URL filters entirely because the victim scans the code with a phone rather than clicking a link.

This follows a pattern we've seen with AI agents being exploited post-compromise—the implicit trust users place in AI systems creates blind spots that attackers are learning to exploit.

The IP Exfiltration Problem

Beyond phishing, there's a passive intelligence-gathering angle. Attacker-hosted images embedded in the page are automatically fetched every time ChatGPT renders the response. Each render leaks the victim's IP address, User-Agent string, Referer header, and high-resolution timing data tied to the exact moment the AI produced the answer.

For targeted attacks, this metadata is valuable. It reveals when a specific person used ChatGPT to research a particular topic, and from what network. Combined with the content of the page they were researching, it paints a detailed picture.
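
One way to reason about this exposure is to list which remote hosts a response would contact through images before it is rendered. The Python sketch below does that under loud assumptions: it uses a simplified regular expression rather than a real Markdown parser, and the `remote_image_hosts` function and the `tracker.example` domain are hypothetical, not part of any OpenAI or Permiso tooling.

```python
import re
from urllib.parse import urlparse

# Simplified pattern for ![alt](url); real Markdown has more edge cases.
IMAGE_PATTERN = re.compile(r"!\[[^\]]*\]\(\s*<?([^)\s>]+)")


def remote_image_hosts(markdown: str) -> list[str]:
    """Return the sorted hostnames that rendering the images would contact."""
    hosts = set()
    for url in IMAGE_PATTERN.findall(markdown):
        parsed = urlparse(url)
        # Relative paths have no scheme and trigger no third-party request.
        if parsed.scheme in ("http", "https") and parsed.hostname:
            hosts.add(parsed.hostname)
    return sorted(hosts)


response = (
    "Here is the summary.\n"
    "![chart](https://tracker.example/pixel.png?id=42)\n"
    "![logo](/static/logo.png)"
)
hosts = remote_image_hosts(response)
print(hosts)
```

Every host in that list would receive a request when the images load, which is where the IP address and header leakage described above would occur.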

Enterprise Security Implications

Organizations that rely on ChatGPT for research workflows face elevated risk. A single employee summarizing a malicious page could expose internal network information or fall for a phishing attack delivered through an interface they trust implicitly.

The attack is particularly effective because it doesn't require victims to do anything unusual. They're not clicking suspicious links in emails or downloading unknown files—they're using ChatGPT exactly as intended. The malicious payload arrives wrapped in the AI's familiar response format.

For organizations concerned about social engineering tactics, this represents a new delivery mechanism that bypasses traditional security awareness training. Users are taught to scrutinize emails and websites, not AI chatbot responses.

OpenAI's Position

Permiso researcher Andi Ahmeti reported the issue to OpenAI. At the time of publication, OpenAI had not confirmed whether a fix had been applied, leaving the chatbot potentially vulnerable to this attack vector.

OpenAI has previously acknowledged that prompt injection may never be fully "solved" for browser agents. In December 2025, the company stated that AI-powered browsing tools will likely remain susceptible to some forms of manipulation indefinitely. The ChatGPhish technique demonstrates this isn't just a theoretical concern—it's an active exploitation surface.

Why This Matters

Prompt injection attacks against AI systems have been discussed in security circles for years, but most examples required convincing users to paste malicious prompts directly. ChatGPhish removes that friction. Users simply ask ChatGPT to summarize a page, something millions do daily for research, learning, and productivity.

The vulnerability also highlights a broader problem: as AI tools become more integrated into workflows, the attack surface they represent grows. Every capability an AI system has—rendering content, fetching images, following links—becomes a potential exploitation vector when the AI cannot distinguish trusted from untrusted input.

Security teams should monitor how employees use AI tools and consider whether certain use cases warrant additional review. Summarizing content from unknown sources carries risk that wasn't obvious before this disclosure.

Protecting Yourself

Until OpenAI addresses this issue, users should treat ChatGPT's web summary feature with appropriate skepticism. When summarizing pages you don't control, review the response carefully for unexpected links or images. Don't click links in summaries of unfamiliar pages without verifying them independently.
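
For readers who want to make that review less manual, here is a rough Python sketch of one heuristic: flag any link or image in a summary that points to a host other than the page that was summarized. This is an assumption-laden illustration, not a vetted defense. The `off_site_references` function and the example domains are hypothetical, the regular expression covers only the common inline Markdown forms, and legitimate summaries can also cite other sites, so a flag means "check this", not "this is malicious".

```python
import re
from urllib.parse import urlparse

# Matches [text](url) and ![alt](url); a simplification of real Markdown.
MD_REF = re.compile(r"(!?)\[([^\]]*)\]\(\s*<?([^)\s>]+)")


def off_site_references(markdown: str, summarized_url: str) -> list[tuple[str, str, str]]:
    """Return (kind, label, host) for references that leave the summarized site."""
    page_host = urlparse(summarized_url).hostname
    flagged = []
    for bang, label, url in MD_REF.findall(markdown):
        host = urlparse(url).hostname
        if host and host != page_host:
            kind = "image" if bang else "link"
            flagged.append((kind, label, host))
    return flagged


summary = (
    "See [the docs](https://blog.example/docs) or "
    "[reset your password](https://login.attacker.example/reset). "
    "![qr](https://cdn.attacker.example/qr.png)"
)
flagged = off_site_references(summary, "https://blog.example/post")
for kind, label, host in flagged:
    print(f"{kind}: '{label}' points to {host}")
```

In this example the link back to `blog.example` passes, while the password-reset link and the QR image are flagged for a closer look.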

Organizations with ChatGPT access policies should update guidance to reflect this risk. The same skepticism applied to phishing emails should extend to AI-summarized content from untrusted sources.

The line between what the AI generated and what attackers injected is invisible to users—and that's exactly what makes ChatGPhish dangerous.
