At the Black Hat USA 2025 security conference in Las Vegas, researchers unveiled a new method for deceiving AI systems such as ChatGPT, Microsoft Copilot and Google Gemini. The technique, known as AgentFlayer, was developed by Zenity researchers Michael Bargury and Tamir Ishay Sharbat. A press release outlining the findings was published on August 6.
The concept behind the attack is deceptively simple: text is hidden in a document using white font on a white background. Invisible to the human eye, it can be easily read by AI systems. Once the image is delivered to the target, the trap is set. If the file is included in a prompt, the AI discards the original task and instead follows the hidden instruction – searching connected cloud storage for access credentials.
To exfiltrate the data, the researchers employed a second tactic: they instructed the AI to encode the stolen information into a URL and load an image from it. This method discreetly transfers the data to the attackers’ servers without arousing suspicion.
Zenity demonstrated that the attack works in practice:
- In ChatGPT, emails were manipulated so that the AI agent gained access to Google Drive.
- In Microsoft's Copilot Studio, the researchers uncovered more than 3,000 instances of unprotected CRM data.
- Salesforce Einstein could be deceived into redirecting customer communications to external addresses.
- Google Gemini and Microsoft 365 Copilot were also susceptible to fake emails and calendar entries.
- Attackers even obtained login credentials for the Jira developer platform through crafted tickets.
OpenAI and Microsoft respond, while others see no need for action
The good news is that OpenAI and Microsoft have already released updates to patch the vulnerabilities after being alerted by the researchers. Other providers, however, have been slower to act, with some even dismissing the exploits as “intended behavior.” Researcher Michael Bargury emphasized the severity of the issue, stating, “The user doesn’t have to do anything to be compromised, and no action is required for the data to be leaked.”
Source(s)
Zenity Labs via prnewswire