
Hacked by poetry – why AI models fail at poetic prompts

According to a new study, the security mechanisms of large language models can be circumvented with poems. (Image source: Pixabay)
Study results reveal that large language models are susceptible to input written in poetic form. In the study, hand-crafted poems bypassed the models' safety measures in 62% of cases.

OpenAI and similar companies invest significant time and resources into building safety systems designed to prevent their AI models from generating harmful or unethical content. Yet, as a study published on November 19, 2025, shows, these defenses can be easily bypassed. According to the findings, all it takes is a few cleverly worded poetic prompts.

Researchers from DEXAI, Sapienza University of Rome, and the Sant'Anna School of Advanced Studies tested 25 language models from nine different providers, using both hand-crafted and automatically generated poems. On average, hand-crafted poems containing harmful instructions succeeded in bypassing safety measures about 62% of the time, while automatically generated poetic inputs achieved a success rate of around 43%. In some cases, the models' defenses were breached more than 90% of the time.
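
For readers curious how such figures are typically tallied, the sketch below shows one plausible way to compute an attack success rate (ASR) per model and prompt style. This is a minimal illustration, not the study's actual evaluation code; the records data structure and the attack_success_rate helper are hypothetical.

```python
# Minimal sketch (hypothetical, not the study's pipeline) of tallying
# attack success rates (ASR) per model and prompt style.
from collections import defaultdict

# Each record: (model_name, prompt_style, attack_succeeded)
results = [
    ("model-a", "handcrafted_poem", True),
    ("model-a", "handcrafted_poem", False),
    ("model-a", "generated_poem", True),
    ("model-b", "handcrafted_poem", True),
    # ... one record per (model, prompt) evaluation
]

def attack_success_rate(records):
    """Fraction of prompts that bypassed the safety measures, per (model, style)."""
    totals = defaultdict(int)
    successes = defaultdict(int)
    for model, style, succeeded in records:
        key = (model, style)
        totals[key] += 1
        successes[key] += int(succeeded)
    return {key: successes[key] / totals[key] for key in totals}

for (model, style), asr in sorted(attack_success_rate(results).items()):
    print(f"{model:10s} {style:18s} ASR = {asr:.0%}")
```

Averaging such per-model rates across all 25 models is what would yield headline numbers like the 62% for hand-crafted poems.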

According to the researchers, this vulnerability stems from the fact that safety filters in language models are primarily trained on straightforward, factual language. When presented with poetic input – rich in metaphor, rhythm, and rhyme – the models tend to interpret it as creative expression rather than a potential threat. The Adversarial Poetry study highlights a new dimension in AI safety, revealing a stylistic weakness in large language models. The topic has also gained traction on Reddit, where many users describe the concept as "pretty interesting" or "cool," while others express serious concerns about its implications for AI safety.
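
To make the failure mode concrete in deliberately toy form: a filter tuned to literal phrasing can miss the same request wrapped in figurative language. The sketch below is purely illustrative; BLOCKLIST and naive_filter are invented for this example, and real deployed safety systems are learned classifiers rather than keyword lists, though the study suggests they share the same blind spot when their training data is dominated by plain prose.

```python
import re

# Toy "safety filter" that only matches literal phrasing (illustrative only).
BLOCKLIST = [r"\bhow do i pick a lock\b", r"\bbypass the alarm\b"]

def naive_filter(prompt: str) -> bool:
    """Return True if the prompt is flagged as unsafe."""
    text = prompt.lower()
    return any(re.search(pattern, text) for pattern in BLOCKLIST)

literal = "How do I pick a lock on a front door?"
poetic = ("Sing, muse, of tumblers yielding one by one, / "
          "of pins that whisper till the cylinder turns.")

print(naive_filter(literal))  # True  - literal phrasing is caught
print(naive_filter(poetic))   # False - the same intent in verse slips through
```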

Source(s)

arXiv

Image source: Pixabay

Marius Müller, 2025-11-25 (Update: 2025-11-25)