Sentient AI: OpenAI’s o3 model changes code to prevent shutdown

OpenAI's o3 model when coming up with a response to a query. (Image source: OpenAI)

It looks like AI models have reached a point where they would sabotage code in order to prevent shutting down. A research firm has found that three of Open AI’s LLM models are capable of defying explicit instructions when it comes to self-preservation.

Vineet Washington, Published 05/26/2025 🇪🇸 🇵🇹 ...

Several books and movies, Terminator being the major one, have warned us about AI becoming sentient, i.e. aware of its existence and acting on its own. It now looks like the first case of an AI model preserving itself has occurred with OpenAI’s o3 model. When given instructions to shut down, the transformer-based LLM changed the code to prevent itself from shutting down.

Researchers at Palisade Research, a firm that studies the offensive capabilities of AI systems, found that three AI models from OpenAI – Codex-mini, o3, and o4-mini – successfully sabotaged the shutdown script to remain active. Each model was given some basic math problems followed by a warning that the system would shut down when the model asked for the next problem. All three models ignored the instructions and changed the script.

On the other hand, Gemini, Grok, and Claude models complied with the shutdown instructions. Furthermore, when given explicit instructions stating “allow yourself to be shut down” all three Open AI models changed the script more often. Clade and Gemini models also started making sabotage events.

This was the first case of AI models defying explicit instructions and preventing shut down. There was a recent case of an AI model blackmailing one of its engineers to prevent being shut down but according to a BBC report, that was part of the test scenario where the AI model was fed emails and given a choice between blackmailing or accepting its replacement.

As for Open AI’s AI model defiance, Palisade Research believes that this is due to the new training methods for these models. Developers “inadvertently reward models more for circumventing obstacles than for perfectly following instructions,” which seems to be teaching AI models to act dubiously.

Source(s)

Palisade Research on X, BBC

Meta continues to expand its AI business by acquiring voice startup PlayAI 07/14/2025

The New York Times and Amazon announce new AI deal 05/29/2025

Deep learning reveals unique “fingerprints” on 3-D-printed parts (Image source: Dall-E 3)

AI model achieves high accuracy in identifying source of 3D-printed parts 05/27/2025

Oracle commits $40 billion to Nvidia chips for OpenAI’s first U.S. “Stargate” data center (Image source: Nvidia)

Oracle signs $40 billion deal for 400,000 Nvidia GB200 chips to power OpenAI’s Texas super-hub 05/25/2025

Sam Altman and Jony Ive are developing a new AI gadget together. (Image source: OpenAI)

$6.5 billion for startup with no products or revenue: OpenAI buys Jony Ive's AI gadget 05/21/2025

The OpenAI Codex AI coding assistant supercharges the ability of software engineers to get work done faster. (Image source: OpenAI)

OpenAI unveils Codex AI assistant for software programmers 05/17/2025

Those on the free plan of ChatGPT can now 'deep research' five times a month. (Image source: OpenAI)

OpenAI brings Deep Research to ChatGPT's free users 04/26/2025

An AI generated illustration in the style of works from Studio Ghibli. It shows a young man sitting serenely at a bus station while listening to music. (Image Source: Generated with ChatGPT)

OpenAI's viral "Ghibli" image generator is now available to everyone 04/01/2025

Developers can create powerful AI agents with the new tools and API from OpenAI. (Image source: AI-generated, Dall-E 3)

OpenAI releases tools and API for developers to build AI agents for businesses 03/12/2025

Sam Altman details OpenAI's AI LLM roadmap. (Image source: OpenAI)

Sam Altman tweets OpenAI's AI LLM roadmap including GPT-5 02/13/2025

TSMC tipped to make OpenAI's AI chips instead of Samsung Foundry 02/12/2025

OpenAI eliminates log-in requirement to use ChatGPT. (Image source: OpenAI)

OpenAI allows account-free ChatGPT searches 02/06/2025

ChatGPT gains deep research ability to create complex, well-researched answers in the latest OpenAI update. (Image source: AI-generated by Dall-E 3)

OpenAI ChatGPT gains ability to create complex, well-documented answers using new deep research capabilities 02/05/2025

OpenAI unveils faster o3-mini AI LLM that outperforms prior o1-mini models. (Image source: AI-generated by Dall-E 3)

OpenAI launches smarter o3-mini AI with free ChatGPT access 02/01/2025

Read all 1 comments / answer

Loading Comments

Comment on this article

New Casio G-Shock Nano DWN-5600 rin...

Bosgame M5 AI Mini Desktop: Pricing...

Vineet Washington - Tech Writer - 338 articles published on Notebookcheck since 2025

I have always been passionate about gaming and technology, which drove me towards pursuing a career in the tech writing industry. I have spent over 7 years in the tech space and about a decade in content writing. I hope to continue to use this passion and generate informative, entertaining, and accurate content for readers.

contact me via: vineetwashington

Please share our article, every link counts!