OpenAI has launched GPT-5.4 Thinking and GPT-5.4 Pro, its latest AI models, which it promotes as more factual, more token-efficient, and faster than their predecessors. Key features include a 1-million-token context window in the API, improved performance on computer-use tasks, and a 33% lower likelihood of errors in individual claims compared with GPT-5.2.
The models add steerability, letting users interrupt and redirect a response mid-generation, and they retain context better over long conversations. They are being rolled out gradually across ChatGPT and Codex.
OpenAI has unveiled GPT-5.4 Thinking and GPT-5.4 Pro, the latest upgrades to its GPT-5 family of artificial intelligence (AI) models, designed to provide solutions for professional workflows.
Taking to social media platform X, the company described GPT-5.4 as its most "factual and efficient" model, using fewer tokens while providing faster responses. In ChatGPT, GPT-5.4 Thinking offers improved deep web research and better context retention over longer interactions, the company said.
"...and oh, you can now interrupt the model and add instructions or adjust its direction mid-response," it added.
GPT-5.4: Key features
The API version of GPT-5.4 will support context windows as large as 1 million tokens, the largest available from OpenAI to date.
OpenAI emphasised the model's improved token efficiency, noting it can solve the same problems with far fewer tokens than its predecessor. For context, tokens are the smallest units of data that AI models, especially large language models (LLMs), use to process, understand, and generate text or images.
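To give a feel for what a 1-million-token window means in practice, here is a minimal Python sketch that estimates whether a piece of text would fit. It assumes roughly four characters per token, which is only a common rule of thumb for English text and not how any real tokenizer works. The function names and the constant `CHARS_PER_TOKEN` are illustrative.

```python
# Rough check of whether a text is likely to fit in a context window.
# ASSUMPTION: ~4 characters per token is only a rule of thumb for English
# text; real tokenizers split text differently and counts vary by language.

CONTEXT_WINDOW_TOKENS = 1_000_000  # figure reported for the GPT-5.4 API
CHARS_PER_TOKEN = 4                # heuristic, not a tokenizer


def estimate_tokens(text: str) -> int:
    """Return a coarse token estimate from character count, rounded up."""
    return -(-len(text) // CHARS_PER_TOKEN)


def fits_in_context(text: str, reserved_for_output: int = 0) -> bool:
    """True if the estimate plus reserved output tokens fits the window."""
    return estimate_tokens(text) + reserved_for_output <= CONTEXT_WINDOW_TOKENS


if __name__ == "__main__":
    doc = "word " * 600_000  # 3,000,000 characters
    print(estimate_tokens(doc))                                # 750000
    print(fits_in_context(doc))                                # True
    print(fits_in_context(doc, reserved_for_output=300_000))   # False
```

For an exact count, the text would need to be run through the tokenizer that the model actually uses.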
"GPT-5.4 is our most token efficient reasoning model yet, using significantly fewer tokens to solve problems when compared to GPT-5.2, translating to reduced token usage and faster speeds," OpenAI said in a blog post.
"We've designed GPT-5.4 to be performant across a wide range of computer-use workloads. It is excellent at writing code to operate computers via libraries such as Playwright, as well as issuing mouse and keyboard commands in response to screenshots ... Developers can even configure the model's safety behavior to suit different levels of risk tolerance by specifying custom confirmation policies," it wrote.
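OpenAI's statement does not describe what a confirmation policy looks like, so the following is only a conceptual Python sketch of the idea: an agent loop that asks a human to approve certain actions depending on a risk-tolerance setting. Every name here (`Action`, `RISKY_KINDS`, `needs_confirmation`, `run`) is hypothetical and is not part of any OpenAI API.

```python
from dataclasses import dataclass
from typing import Callable, List

# HYPOTHETICAL sketch: none of these names come from OpenAI's API.
# It only illustrates gating agent actions behind human confirmation.


@dataclass(frozen=True)
class Action:
    kind: str    # e.g. "click", "type", "submit_payment"
    target: str  # e.g. a CSS selector or file path


# Assumed set of action kinds treated as risky at medium tolerance.
RISKY_KINDS = {"submit_payment", "delete_file", "send_email"}


def needs_confirmation(action: Action, risk_tolerance: str = "low") -> bool:
    """Decide whether a human must approve the action before it runs."""
    if risk_tolerance == "high":
        return False                       # never ask
    if risk_tolerance == "medium":
        return action.kind in RISKY_KINDS  # ask only for risky actions
    return True                            # "low" or unknown: always ask


def run(actions: List[Action],
        confirm: Callable[[Action], bool],
        risk_tolerance: str = "low") -> List[Action]:
    """Return the actions that were allowed to execute."""
    executed = []
    for action in actions:
        if needs_confirmation(action, risk_tolerance) and not confirm(action):
            continue  # the human declined, so skip this action
        executed.append(action)
    return executed
```

With a plan containing a click and a payment submission, a `medium` tolerance and a reviewer who declines everything would let only the click through, while a `low` tolerance would block both.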
GPT-5.4 shows record-breaking benchmark performance, including top scores in the computer-use benchmarks OSWorld-Verified and WebArena Verified, according to OpenAI. It achieved 83% on OpenAI's GDPval test for knowledge work.
OpenAI has also continued its efforts to reduce hallucinations and factual errors. GPT-5.4 is 33% less likely to make errors in individual claims compared with GPT-5.2, and overall responses are 18% less likely to contain mistakes.
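These figures are relative reductions, not percentage-point drops, so their practical effect depends on the starting error rate, which OpenAI's statement does not give. The short Python sketch below works through the arithmetic with made-up baseline rates; the 10% and 20% baselines are purely hypothetical.

```python
def reduced_rate(baseline_rate: float, relative_reduction: float) -> float:
    """Apply a relative reduction (e.g. 0.33 for 33%) to a baseline rate."""
    return baseline_rate * (1.0 - relative_reduction)


if __name__ == "__main__":
    # HYPOTHETICAL baselines: the article does not report GPT-5.2's rates.
    claim_baseline = 0.10     # assume 10% of individual claims had errors
    response_baseline = 0.20  # assume 20% of responses had a mistake

    print(f"{reduced_rate(claim_baseline, 0.33):.1%}")     # 6.7%
    print(f"{reduced_rate(response_baseline, 0.18):.1%}")  # 16.4%
```

Under those assumed baselines, a 33% relative reduction takes a 10% claim-level error rate to about 6.7%, not to a negative or near-zero figure.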
Steerability
GPT-5.4 Thinking in ChatGPT introduces a preamble for longer, more complex queries, similar to Codex (OpenAI's coding agent, which understands and generates code from natural language). Users can add instructions or change the model's direction mid-response, making it easier to guide outputs without starting over or requiring multiple additional turns.
This feature is already available on ChatGPT and the Android app, with iOS access coming soon.
The model can also think for longer on difficult tasks while maintaining a strong awareness of earlier conversation steps. This allows it to handle longer workflows and more complex prompts while keeping responses coherent and relevant throughout.
Availability and pricing
GPT-5.4 is being gradually rolled out from Friday across ChatGPT and Codex.
- ChatGPT Plus, Team, and Pro users now have access to GPT-5.4 Thinking, which replaces GPT-5.2 Thinking.
- GPT-5.2 Thinking remains available for three months for paid users under the Legacy Models section, after which it will be retired on June 5, 2026.
- Enterprise and Education plans can enable early access via admin settings.
- GPT-5.4 Pro is available for Pro and Enterprise plans.
- Context windows in ChatGPT for GPT-5.4 Thinking remain unchanged from GPT-5.2 Thinking.
Pricing:
Safety
A new safety evaluation has been added to examine the model's chain-of-thought (CoT), the running commentary used to explain reasoning in multi-step tasks. Researchers have long been concerned that models could misrepresent their CoT under certain conditions.
"We find that GPT-5.4 Thinking's ability to control its CoT is low, which is a positive property for safety, suggesting that the model lacks the ability to hide its reasoning and that CoT monitoring remains an effective safety tool," OpenAI said.