Change the model version and settings

This article explains how to change the model version and settings in the prompt builder. The model version and settings can affect the performance and behavior of the generative AI model.

Model selection

You can change the model by selecting Model at the top of the prompt builder. The dropdown menu lets you choose which generative AI model generates answers to your custom prompt.

Important

In November 2025, we migrated the o3 model to the GPT-5 reasoning model. Prompts that ran on the o3 model were automatically transitioned to the GPT-5 reasoning model, with no action required from you. You can temporarily revert to the o3 model by opening a support request for prompts. This exception lasts until December 17, 2025, after which the o3 model is permanently retired.

Using prompts in Power Apps or Power Automate consumes prompt builder credits, while using prompts in Microsoft Copilot Studio consumes Copilot Credits. Learn more in Licensing and prompt builder credits.

Available models

| Model | Licensing | Functionalities | Category |
|---|---|---|---|
| GPT-4.1 mini (default model) | Basic rate | Trained on data up to June 2024. Context window up to 128K tokens. | Mini |
| GPT-4.1 | Standard rate | Trained on data up to June 2024. Context window up to 128K tokens. | General |
| GPT-5 chat | Standard rate | Trained on data up to September 2024. Context window up to 400K tokens. | General |
| GPT-5 reasoning | Premium rate | Trained on data up to October 2024. Context window up to 400K tokens. | Deep |
| Claude Sonnet 4.5 (experimental) | Standard rate | External model from Anthropic. Context window up to 200K tokens. | General |
| Claude Opus 4.1 (experimental) | Premium rate | External model from Anthropic. Context window up to 200K tokens. | Deep |

GPT-4o mini and GPT-4o continue to be used in U.S. government regions. These models follow the same licensing rules and offer functionality comparable to GPT-4.1 mini and GPT-4.1, respectively.

Learn more about external Anthropic models in Choose an external model as the primary AI model.
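
Each model enforces a context window, so long instructions plus grounding data can exceed the limit on smaller models. The prompt builder manages context for you, but if you want a rough, offline estimate of whether your input fits, you can approximate the token count with a tokenizer library. This is a minimal sketch, assuming tiktoken's o200k_base encoding, which only approximates the tokenization the hosted models actually use:

```python
# Minimal sketch: estimate whether prompt text fits a model's context window.
# Assumption: tiktoken's o200k_base encoding only approximates the tokenization
# used by these hosted models, so treat the count as a rough estimate.
import tiktoken

# Context limits from the "Available models" table above.
CONTEXT_WINDOWS = {
    "GPT-4.1 mini": 128_000,
    "GPT-4.1": 128_000,
    "GPT-5 chat": 400_000,
    "GPT-5 reasoning": 400_000,
}

def estimate_tokens(text: str) -> int:
    """Return an approximate token count for the given text."""
    return len(tiktoken.get_encoding("o200k_base").encode(text))

def fits_in_context(text: str, model: str, reserve_for_response: int = 4_000) -> bool:
    """Check that the input, plus headroom for the response, stays under the limit."""
    return estimate_tokens(text) + reserve_for_response <= CONTEXT_WINDOWS[model]

print(fits_in_context("Summarize the attached inspection report ...", "GPT-4.1 mini"))
```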

Licensing

In agents, flows, or apps, models used by prompts consume Copilot Credits, regardless of their release stage. Learn more in Copilot Credit management.

If you have AI Builder credits, they're consumed first when prompts are used in Power Apps and Power Automate, but they aren't consumed when prompts are used in Copilot Studio. Learn more in AI Builder: Overview of licensing.

Release stage

Models have different stages of release. You can try out new, cutting-edge experimental and preview models, or choose a reliable, thoroughly tested generally available model.

| Tag | Description |
|---|---|
| Experimental | Used for experimentation, and not recommended for production use. Subject to preview terms, and can have limitations on availability and quality. |
| Preview | Eventually becomes a generally available model, but currently isn't recommended for production use. Subject to preview terms, and can have limitations on availability and quality. |
| No tag | Generally available. You can use this model for scaled and production use. In most cases, generally available models have no limitations on availability and quality, but some might still have some limitations, like regional availability. |
| Default | The default model for all agents, and usually the best performing generally available model. The default model is periodically upgraded as new, more capable models become generally available. Agents also use the default model as a fallback if a selected model is turned off or unavailable. |

Important: Anthropic Claude models are at the experimental stage, even though they don't display a tag.

Experimental and preview models might show variability in performance, response quality, latency, or message consumption, and might time out or be unavailable. They're subject to preview terms.

Categorization

The following table describes the different model categories.

| | Mini | General | Deep |
|---|---|---|---|
| Performance | Good for most tasks | Superior for complex tasks | Trained for reasoning tasks |
| Speed | Faster processing | Might be slower due to complexity | Slower, as it reasons before responding |
| Use cases | Summarization, information tasks, image and document processing | Image and document processing, advanced content creation tasks | Data analysis and reasoning tasks, image and document processing |

When you need a cost-effective solution for moderately complex tasks, have limited computational resources, or require faster processing, choose Mini models. They're ideal for projects with budget constraints and applications like customer support or efficient code analysis.

When you're dealing with highly complex, multimodal tasks that require superior performance and detailed analysis, choose General models. They're the better choice for large-scale projects where accuracy and advanced capabilities are crucial, and when you have the budget and computational resources to support them. General models are also preferable for long-term projects that might grow in complexity over time.

For projects requiring advanced reasoning capabilities, choose Deep models. They're suitable for scenarios that demand sophisticated problem-solving and critical thinking, and they excel in environments where nuanced reasoning, complex decision-making, and detailed analysis are important.

Choose among the models based on region availability, functionalities, use cases, and costs. Learn more in Feature availability by regions for prompts, and Pricing comparison table.

Model updates

| Model | Status | Retirement date | Replacement |
|---|---|---|---|
| GPT-4.1 mini | Generally available | No date yet | n/a |
| GPT-4.1 | Generally available | No date yet | n/a |
| GPT-5 chat | Generally available | No date yet | n/a |
| GPT-5 reasoning | Generally available | No date yet | n/a |
| GPT-5.1 chat | Pending availability | No date yet | n/a |
| GPT-5.1 reasoning | Pending availability | No date yet | n/a |
| Claude Sonnet 4.5 | Experimental | No date yet | n/a |
| Claude Opus 4.1 | Experimental | December 2025 | Claude Opus 4.5 |
| Claude Opus 4.5 | Pending availability | No date yet | n/a |
| o3 | Retired | December 4, 2025 | GPT-5 reasoning |
| GPT-4o mini | Retired | July 2025 | GPT-4.1 mini |
| GPT-4o | Retired | July 2025 | GPT-4.1 |
| o1 | Retired | July 2025 | o3 |

Model settings

You can access the settings panel by selecting ... > Settings at the top of the prompt builder. You can change the following settings:

  • Temperature: Lower temperatures lead to predictable results. Higher temperatures allow more diverse or creative responses.
  • Record retrieval: The number of records retrieved from your knowledge sources; the sketch after this list illustrates the idea.
  • Include links in the response: When selected, the response includes link citations for the retrieved records.
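
The record retrieval setting caps how much grounding data is pulled into the prompt. The prompt builder performs retrieval internally; the following sketch is only an illustration of the general idea, keeping the top-N most relevant records, with hypothetical record IDs and relevance scores.

```python
# Illustration only: keep the N most relevant records as grounding context.
# The records and scores below are hypothetical; the prompt builder computes
# relevance and performs retrieval for you.
def top_records(records: list[dict], record_count: int) -> list[dict]:
    """Return the record_count records with the highest relevance score."""
    return sorted(records, key=lambda r: r["score"], reverse=True)[:record_count]

records = [
    {"id": "CASE-102", "score": 0.91},
    {"id": "CASE-047", "score": 0.64},
    {"id": "CASE-233", "score": 0.88},
]
print(top_records(records, record_count=2))  # CASE-102 and CASE-233
```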

Temperature

The slider lets you set the temperature of the generative AI model, from 0 to 1. This value tells the model how deterministic (0) or creative (1) its answers should be.

Temperature is a parameter that controls the randomness of the output generated by the AI model. A lower temperature results in more predictable and conservative outputs, while a higher temperature allows for more creativity and diversity in the responses. It's a way to fine-tune the balance between randomness and determinism in the model's output.

By default, the temperature is 0, including for previously created prompts.

| Temperature | Functionality | Use in |
|---|---|---|
| 0 | More predictable and conservative outputs. Responses are more consistent. | Prompts that require high accuracy and less variability. |
| 1 | More creativity and diversity in the responses. More varied and sometimes more innovative responses. | Prompts that create new out-of-the-box content. |

Adjusting the temperature can influence the model’s output, but it doesn't guarantee a specific result. The AI's responses are inherently probabilistic and can vary with the same temperature setting.
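
Conceptually, temperature rescales the model's next-token probabilities before sampling. The following sketch isn't the prompt builder's implementation; it's a minimal illustration, with made-up scores, of why a temperature of 0 behaves deterministically while a temperature of 1 produces more varied output.

```python
import math
import random

def sample_with_temperature(logits: dict[str, float], temperature: float) -> str:
    """Pick a token from raw scores (logits), scaled by temperature. Illustrative only."""
    if temperature == 0:
        # Temperature 0 is treated as greedy decoding: always return the top token.
        return max(logits, key=logits.get)
    # Divide scores by the temperature, then apply a numerically stable softmax.
    scaled = {tok: score / temperature for tok, score in logits.items()}
    peak = max(scaled.values())
    weights = {tok: math.exp(s - peak) for tok, s in scaled.items()}
    total = sum(weights.values())
    probs = {tok: w / total for tok, w in weights.items()}
    return random.choices(list(probs), weights=list(probs.values()), k=1)[0]

made_up_scores = {"invoice": 2.0, "receipt": 1.5, "contract": 0.5}
print(sample_with_temperature(made_up_scores, 0))    # always "invoice"
print(sample_with_temperature(made_up_scores, 1.0))  # usually "invoice", sometimes the others
```

Lower temperatures sharpen the distribution toward the most likely token; higher temperatures flatten it, which is why responses become more varied.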

Note

The temperature setting isn't available for the GPT-5 reasoning model. For this reason, the slider is disabled when you select the GPT-5 reasoning model.