This article lists a selection of Microsoft Foundry Models from partners and community, along with their capabilities, deployment types, and regions of availability. Deprecated and legacy models are excluded. Most Foundry Models are provided by trusted third-party organizations, partners, research labs, and community contributors.
Depending on the kind of project you use in Microsoft Foundry, you see a different selection of models. To learn more about attributes of Foundry Models from partners and community, see Explore Foundry Models.
Note
For a list of models sold directly by Azure, see Foundry Models sold directly by Azure.
For a list of Azure OpenAI models that are supported by the Foundry Agent Service, see Models supported by Agent Service.
Anthropic
Anthropic's flagship product is Claude, a frontier AI model trusted by leading enterprises and millions of users worldwide for complex tasks including coding, agents, financial analysis, research, and office tasks. Claude delivers exceptional performance while maintaining high safety standards.
To work with Claude models in Foundry, see Deploy and use Claude models in Microsoft Foundry.
Claude models are also supported for use in the Foundry Agent Service.
| Model | Type | Capabilities | Project type |
|---|---|---|---|
| claude-haiku-4-5 (Preview) | Messages | - Input: text and image - Output: text (64,000 max tokens) - Context window: 200,000 - Languages: en, fr, ar, zh, ja, ko, es, hi - Tool calling: Yes (file search and code execution) - Response formats: Text, JSON | Foundry, Hub-based |
| claude-opus-4-1 (Preview) | Messages | - Input: text, image, and code - Output: text (32,000 max tokens) - Context window: 200,000 - Languages: en, fr, ar, zh, ja, ko, es, hi - Tool calling: Yes (file search and code execution) - Response formats: Text, JSON | Foundry, Hub-based |
| claude-sonnet-4-5 (Preview) | Messages | - Input: text, image, and code - Output: text (64,000 max tokens) - Context window: 200,000 - Languages: en, fr, ar, zh, ja, ko, es, hi - Tool calling: Yes (file search and code execution) - Response formats: Text, JSON | Foundry, Hub-based |
| claude-opus-4-5 (Preview) | Messages | - Input: text, image, and code - Output: text (64,000 max tokens) - Context window: 200,000 - Languages: en, fr, ar, zh, ja, ko, es, hi - Tool calling: Yes (file search and code execution) - Response formats: Text, JSON | Foundry, Hub-based |
See the Anthropic model collection in the Foundry portal.
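A Messages-style request for the Claude models in the table can be sketched as a plain JSON payload. This is a minimal illustration only: the helper function and payload shape are assumptions for demonstration, and the authoritative request contract is defined by the Claude deployment's API reference.

```python
import json

# Hypothetical helper: build a Messages-style request body for a Claude
# deployment. The model name and the 64,000-token output cap come from the
# table above; the payload shape is illustrative, not an exact API contract.
def build_claude_request(model: str, prompt: str, max_tokens: int = 1024) -> dict:
    if max_tokens > 64000:
        # claude-haiku-4-5 caps output at 64,000 tokens (see table)
        raise ValueError("output is capped at 64,000 tokens for this model")
    return {
        "model": model,
        "max_tokens": max_tokens,
        "messages": [{"role": "user", "content": prompt}],
    }

body = build_claude_request("claude-haiku-4-5", "Summarize the attached contract.")
print(json.dumps(body))
```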
Cohere
The Cohere family includes models optimized for a range of use cases, including chat completion, embeddings, reasoning, summarization, and question answering.
| Model | Type | Capabilities | Project type |
|---|---|---|---|
| Cohere-command-r-plus-08-2024 | chat-completion | - Input: text (131,072 tokens) - Output: text (4,096 tokens) - Languages: en, fr, es, it, de, pt-br, ja, ko, zh-cn, and ar - Tool calling: Yes - Response formats: Text, JSON | Foundry, Hub-based |
| Cohere-command-r-08-2024 | chat-completion | - Input: text (131,072 tokens) - Output: text (4,096 tokens) - Languages: en, fr, es, it, de, pt-br, ja, ko, zh-cn, and ar - Tool calling: Yes - Response formats: Text, JSON | Foundry, Hub-based |
| Cohere-embed-v3-english | embeddings | - Input: text and images (512 tokens) - Output: Vector (1,024 dim.) - Languages: en | Foundry, Hub-based |
| Cohere-embed-v3-multilingual | embeddings | - Input: text (512 tokens) - Output: Vector (1,024 dim.) - Languages: en, fr, es, it, de, pt-br, ja, ko, zh-cn, and ar | Foundry, Hub-based |
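The embeddings models above return fixed-length vectors (1,024 dimensions), which are typically compared with cosine similarity. A minimal sketch of the comparison, using short stand-in vectors in place of real model output:

```python
import math

def cosine_similarity(a: list, b: list) -> float:
    # Cosine similarity = dot(a, b) / (|a| * |b|)
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(x * x for x in b))
    return dot / (norm_a * norm_b)

# Stand-in 4-dim vectors; Cohere-embed-v3 models return 1,024 dimensions.
doc = [0.1, 0.3, 0.5, 0.1]
query = [0.1, 0.3, 0.5, 0.1]
print(round(cosine_similarity(doc, query), 3))  # identical vectors -> 1.0
```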
Cohere rerank
| Model | Type | Capabilities | API Reference | Project type |
|---|---|---|---|---|
| Cohere-rerank-v3.5 | rerank, text classification | - Input: text - Output: text - Languages: English, Chinese, French, German, Indonesian, Italian, Portuguese, Russian, Spanish, Arabic, Dutch, Hindi, Japanese, Vietnamese | Cohere's v2/rerank API | Hub-based |
For more details on pricing for Cohere rerank models, see Pricing for Cohere rerank models.
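A rerank call takes a query and a list of candidate documents and returns them ordered by relevance. The sketch below builds such a request body; the field names mirror the shape of Cohere's v2/rerank API (model, query, documents, top_n), but treat the exact contract as defined by that API reference.

```python
import json

# Illustrative rerank request builder; check the v2/rerank API reference
# for the authoritative field names and semantics.
def build_rerank_request(query: str, documents: list, top_n: int = 3) -> dict:
    return {
        "model": "Cohere-rerank-v3.5",
        "query": query,
        "documents": documents,
        # Clamp top_n so we never ask for more results than documents supplied.
        "top_n": min(top_n, len(documents)),
    }

body = build_rerank_request(
    "What is the refund policy?",
    [
        "Shipping takes 3-5 days.",
        "Refunds are issued within 30 days.",
        "We ship worldwide.",
    ],
)
print(json.dumps(body, indent=2))
```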
See the Cohere model collection in the Foundry portal.
Core42
Core42 includes autoregressive bilingual LLMs for Arabic and English with state-of-the-art capabilities in Arabic.
| Model | Type | Capabilities | Project type |
|---|---|---|---|
| jais-30b-chat | chat-completion | - Input: text (8,192 tokens) - Output: text (4,096 tokens) - Languages: en and ar - Tool calling: Yes - Response formats: Text, JSON | Foundry, Hub-based |
See this model collection in the Foundry portal.
Meta
Meta Llama models and tools are a collection of pretrained and fine-tuned generative AI text and image reasoning models. Meta models range in scale to include:
- Small language models (SLMs) like 1B and 3B Base and Instruct models for on-device and edge inferencing
- Mid-size large language models (LLMs) like 7B, 8B, and 70B Base and Instruct models
- High-performance models like Meta Llama 3.1-405B Instruct for synthetic data generation and distillation use cases.
| Model | Type | Capabilities | Project type |
|---|---|---|---|
| Llama-3.2-11B-Vision-Instruct | chat-completion | - Input: text and image (128,000 tokens) - Output: text (8,192 tokens) - Languages: en - Tool calling: No - Response formats: Text | Foundry, Hub-based |
| Llama-3.2-90B-Vision-Instruct | chat-completion | - Input: text and image (128,000 tokens) - Output: text (8,192 tokens) - Languages: en - Tool calling: No - Response formats: Text | Foundry, Hub-based |
| Meta-Llama-3.1-405B-Instruct | chat-completion | - Input: text (131,072 tokens) - Output: text (8,192 tokens) - Languages: en, de, fr, it, pt, hi, es, and th - Tool calling: No - Response formats: Text | Foundry, Hub-based |
| Meta-Llama-3.1-8B-Instruct | chat-completion | - Input: text (131,072 tokens) - Output: text (8,192 tokens) - Languages: en, de, fr, it, pt, hi, es, and th - Tool calling: No - Response formats: Text | Foundry, Hub-based |
| Llama-4-Scout-17B-16E-Instruct | chat-completion | - Input: text and image (128,000 tokens) - Output: text (8,192 tokens) - Tool calling: No - Response formats: Text | Foundry, Hub-based |
See this model collection in the Foundry portal. You can also find several Meta models available as models sold directly by Azure.
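When sizing prompts against the context windows listed above, a rough token estimate can serve as a quick guard against overruns. The 4-characters-per-token ratio below is a common heuristic for English text, not an exact count; only the model's own tokenizer gives real numbers.

```python
# Rough heuristic: ~4 characters per token for English text. This is an
# approximation; the model's tokenizer is the only authoritative counter.
def fits_context(prompt: str, context_window: int, reserved_output: int) -> bool:
    estimated_tokens = len(prompt) / 4
    return estimated_tokens + reserved_output <= context_window

# Meta-Llama-3.1-8B-Instruct: 131,072-token input, 8,192-token output (see table).
print(fits_context("hello " * 1000, context_window=131072, reserved_output=8192))
```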
Microsoft
Microsoft models include various model groups such as MAI models, Phi models, healthcare AI models, and more.
| Model | Type | Capabilities | Project type |
|---|---|---|---|
| Phi-4-mini-instruct | chat-completion | - Input: text (131,072 tokens) - Output: text (4,096 tokens) - Languages: ar, zh, cs, da, nl, en, fi, fr, de, he, hu, it, ja, ko, no, pl, pt, ru, es, sv, th, tr, and uk - Tool calling: No - Response formats: Text | Foundry, Hub-based |
| Phi-4-multimodal-instruct | chat-completion | - Input: text, images, and audio (131,072 tokens) - Output: text (4,096 tokens) - Languages: ar, zh, cs, da, nl, en, fi, fr, de, he, hu, it, ja, ko, no, pl, pt, ru, es, sv, th, tr, and uk - Tool calling: No - Response formats: Text | Foundry, Hub-based |
| Phi-4 | chat-completion | - Input: text (16,384 tokens) - Output: text (16,384 tokens) - Languages: en, ar, bn, cs, da, de, el, es, fa, fi, fr, gu, ha, he, hi, hu, id, it, ja, jv, kn, ko, ml, mr, nl, no, or, pa, pl, ps, pt, ro, ru, sv, sw, ta, te, th, tl, tr, uk, ur, vi, yo, and zh - Tool calling: No - Response formats: Text | Foundry, Hub-based |
| Phi-4-reasoning | chat-completion with reasoning content | - Input: text (32,768 tokens) - Output: text (32,768 tokens) - Languages: en - Tool calling: No - Response formats: Text | Foundry, Hub-based |
| Phi-4-mini-reasoning | chat-completion with reasoning content | - Input: text (128,000 tokens) - Output: text (128,000 tokens) - Languages: en - Tool calling: No - Response formats: Text | Foundry, Hub-based |
See the Microsoft model collection in the Foundry portal. Microsoft models are also available as models sold directly by Azure.
Mistral AI
Mistral AI offers two categories of models: premium models such as Mistral Large 2411 and Ministral 3B, and open models such as Mistral Nemo.
| Model | Type | Capabilities | Project type |
|---|---|---|---|
| Codestral-2501 | chat-completion | - Input: text (262,144 tokens) - Output: text (4,096 tokens) - Languages: en - Tool calling: No - Response formats: Text | Foundry, Hub-based |
| Ministral-3B | chat-completion | - Input: text (131,072 tokens) - Output: text (4,096 tokens) - Languages: fr, de, es, it, and en - Tool calling: Yes - Response formats: Text, JSON | Foundry, Hub-based |
| Mistral-Nemo | chat-completion | - Input: text (131,072 tokens) - Output: text (4,096 tokens) - Languages: en, fr, de, es, it, zh, ja, ko, pt, nl, and pl - Tool calling: Yes - Response formats: Text, JSON | Foundry, Hub-based |
| Mistral-small-2503 | chat-completion | - Input: text (32,768 tokens) - Output: text (4,096 tokens) - Languages: fr, de, es, it, and en - Tool calling: Yes - Response formats: Text, JSON | Foundry, Hub-based |
| Mistral-medium-2505 | chat-completion | - Input: text (128,000 tokens), image - Output: text (128,000 tokens) - Tool calling: No - Response formats: Text, JSON | Foundry, Hub-based |
| Mistral-Large-2411 | chat-completion | - Input: text (128,000 tokens) - Output: text (4,096 tokens) - Languages: en, fr, de, es, it, zh, ja, ko, pt, nl, and pl - Tool calling: Yes - Response formats: Text, JSON | Foundry, Hub-based |
| Mistral-OCR-2503 | image to text | - Input: image or PDF pages (1,000 pages, max 50 MB PDF file) - Output: text - Tool calling: No - Response formats: Text, JSON, Markdown | Hub-based |
| mistralai-Mistral-7B-Instruct-v01 | chat-completion | - Input: text - Output: text - Languages: en - Response formats: Text | Hub-based |
| mistralai-Mistral-7B-Instruct-v0-2 | chat-completion | - Input: text - Output: text - Languages: en - Response formats: Text | Hub-based |
| mistralai-Mixtral-8x7B-Instruct-v01 | chat-completion | - Input: text - Output: text - Languages: en - Response formats: Text | Hub-based |
| mistralai-Mixtral-8x22B-Instruct-v0-1 | chat-completion | - Input: text (64,000 tokens) - Output: text (4,096 tokens) - Languages: fr, it, de, es, en - Response formats: Text | Hub-based |
See this model collection in the Foundry portal. Mistral models are also available as models sold directly by Azure.
Nixtla
Nixtla's TimeGEN-1 is a generative pretrained forecasting and anomaly detection model for time series data. TimeGEN-1 produces accurate forecasts for new time series without training, using only historical values and exogenous covariates as inputs.
To perform inferencing, TimeGEN-1 requires you to use Nixtla's custom inference API.
| Model | Type | Capabilities | Inference API | Project type |
|---|---|---|---|---|
| TimeGEN-1 | Forecasting | - Input: Time series data as JSON or dataframes (with support for multivariate input) - Output: Time series data as JSON - Tool calling: No - Response formats: JSON | Forecast client to interact with Nixtla's API | Hub-based |
For more details on pricing for Nixtla models, see Nixtla.
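TimeGEN-1 consumes historical values as time series data in JSON. The sketch below shapes (timestamp, value) pairs into such a payload; the key names and the `fh` horizon field are illustrative assumptions, and Nixtla's Forecast client defines the real interface.

```python
import json

# Hypothetical shaping step: turn (timestamp, value) pairs into the kind of
# JSON time series a forecasting API consumes. Key names are illustrative.
def to_series_payload(points: list, horizon: int) -> dict:
    return {
        "timestamp": [t for t, _ in points],
        "value": [v for _, v in points],
        "fh": horizon,  # forecast horizon: number of future steps to predict
    }

history = [("2024-01-01", 100.0), ("2024-01-02", 104.5), ("2024-01-03", 98.2)]
payload = to_series_payload(history, horizon=2)
print(json.dumps(payload))
```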
NTT Data
tsuzumi is an autoregressive, language-optimized transformer. The tuned versions use supervised fine-tuning (SFT). tsuzumi handles both Japanese and English with high efficiency.
| Model | Type | Capabilities | Project type |
|---|---|---|---|
| tsuzumi-7b | chat-completion | - Input: text (8,192 tokens) - Output: text (8,192 tokens) - Languages: en and jp - Tool calling: No - Response formats: Text | Hub-based |
Stability AI
The Stability AI collection of image generation models includes Stable Image Core, Stable Image Ultra, and Stable Diffusion 3.5 Large. Stable Diffusion 3.5 Large accepts both image and text input.
| Model | Type | Capabilities | Project type |
|---|---|---|---|
| Stable Diffusion 3.5 Large | Image generation | - Input: text and image (1,000 tokens and 1 image) - Output: One image - Tool calling: No - Response formats: Image (PNG and JPG) | Foundry, Hub-based |
| Stable Image Core | Image generation | - Input: text (1,000 tokens) - Output: One image - Tool calling: No - Response formats: Image (PNG and JPG) | Foundry, Hub-based |
| Stable Image Ultra | Image generation | - Input: text (1,000 tokens) - Output: One image - Tool calling: No - Response formats: Image (PNG and JPG) | Foundry, Hub-based |
See this model collection in the Foundry portal.
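An image generation request to the models above is text-driven (optionally with an input image for Stable Diffusion 3.5 Large). A minimal sketch of a request body that enforces the limits from the table; the field names are placeholders for illustration, not the exact API contract:

```python
import json

MAX_PROMPT_TOKENS = 1000  # prompt limit from the table above; word count is used as a rough proxy

# Illustrative request builder; field names are assumptions, not the real API shape.
def build_image_request(prompt: str, output_format: str = "png") -> dict:
    if output_format not in ("png", "jpg"):
        raise ValueError("supported response formats are PNG and JPG")
    if len(prompt.split()) > MAX_PROMPT_TOKENS:
        raise ValueError("prompt exceeds the 1,000-token input limit")
    return {"prompt": prompt, "output_format": output_format}

body = build_image_request("A watercolor lighthouse at dusk", "jpg")
print(json.dumps(body))
```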
Open and custom models
The model catalog offers a larger selection of models from a wider range of providers. These models don't support standard deployment to Microsoft Foundry resources, where models are provided as APIs. Instead, to deploy them, you might need to host them on your own infrastructure, create an AI hub, and provide the underlying compute quota.
Furthermore, these models can be open-access or IP protected. In both cases, you have to deploy them to managed compute offerings in Foundry. To get started, see How-to: Deploy to Managed compute.