Deploy and use Claude models in Microsoft Foundry (preview)

This article explains how to deploy and use the latest Claude models in Foundry, including Claude Opus 4.5, Claude Sonnet 4.5, Claude Haiku 4.5, and Claude Opus 4.1. Anthropic's flagship product is Claude, a frontier AI model useful for complex tasks such as coding, agents, financial analysis, research, and office tasks. Claude delivers exceptional performance while maintaining high safety standards.

Available Claude models

Foundry supports Claude Opus 4.5, Claude Sonnet 4.5, Claude Haiku 4.5, and Claude Opus 4.1 models through global standard deployment. These models have key capabilities that include:

Extended thinking: Extended thinking gives Claude enhanced reasoning capabilities for complex tasks.
Image and text input: Strong vision capabilities that enable the models to process images and return text outputs for analyzing and understanding charts, graphs, technical diagrams, reports, and other visual assets.
Code generation: Advanced thinking that includes code generation, analysis, and debugging for Claude Sonnet 4.5 and Claude Opus 4.1.

For more details about the model capabilities, see capabilities of Claude models.
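As a rough illustration of image and text input, the following sketch builds a user message that pairs a base64-encoded image with a text question. The helper name `build_image_message` and the placeholder bytes are hypothetical; the content-block shape follows the Anthropic Messages API.

```python
import base64


def build_image_message(image_bytes: bytes, media_type: str, question: str) -> dict:
    """Return a user message holding one image block followed by one text block."""
    encoded = base64.b64encode(image_bytes).decode("ascii")
    return {
        "role": "user",
        "content": [
            {
                "type": "image",
                "source": {"type": "base64", "media_type": media_type, "data": encoded},
            },
            {"type": "text", "text": question},
        ],
    }


# Placeholder bytes stand in for a real image file read from disk (assumption).
message = build_image_message(b"placeholder-image-bytes", "image/png", "What does this chart show?")
print(message["content"][1]["text"])
```

The resulting dictionary can be passed as one entry of the `messages` list in a Messages API request.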

Claude Opus 4.5 (preview)

Claude Opus 4.5 is Anthropic's most intelligent model, and an industry leader across coding, agents, computer use, and enterprise workflows. With a 200K token context window and 64K max output, Opus 4.5 is ideal for production code, sophisticated agents, office tasks, financial analysis, cybersecurity, and computer use.

Claude Sonnet 4.5 (preview)

Claude Sonnet 4.5 is a highly capable model designed for building real-world agents and handling complex, long-horizon tasks. It offers a strong balance of speed and cost for high-volume use cases. Sonnet 4.5 also provides advanced accuracy for computer use, enabling developers to direct Claude to use computers the way people do.

Claude Haiku 4.5 (preview)

Claude Haiku 4.5 delivers near-frontier performance for a wide range of use cases. It stands out as one of the best coding and agent models, with the right speed and cost to power free products and scaled sub-agents.

Claude Opus 4.1 (preview)

Claude Opus 4.1 is an industry leader for coding. It delivers sustained performance on long-running tasks that require focused effort and thousands of steps, significantly expanding what AI agents can solve.

Prerequisites

An Azure subscription with a valid payment method. If you don't have an Azure subscription, create a paid Azure account to begin.
Access to Microsoft Foundry with appropriate permissions to create and manage resources.
A Microsoft Foundry project created in one of the supported regions: East US2 and Sweden Central.
Foundry Models from partners and community require access to Azure Marketplace to create subscriptions. Ensure you have the permissions required to subscribe to model offerings.

Deploy Claude models

Claude models in Foundry are available for global standard deployment. To deploy a Claude model, follow the instructions in Add and configure models to Microsoft Foundry Models.

After deployment, you can use the Foundry playground to interactively test the model.

Work with Claude models

Once deployed, you have some options for interacting with Claude models to generate text responses:

Use the Anthropic SDKs and the following Claude APIs:
- Messages API to send a structured list of input messages with text and/or image content, and the model generates the next message in the conversation.
- Token Count API to count the number of tokens in a message.
- Files API to upload and manage files to use with the Claude API without having to re-upload content with each request.
- Skills API to create custom skills for Claude AI.
Use the Responses API to generate text responses with Claude models in Microsoft Foundry. For multi-language code samples that demonstrate this usage, see Use Claude Models with OpenAI Responses API in Microsoft Foundry.

Use the Messages API to work with Claude models

The following examples show how to use the Messages API to send requests to Claude Sonnet 4.5, by using both Microsoft Entra ID authentication and API key authentication methods. To work with your deployed model, you need these items:

Your base URL, which is of the form https://<resource name>.services.ai.azure.com/anthropic.
Your target URI from your deployment details, which is of the form https://<resource name>.services.ai.azure.com/anthropic/v1/messages.
Microsoft Entra ID for keyless authentication, or your deployment's API key for API key authentication.
Deployment name you chose during deployment creation. This name can be different from the model ID.
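The two URL forms above differ only in the `/v1/messages` suffix. A minimal sketch of deriving both from a resource name follows; the helper name `foundry_endpoints` and the resource name `my-resource` are hypothetical.

```python
def foundry_endpoints(resource_name: str) -> dict:
    """Derive the base URL and the Messages target URI for a Foundry resource."""
    if not resource_name or "/" in resource_name or "." in resource_name:
        raise ValueError("expected a bare resource name, not a URL")
    base_url = f"https://{resource_name}.services.ai.azure.com/anthropic"
    return {"base_url": base_url, "target_uri": f"{base_url}/v1/messages"}


endpoints = foundry_endpoints("my-resource")  # hypothetical resource name
print(endpoints["base_url"])
print(endpoints["target_uri"])
```

The SDK samples take the base URL, while raw HTTP calls such as cURL take the target URI.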

Use Microsoft Entra ID authentication (Python)

For Messages API endpoints, use your base URL with Microsoft Entra ID authentication.

Install the Azure Identity client library: You need to install this library to use the DefaultAzureCredential. Authorization is easiest when you use DefaultAzureCredential, as it finds the best credential to use in its running environment.
```
pip install azure-identity
```
Set the values of the client ID, tenant ID, and client secret of the Microsoft Entra ID application as environment variables: AZURE_CLIENT_ID, AZURE_TENANT_ID, AZURE_CLIENT_SECRET.
```
export AZURE_CLIENT_ID="<AZURE_CLIENT_ID>"
export AZURE_TENANT_ID="<AZURE_TENANT_ID>"
export AZURE_CLIENT_SECRET="<AZURE_CLIENT_SECRET>"
```
Install dependencies: Install the Anthropic SDK by using pip (requires: Python >=3.8).
```
pip install -U "anthropic"
```

Run a basic code sample: This sample completes the following tasks:

Creates a client with the Anthropic SDK, using Microsoft Entra ID authentication.
Makes a basic call to the Messages API. The call is synchronous.

```
from anthropic import AnthropicFoundry
from azure.identity import DefaultAzureCredential, get_bearer_token_provider

base_url = "https://<resource-name>.services.ai.azure.com/anthropic"  # Your base URL. Replace <resource-name> with your resource name
deployment_name = "claude-sonnet-4-5"  # Replace with your deployment name

# Create token provider for Entra ID authentication
token_provider = get_bearer_token_provider(
    DefaultAzureCredential(), "https://cognitiveservices.azure.com/.default"
)

# Create client with Entra ID authentication
client = AnthropicFoundry(
    azure_ad_token_provider=token_provider,
    base_url=base_url,
)

# Send request
message = client.messages.create(
    model=deployment_name,
    messages=[
        {"role": "user", "content": "What is the capital/major city of France?"}
    ],
    max_tokens=1024,
)

# message.content is a list of content blocks, not a plain string
print(message.content)
```
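The sample prints `message.content`, which the SDK returns as a list of content blocks rather than a single string. A minimal sketch of pulling out just the text follows. The helper name `extract_text` is hypothetical, and the sample objects are stand-ins that only mimic the `type` and `text` attributes of real text blocks.

```python
from types import SimpleNamespace


def extract_text(content_blocks) -> str:
    """Join the text of every block whose type is "text"; skip other block types."""
    return "".join(
        block.text for block in content_blocks if getattr(block, "type", None) == "text"
    )


# Stand-in objects shaped like SDK content blocks (assumption for illustration).
sample_content = [
    SimpleNamespace(type="text", text="The capital of France is "),
    SimpleNamespace(type="text", text="Paris."),
]
print(extract_text(sample_content))
```

With a real response, the call would be `extract_text(message.content)`.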

Use API key authentication (Python)

For Messages API endpoints, use your base URL and API key to authenticate against the service.

Install dependencies: Install the Anthropic SDK by using pip (requires: Python >=3.8):
```
pip install -U "anthropic"
```

Run a basic code sample: This sample completes the following tasks:

Creates a client with the Anthropic SDK by passing your API key to the SDK's configuration. This authentication method lets you interact seamlessly with the service.
Makes a basic call to the Messages API. The call is synchronous.

```
from anthropic import AnthropicFoundry

base_url = "https://<resource-name>.services.ai.azure.com/anthropic"  # Your base URL. Replace <resource-name> with your resource name
deployment_name = "claude-sonnet-4-5"  # Replace with your deployment name
api_key = "YOUR_API_KEY"  # Replace YOUR_API_KEY with your API key

# Create client with API key authentication
client = AnthropicFoundry(
    api_key=api_key,
    base_url=base_url,
)

# Send request
message = client.messages.create(
    model=deployment_name,
    messages=[
        {"role": "user", "content": "What is the capital/major city of France?"}
    ],
    max_tokens=1024,
)

print(message.content)
```

Use Microsoft Entra ID authentication (JavaScript)

For Messages API endpoints, use your base URL with Microsoft Entra ID authentication.

Install the Azure Identity client library: Install the @azure/identity package to use the DefaultAzureCredential. Authorization is easiest when you use DefaultAzureCredential, as it finds the best credential to use in its running environment.
```
npm install @azure/identity
```
Set the values of the client ID, tenant ID, and client secret of the Microsoft Entra ID application as environment variables: AZURE_CLIENT_ID, AZURE_TENANT_ID, AZURE_CLIENT_SECRET.
```
export AZURE_CLIENT_ID="<AZURE_CLIENT_ID>"
export AZURE_TENANT_ID="<AZURE_TENANT_ID>"
export AZURE_CLIENT_SECRET="<AZURE_CLIENT_SECRET>"
```
Install dependencies
1. Install Node.js 20 LTS or a later non-EOL version.
2. Copy the following lines of text and save them as a file package.json inside your folder.
```
{
  "type": "module",
  "dependencies": {
    "@anthropic-ai/foundry-sdk": "latest",
    "@anthropic-ai/sdk": "latest",
    "@azure/identity": "latest"
  }
}
```
  Note
  
  @azure/core-sse is only needed when you stream the response.
3. Open a terminal window in this folder and run npm install.
4. For each of the code snippets that follow, copy the content into a file sample.js and run with node sample.js.

Run a basic code sample. This sample completes the following tasks:

Creates a client with the Anthropic SDK, using Microsoft Entra ID authentication.
Makes a basic call to the Messages API and awaits the response.

```
import AnthropicFoundry from '@anthropic-ai/foundry-sdk';
import { getBearerTokenProvider, DefaultAzureCredential } from "@azure/identity";

const baseURL = "https://<resource-name>.services.ai.azure.com/anthropic"; // Your base URL. Replace <resource-name> with your resource name
const deploymentName = "claude-sonnet-4-5"; // Replace with your deployment name

// Create token provider for Entra ID authentication
const tokenProvider = getBearerTokenProvider(
    new DefaultAzureCredential(),
    'https://cognitiveservices.azure.com/.default');

// Create client with Entra ID authentication
const client = new AnthropicFoundry({
    azureADTokenProvider: tokenProvider,
    baseURL: baseURL,
    apiVersion: "2023-06-01"
});

// Send request
const message = await client.messages.create({
    model: deploymentName,
    messages: [{ role: "user", content: "What is the capital/major city of France?" }],
    max_tokens: 1024,
});
console.log(message);
```

Use API key authentication (JavaScript)

For Messages API endpoints, use your base URL and API key to authenticate against the service.

Install dependencies
1. Install Node.js 20 LTS or a later non-EOL version.
2. Copy the following lines of text and save them as a file package.json inside your folder.
```
{
  "type": "module",
  "dependencies": {
    "@anthropic-ai/foundry-sdk": "latest",
    "@anthropic-ai/sdk": "latest"
  }
}
```
  Note
  
  @azure/core-sse is only needed when you stream the response.
3. Open a terminal window in this folder and run npm install.
4. For each of the code snippets that follow, copy the content into a file sample.js and run with node sample.js.

Run a basic code sample. This sample completes the following tasks:

Creates a client with the Anthropic SDK by passing your API key to the SDK's configuration. This authentication method lets you interact seamlessly with the service.
Makes a basic call to the Messages API and awaits the response.

```
import AnthropicFoundry from '@anthropic-ai/foundry-sdk';

const baseURL = "https://<resource-name>.services.ai.azure.com/anthropic"; // Your base URL. Replace <resource-name> with your resource name
const deploymentName = "claude-sonnet-4-5"; // Replace with your deployment name
const apiKey = "<your-api-key>"; // Your API key

// Create client with API key
const client = new AnthropicFoundry({
    apiKey: apiKey,
    baseURL: baseURL,
    apiVersion: "2023-06-01"
});

// Send request
const message = await client.messages.create({
    model: deploymentName,
    messages: [{ role: "user", content: "What is the capital/major city of France?" }],
    max_tokens: 1024,
});
console.log(message);
```

For a list of supported runtimes, see Requirements to use Anthropic TypeScript API Library.

Use Microsoft Entra ID authentication (REST API)

For Messages API endpoints, use the deployed model's endpoint URI https://<resource-name>.services.ai.azure.com/anthropic/v1/messages with Microsoft Entra ID authentication.

If you configure the resource with Microsoft Entra ID support, pass your token in the Authorization header with the format Bearer $AZURE_AUTH_TOKEN. Use scope https://cognitiveservices.azure.com/.default. Using Microsoft Entra ID might require additional configuration in your resource to grant access. For more information, see configure authentication with Microsoft Entra ID.

Export your Microsoft Entra ID token to an environment variable:

If you're using bash:
```
export AZURE_AUTH_TOKEN="<your-entra-id-key>"
```
If you're in PowerShell:
```
$Env:AZURE_AUTH_TOKEN = "<your-entra-id-key>"
```
If you're using Windows command prompt:
```
set AZURE_AUTH_TOKEN=<your-entra-id-key>
```

Run the following cURL command. For cURL, use your deployment's target URI https://<resource-name>.services.ai.azure.com/anthropic/v1/messages.

```
curl -X POST "https://<resource-name>.services.ai.azure.com/anthropic/v1/messages" \
  -H "Content-Type: application/json" \
  -H "Authorization: Bearer $AZURE_AUTH_TOKEN" \
  -H "anthropic-version: 2023-06-01" \
  -d '{
    "system": "You are a helpful assistant.",
    "messages": [
      {
        "role": "user", "content": "What are 3 things to visit in Seattle?"
      }
    ],
    "max_tokens": 1000,
    "temperature": 0.7,
    "model": "claude-sonnet-4-5"
    }'
```

Use API key authentication (REST API)

For Messages API endpoints, use the deployed model's endpoint URI https://<resource-name>.services.ai.azure.com/anthropic/v1/messages and API key to authenticate against the service.

Export your API key to an environment variable:

If you're using bash:
```
export AZURE_API_KEY="<your-api-key>"
```
If you're in PowerShell:
```
$Env:AZURE_API_KEY = "<your-api-key>"
```
If you're using Windows command prompt:
```
set AZURE_API_KEY=<your-api-key>
```

Run the following cURL command:

```
curl -X POST "https://<resource-name>.services.ai.azure.com/anthropic/v1/messages" \
  -H "Content-Type: application/json" \
  -H "x-api-key: $AZURE_API_KEY" \
  -H "anthropic-version: 2023-06-01" \
  -d '{
    "system": "You are a helpful assistant.",
    "messages": [
      {
        "role": "user", "content": "What are 3 things to visit in Seattle?"
      }
    ],
    "max_tokens": 1000,
    "temperature": 0.7,
    "model": "claude-sonnet-4-5"
    }'
```
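For readers who want to issue the same REST call without cURL or an SDK, the following is a minimal sketch that uses only the Python standard library. It builds the request but doesn't send it. The helper name `build_messages_request` and the sample argument values are hypothetical; the headers and URL form are the ones shown in this section. The Anthropic Messages API takes the system prompt as a top-level `system` field, with only `user` and `assistant` roles in `messages`.

```python
import json
import urllib.request


def build_messages_request(resource_name, api_key, deployment_name, user_text, system=None):
    """Build (but do not send) a POST request for the Messages endpoint."""
    url = f"https://{resource_name}.services.ai.azure.com/anthropic/v1/messages"
    body = {
        "model": deployment_name,
        "max_tokens": 1000,
        "messages": [{"role": "user", "content": user_text}],
    }
    if system is not None:
        body["system"] = system  # the system prompt is a top-level field
    headers = {
        "Content-Type": "application/json",
        "x-api-key": api_key,
        "anthropic-version": "2023-06-01",
    }
    return urllib.request.Request(
        url, data=json.dumps(body).encode("utf-8"), headers=headers, method="POST"
    )


# Placeholder values; replace them with your resource name, key, and deployment name.
request = build_messages_request(
    "my-resource", "placeholder-key", "claude-sonnet-4-5",
    "What are 3 things to visit in Seattle?", system="You are a helpful assistant.",
)
print(request.full_url)
```

To send the request, pass it to `urllib.request.urlopen` and parse the JSON response body.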

Agent support

Foundry Agent Service supports Claude models.
Microsoft Agent Framework supports creating agents that use Claude models.
You can build custom AI agents with the Claude Agent SDK.

Claude advanced features and capabilities

Claude in Foundry Models supports advanced features and capabilities. Core capabilities enhance Claude's fundamental abilities for processing, analyzing, and generating content across various formats and use cases. Tools enable Claude to interact with external systems, execute code, and perform automated tasks through various tool interfaces.

Some of the Core capabilities that Foundry supports are:

1 million token context window: An extended context window.
Agent skills: Extend Claude's capabilities with Skills.
Citations: Ground Claude's responses in source documents.
Context editing: Automatically manage conversation context with configurable strategies.
Extended thinking: Enhanced reasoning capabilities for complex tasks.
PDF support: Process and analyze text and visual content from PDF documents.
Prompt caching: Provide Claude with more background knowledge and example outputs to reduce costs and latency.
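As one example of a core capability, extended thinking is requested through a `thinking` field in the Messages API request. The sketch below adds that field to an existing set of request arguments. The helper name `with_extended_thinking` is hypothetical, and the two checks (a minimum budget of 1,024 tokens, and `max_tokens` larger than the budget) reflect the Anthropic API documentation as understood at the time of writing, so verify them against the current reference.

```python
def with_extended_thinking(request: dict, budget_tokens: int = 2048) -> dict:
    """Return a copy of the request arguments with extended thinking enabled."""
    if budget_tokens < 1024:
        raise ValueError("budget_tokens must be at least 1024")
    if request.get("max_tokens", 0) <= budget_tokens:
        raise ValueError("max_tokens must be larger than budget_tokens")
    # Copy rather than mutate so the caller's original arguments stay reusable.
    return {**request, "thinking": {"type": "enabled", "budget_tokens": budget_tokens}}


base_request = {
    "model": "claude-sonnet-4-5",  # your deployment name
    "max_tokens": 4096,
    "messages": [{"role": "user", "content": "Plan a three-step database migration."}],
}
thinking_request = with_extended_thinking(base_request, budget_tokens=2048)
print(thinking_request["thinking"])
```

The result can be passed to the SDK as `client.messages.create(**thinking_request)`.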

Some of the Tools that Foundry supports are:

MCP connector: Connect to remote MCP servers directly from the Messages API without a separate MCP client.
Memory: Store and retrieve information across conversations. Build knowledge bases over time, maintain project context, and learn from past interactions.
Web fetch: Retrieve full content from specified web pages and PDF documents for in-depth analysis.

For a full list of the supported capabilities and tools, see Claude's features overview.

API quotas and limits

Claude models in Foundry have the following rate limits, measured in Tokens Per Minute (TPM) and Requests Per Minute (RPM):

Model	Deployment Type	Default RPM	Default TPM	Enterprise and MCA-E RPM	Enterprise and MCA-E TPM
claude-haiku-4-5	GlobalStandard	1,000	1,000,000	4,000	4,000,000
claude-opus-4-1	GlobalStandard	1,000	1,000,000	2,000	2,000,000
claude-sonnet-4-5	GlobalStandard	1,000	1,000,000	4,000	2,000,000
claude-opus-4-5	GlobalStandard	1,000	1,000,000	2,000	2,000,000

To increase your quota beyond the default limits, submit a request through the quota increase request form.
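Because both limits apply at once, the request rate you can sustain depends on how many tokens each request consumes. The following sketch estimates that rate under a simplifying assumption: every request uses about the same number of tokens, and all of them count toward TPM. The function name `effective_rpm` is hypothetical.

```python
def effective_rpm(rpm_limit: int, tpm_limit: int, tokens_per_request: int) -> int:
    """Requests per minute that stay within both the RPM and the TPM limit."""
    if tokens_per_request <= 0:
        raise ValueError("tokens_per_request must be positive")
    return min(rpm_limit, tpm_limit // tokens_per_request)


# Default limits from the table: 1,000 RPM and 1,000,000 TPM.
print(effective_rpm(1_000, 1_000_000, 500))    # 1000: the RPM limit binds
print(effective_rpm(1_000, 1_000_000, 4_000))  # 250: the TPM limit binds
```

With the default limits, TPM becomes the binding constraint once requests average more than 1,000 tokens.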

Rate limit best practices

To optimize your usage and avoid rate limiting:

Implement retry logic: Handle 429 responses with exponential backoff
Batch requests: Combine multiple prompts when possible
Monitor usage: Track your token consumption and request patterns
Use appropriate models: Choose the right Claude model for your use case
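The retry advice above can be sketched as a small wrapper that retries with exponential backoff and jitter. Everything here is illustrative: `RateLimited` is a hypothetical stand-in for whatever error your client raises on a 429 response, and the delay values are arbitrary defaults. The Anthropic Python SDK also retries some failures on its own (see its `max_retries` client option), so a wrapper like this matters mostly for raw HTTP calls or custom policies.

```python
import random
import time


class RateLimited(Exception):
    """Hypothetical stand-in for an HTTP 429 error raised by a client."""


def call_with_backoff(func, max_attempts=5, base_delay=1.0, max_delay=30.0, sleep=time.sleep):
    """Call func(), retrying on RateLimited with exponential backoff plus jitter."""
    for attempt in range(max_attempts):
        try:
            return func()
        except RateLimited:
            if attempt == max_attempts - 1:
                raise  # out of attempts, so surface the error to the caller
            delay = min(max_delay, base_delay * 2 ** attempt)
            # Random jitter spreads out retries from clients that failed together.
            sleep(delay + random.uniform(0, delay / 2))
```

The `sleep` parameter is injectable so the wrapper can be exercised in tests without real waiting.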

Responsible AI considerations

When using Claude models in Foundry, consider these responsible AI practices:

Configure AI content safety during model inference, as Foundry doesn't provide built-in content filtering for Claude models at deployment time. To learn how to create and use content filters, see Configure content filtering for Foundry Models.
Ensure your applications comply with Anthropic's Acceptable Use Policy. Also, see details of safety evaluations for Claude Opus 4.5, Claude Haiku 4.5, Claude Opus 4.1, and Claude Sonnet 4.5.

Best practices

Follow these best practices when working with Claude models in Foundry:

Model selection

Choose the appropriate Claude model based on your specific requirements:

Claude Opus 4.5: For best performance across coding, agents, computer use, and enterprise workflows
Claude Sonnet 4.5: For balanced performance and capabilities, production workflows
Claude Haiku 4.5: For speed and cost optimization, high-volume processing
Claude Opus 4.1: For complex reasoning and enterprise applications

Prompt engineering

Clear instructions: Provide specific and detailed prompts
Context management: Effectively use the available context window
Role definitions: Use system messages to define the assistant's role and behavior
Structured prompts: Use consistent formatting for better results
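A minimal sketch of how these practices might combine in one request: the role goes in the top-level `system` field, and the user prompt keeps instructions and context in consistently labeled sections. The helper name `build_prompt_request` and the tag names are illustrative choices, not a required format.

```python
def build_prompt_request(deployment_name, role, instructions, context, max_tokens=1024):
    """Assemble Messages API arguments with a system role and a sectioned user prompt."""
    user_prompt = (
        f"<instructions>\n{instructions}\n</instructions>\n"
        f"<context>\n{context}\n</context>"
    )
    return {
        "model": deployment_name,
        "system": role,  # defines the assistant's role and behavior
        "max_tokens": max_tokens,
        "messages": [{"role": "user", "content": user_prompt}],
    }


request_args = build_prompt_request(
    "claude-sonnet-4-5",  # your deployment name
    role="You are a release-notes editor. Answer in plain, concise English.",
    instructions="Summarize the change log in three bullet points.",
    context="v2.1: fixed login timeout; added CSV export; removed legacy API.",
)
print(request_args["messages"][0]["content"])
```

The arguments can be passed to the SDK as `client.messages.create(**request_args)`.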

Cost optimization

Token management: Monitor and optimize token usage
Model selection: Use the most cost-effective model for your use case
Caching: Implement explicit prompt caching where appropriate
Request batching: Combine multiple requests when possible
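To illustrate explicit prompt caching, the sketch below marks a large, reused system prompt with a `cache_control` entry, which is how the Anthropic Messages API designates a cacheable prefix. The helper names are hypothetical. Whether a given prompt is actually cached depends on model-specific minimum lengths, so treat this as a starting point rather than a guarantee of savings.

```python
def cached_system_prompt(shared_context: str) -> list:
    """Wrap reusable context in a system text block marked as cacheable."""
    return [
        {
            "type": "text",
            "text": shared_context,
            "cache_control": {"type": "ephemeral"},
        }
    ]


def build_cached_request(deployment_name, shared_context, question, max_tokens=1024):
    """Assemble Messages API arguments that reuse the same cached system prompt."""
    return {
        "model": deployment_name,
        "max_tokens": max_tokens,
        "system": cached_system_prompt(shared_context),
        "messages": [{"role": "user", "content": question}],
    }


request_args = build_cached_request(
    "claude-sonnet-4-5",  # your deployment name
    shared_context="<full text of a long reference document>",
    question="Which section covers data retention?",
)
print(request_args["system"][0]["cache_control"])
```

Keeping the cached block identical across requests is what lets later calls reuse it; only the user question should change.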

Last updated on 2025-12-02
