Use the speech to text REST API for fast transcription, batch transcription, and custom speech. This article describes changes from version 2024-11-15 to version 2025-10-15.
Important
Speech to text REST API version 2025-10-15 is the latest version that's generally available.
- Speech to text REST API version 2024-05-15-preview will be retired on a date to be announced.
- Speech to text REST API v3.0, v3.1, v3.2, 3.2-preview.1, and 3.2-preview.2 will be retired on March 31, 2026.
For more information about upgrading, see the Speech to text REST API v3.0 to v3.1, v3.1 to v3.2, and v3.2 to 2024-11-15 migration guides.
To summarize the changes in this version:
- The Transcribe API has new features: enhanced mode and phrase list.
- The Projects API returns (it was absent in version 2024-11-15) with some changes.
Transcription API changes
Request structure
- New endpoint:
  POST <your_endpoint>/speechtotext/transcriptions:transcribe?api-version=2025-10-15
- Headers and form data:
  - Content-Type: multipart/form-data
  - Ocp-Apim-Subscription-Key: $KEY
- Form fields:
  definition, audio
Example:
curl --request POST \
  --url '<your_endpoint>/speechtotext/transcriptions:transcribe?api-version=2025-10-15' \
  --header 'Content-Type: multipart/form-data' \
  --header "Ocp-Apim-Subscription-Key: $KEY" \
  --form "definition=$DEFINITION" \
  --form 'audio=@C:\workspace\audios\test.wav'
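The same request can be assembled without curl. The following is a minimal Python sketch that uses only the standard library to build (not send) the multipart request. The helper name `build_transcribe_request`, the default file name `audio.wav`, the `application/octet-stream` part type, and the `example.invalid` endpoint are assumptions made for illustration; only the URL path, the two headers, and the `definition` and `audio` form fields come from the description above.

```python
import json
import uuid
import urllib.request


def build_transcribe_request(endpoint, key, definition, audio_bytes,
                             filename="audio.wav", api_version="2025-10-15"):
    """Build (but do not send) the multipart POST for transcriptions:transcribe."""
    url = (f"{endpoint.rstrip('/')}/speechtotext/transcriptions:transcribe"
           f"?api-version={api_version}")
    boundary = uuid.uuid4().hex  # random separator between form parts

    # Form field "definition": the JSON definition object as text.
    definition_part = (
        f"--{boundary}\r\n"
        'Content-Disposition: form-data; name="definition"\r\n\r\n'
        f"{json.dumps(definition)}\r\n"
    ).encode("utf-8")

    # Form field "audio": the raw audio bytes (the part type is an assumption).
    audio_part = (
        f"--{boundary}\r\n"
        f'Content-Disposition: form-data; name="audio"; filename="{filename}"\r\n'
        "Content-Type: application/octet-stream\r\n\r\n"
    ).encode("utf-8") + audio_bytes + b"\r\n"

    body = definition_part + audio_part + f"--{boundary}--\r\n".encode("utf-8")
    headers = {
        "Content-Type": f"multipart/form-data; boundary={boundary}",
        "Ocp-Apim-Subscription-Key": key,
    }
    return urllib.request.Request(url, data=body, headers=headers, method="POST")


# Placeholder endpoint, key, and audio bytes; replace with real values before sending.
request = build_transcribe_request(
    "https://example.invalid", "my-key", {"locales": ["en-US"]}, b"RIFF....")
print(request.get_method(), request.full_url)
```

Passing the returned object to `urllib.request.urlopen` would send it; the sketch stops short of that so it can be inspected without network access.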
Definition object updates
- Removed:
  - "models" dictionary (no longer in request definition)
- Added:
  - "disfluencyRemoval" (boolean): Removes filler words (such as "um" and "uh")
  - "phraseList": Now supports biasingWeight for recognition bias tuning
  - "enhancedMode" object includes:
    - enabled (boolean)
    - task (such as "translate")
    - targetLanguage (such as "ko")
    - prompt (array of instructions or lexical boosts)
Example:
{
  "locales": ["en-US"],
  "profanityFilterMode": "Masked",
  "diarization": {
    "enabled": true,
    "maxSpeakers": 6
  },
  "channels": [0],
  "disfluencyRemoval": true,
  "enhancedMode": {
    "enabled": true,
    "task": "translate",
    "targetLanguage": "ko",
    "prompt": [
      "Provide lexical output",
      "Boost the terms: CONTOSO, AAZZ; Replace '50cents' to '50-Cents'"
    ]
  },
  "phraseList": {
    "phrases": ["Kenichi Kumatani", "John McDonough", "Bhiksha Raj"],
    "biasingWeight": 1.6
  }
}
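When updating an existing definition, the removal and additions listed above can be applied programmatically. The following Python sketch is one possible approach, not an official migration tool: the helper name `upgrade_definition` and its parameters are hypothetical, and the value shown for the old "models" entry is only a placeholder, since its shape is not described here.

```python
import copy
import json


def upgrade_definition(old_definition, *, disfluency_removal=None,
                       phrases=None, biasing_weight=None, enhanced_mode=None):
    """Return a copy of an older definition adjusted for version 2025-10-15."""
    definition = copy.deepcopy(old_definition)  # leave the caller's dict untouched
    definition.pop("models", None)  # "models" is no longer part of the request

    if disfluency_removal is not None:
        definition["disfluencyRemoval"] = bool(disfluency_removal)
    if phrases is not None:
        phrase_list = {"phrases": list(phrases)}
        if biasing_weight is not None:
            phrase_list["biasingWeight"] = float(biasing_weight)
        definition["phraseList"] = phrase_list
    if enhanced_mode is not None:
        definition["enhancedMode"] = dict(enhanced_mode)
    return definition


# The "models" value below is a placeholder, not a documented shape.
old = {"locales": ["en-US"], "models": {"en-US": "<model_url>"}}
new = upgrade_definition(
    old,
    disfluency_removal=True,
    phrases=["CONTOSO"],
    biasing_weight=1.6,
    enhanced_mode={"enabled": True, "task": "translate", "targetLanguage": "ko"},
)
print(json.dumps(new, indent=2))
```

The sketch does not validate value ranges (for example, for biasingWeight), because the allowed ranges are not stated in this article.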
Result structure
- Channel-based output:
  - Results are organized per channel
- Phrase segmentation:
  - Each phrase includes channel, start and end time, speaker, text, and word-level confidence
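To illustrate how a client might consume channel-organized phrases, here is a small Python sketch that groups phrase entries by channel and orders them by start time. This article does not list the response field names, so every key used below ("phrases", "channel", "speaker", "text", "offsetMilliseconds", "durationMilliseconds") is an assumption, as is the helper name `group_phrases_by_channel`; check them against an actual response before relying on them.

```python
from collections import defaultdict


def group_phrases_by_channel(result):
    """Group phrase entries by channel, sorted by start offset within each channel."""
    by_channel = defaultdict(list)
    for phrase in result.get("phrases", []):
        by_channel[phrase["channel"]].append(phrase)
    for channel_phrases in by_channel.values():
        channel_phrases.sort(key=lambda p: p["offsetMilliseconds"])
    return dict(by_channel)


# Hand-written sample: every key name below is an assumption, not a documented schema.
sample_result = {
    "phrases": [
        {"channel": 0, "speaker": 1, "offsetMilliseconds": 900,
         "durationMilliseconds": 400, "text": "Fine, thanks."},
        {"channel": 1, "speaker": 1, "offsetMilliseconds": 200,
         "durationMilliseconds": 300, "text": "Hello."},
        {"channel": 0, "speaker": 2, "offsetMilliseconds": 100,
         "durationMilliseconds": 500, "text": "How are you?"},
    ]
}

grouped = group_phrases_by_channel(sample_result)
for channel, channel_phrases in sorted(grouped.items()):
    for p in channel_phrases:
        end_ms = p["offsetMilliseconds"] + p["durationMilliseconds"]
        print(f"ch{channel} [{p['offsetMilliseconds']}-{end_ms} ms] "
              f"speaker {p['speaker']}: {p['text']}")
```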
Projects API changes
New features
- Foundry project name:
  - New property: foundryProjectName in Create, Get, Update, List APIs
- Project creation:
  - Projects are created through Azure Resource Manager (ARM) conventions
  - locale is now required for custom speech projects
Example:
POST {endpoint}/speechtotext/projects?api-version=2025-10-15
Headers:
Ocp-Apim-Subscription-Key: <YOUR_SUBSCRIPTION_KEY>
Content-Type: application/json
Body:
{
  "locale": "en-US",
  "displayName": "My speech project",
  "foundryProjectName": "MyFoundrySpeechProject"
}
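The create request above could be built in Python roughly as follows. This is a sketch under stated assumptions: the helper name `build_create_project_request` and the `example.invalid` endpoint are hypothetical, and the client-side check simply mirrors the statement that locale is required; the service's own validation may differ.

```python
import json
import urllib.request


def build_create_project_request(endpoint, key, locale, display_name,
                                 foundry_project_name=None,
                                 api_version="2025-10-15"):
    """Build (but do not send) the POST request that creates a project."""
    if not locale:
        # Mirrors the documented requirement; the service performs its own checks.
        raise ValueError("locale is required for custom speech projects")

    body = {"locale": locale, "displayName": display_name}
    if foundry_project_name:
        body["foundryProjectName"] = foundry_project_name

    url = f"{endpoint.rstrip('/')}/speechtotext/projects?api-version={api_version}"
    headers = {
        "Ocp-Apim-Subscription-Key": key,
        "Content-Type": "application/json",
    }
    return urllib.request.Request(
        url, data=json.dumps(body).encode("utf-8"), headers=headers, method="POST")


# Placeholder endpoint and key; replace with real values before sending.
project_request = build_create_project_request(
    "https://example.invalid", "my-key", "en-US", "My speech project",
    foundry_project_name="MyFoundrySpeechProject")
print(project_request.full_url)
print(project_request.data.decode("utf-8"))
```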
Project listing and filtering
- Filter by Foundry project name:
GET {endpoint}/speechtotext/projects?filter=foundryProjectName eq 'MyFoundrySpeechProject'&api-version=2025-10-15
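The filter expression contains spaces and single quotes, which generally need percent-encoding when placed in a URL. A minimal Python sketch of building the encoded URL follows; the helper name `build_project_list_url` and the `example.invalid` endpoint are assumptions, and the sketch does not handle project names that themselves contain quote characters.

```python
from urllib.parse import quote, urlencode


def build_project_list_url(endpoint, foundry_project_name,
                           api_version="2025-10-15"):
    """Return the project listing URL with a percent-encoded filter expression."""
    query = urlencode(
        {
            "filter": f"foundryProjectName eq '{foundry_project_name}'",
            "api-version": api_version,
        },
        quote_via=quote,  # encode spaces as %20 rather than '+'
    )
    return f"{endpoint.rstrip('/')}/speechtotext/projects?{query}"


list_url = build_project_list_url("https://example.invalid", "MyFoundrySpeechProject")
print(list_url)
```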