Datasets - Get
Gets the dataset identified by the given ID.
GET {endpoint}/speechtotext/datasets/{id}?api-version=2025-10-15
URI Parameters
| Name | In | Required | Type | Description |
|---|---|---|---|---|
| endpoint | path | True | string | Supported Cognitive Services endpoints (protocol and hostname, for example: https://westus.api.cognitive.microsoft.com). |
| id | path | True | string (uuid) | The identifier of the dataset. |
| api-version | query | True | string | The requested API version. |
Request Header
| Name | Required | Type | Description |
|---|---|---|---|
| Ocp-Apim-Subscription-Key | True | string | Provide your Cognitive Services account key here. |
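As a minimal sketch, the request described above can be assembled with Python's standard library. The helper name `build_get_dataset_request` and the `<your-key>` placeholder are illustrative assumptions, not part of the API. Nothing is sent until the returned object is passed to `urllib.request.urlopen`.

```python
import urllib.request
from urllib.parse import quote, urlencode

API_VERSION = "2025-10-15"


def build_get_dataset_request(endpoint: str, dataset_id: str, key: str) -> urllib.request.Request:
    """Build (but do not send) the GET request for a single dataset."""
    base = endpoint.rstrip("/")
    path = "/speechtotext/datasets/" + quote(dataset_id, safe="")
    query = urlencode({"api-version": API_VERSION})
    return urllib.request.Request(
        f"{base}{path}?{query}",
        headers={"Ocp-Apim-Subscription-Key": key},
        method="GET",
    )


request = build_get_dataset_request(
    "https://westus.api.cognitive.microsoft.com",
    "9d5f4100-5f8e-4dd6-bd83-9bbbf50d57f1",
    "<your-key>",  # placeholder; replace with a real account key
)
print(request.full_url)
```

Note that `urllib` stores header names in capitalized form (`Ocp-apim-subscription-key`); HTTP header names are case-insensitive, so this does not change the request's meaning.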
Responses
| Name | Type | Description |
|---|---|---|
| 200 OK | Dataset | OK. Headers: Retry-After: integer |
| Other Status Codes | Error | An error occurred. |
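The 200 response can carry an integer `Retry-After` header. The reference does not say how to interpret the value. Treating it as a number of seconds to wait before requesting the dataset again follows the general HTTP meaning of `Retry-After` and is an assumption here. The helper name `retry_after_seconds` is hypothetical.

```python
from typing import Mapping, Optional


def retry_after_seconds(headers: Mapping[str, str], default: Optional[int] = None) -> Optional[int]:
    """Return the Retry-After value as whole seconds, or `default` if it is absent or unusable."""
    for name, value in headers.items():
        if name.lower() == "retry-after":  # header names are case-insensitive
            try:
                seconds = int(value.strip())
            except ValueError:
                return default  # e.g. the HTTP-date form, which this sketch does not handle
            return seconds if seconds >= 0 else default
    return default


print(retry_after_seconds({"Retry-After": "5"}))
```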
Security
Ocp-Apim-Subscription-Key
Provide your Cognitive Services account key here.
Type: apiKey
In: header
Examples
Get a dataset
Sample request
GET {endpoint}/speechtotext/datasets/9d5f4100-5f8e-4dd6-bd83-9bbbf50d57f1?api-version=2025-10-15
Sample response
{
  "self": "https://westus.api.cognitive.microsoft.com/speechtotext/datasets/9d5f4100-5f8e-4dd6-bd83-9bbbf50d57f1?api-version=2025-10-15",
  "displayName": "Acoustic dataset",
  "locale": "en-US",
  "createdDateTime": "2019-01-07T11:34:12Z",
  "lastActionDateTime": "2019-01-07T11:36:07Z",
  "kind": "Acoustic",
  "links": {
    "files": "https://westus.api.cognitive.microsoft.com/speechtotext/datasets/9d5f4100-5f8e-4dd6-bd83-9bbbf50d57f1/files?api-version=2025-10-15"
  },
  "properties": {
    "acceptedLineCount": 11,
    "rejectedLineCount": 2,
    "durationMilliseconds": 252000,
    "textNormalizationKind": "Default"
  },
  "contentUrl": "https://www.contoso.com/acousticdata/sourcelocation",
  "status": "Succeeded"
}
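A minimal sketch of reading a response like the sample above in Python. The JSON literal is a trimmed copy of the sample. The helper `parse_timestamp` is hypothetical and only accepts the exact "YYYY-MM-DDThh:mm:ssZ" form shown in the sample; it would reject timestamps with fractional seconds, should the service ever return them.

```python
import json
from datetime import datetime, timezone

# Trimmed copy of the sample response.
body = """
{
  "displayName": "Acoustic dataset",
  "createdDateTime": "2019-01-07T11:34:12Z",
  "lastActionDateTime": "2019-01-07T11:36:07Z",
  "kind": "Acoustic",
  "properties": {
    "acceptedLineCount": 11,
    "rejectedLineCount": 2,
    "durationMilliseconds": 252000
  },
  "status": "Succeeded"
}
"""


def parse_timestamp(value: str) -> datetime:
    """Parse "YYYY-MM-DDThh:mm:ssZ" into a timezone-aware UTC datetime."""
    return datetime.strptime(value, "%Y-%m-%dT%H:%M:%SZ").replace(tzinfo=timezone.utc)


dataset = json.loads(body)
created = parse_timestamp(dataset["createdDateTime"])
last_action = parse_timestamp(dataset["lastActionDateTime"])
props = dataset["properties"]

# Seconds between creation and the most recent status change.
elapsed_seconds = (last_action - created).total_seconds()
# Lines seen during import, accepted or not.
total_lines = props["acceptedLineCount"] + props["rejectedLineCount"]

print(dataset["status"], elapsed_seconds, total_lines)
```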
Definitions
| Name | Description |
|---|---|
| Dataset | Dataset |
| DatasetKind | DatasetKind |
| DatasetLinks | DatasetLinks |
| DatasetProperties | DatasetProperties |
| DetailedErrorCode | DetailedErrorCode |
| EntityError | EntityError |
| EntityReference | EntityReference |
| Error | Error |
| ErrorCode | ErrorCode |
| InnerError | InnerError |
| Status | Status |
| TextNormalizationKind | TextNormalizationKind |
Dataset
| Name | Type | Description |
|---|---|---|
| contentUrl | string (uri) | The URL of the data for the dataset. |
| createdDateTime | string (date-time) | The timestamp when the object was created. The timestamp is encoded in ISO 8601 date and time format ("YYYY-MM-DDThh:mm:ssZ", see https://en.wikipedia.org/wiki/ISO_8601#Combined_date_and_time_representations). |
| customProperties | object | The custom properties of this entity. The maximum allowed key length is 64 characters, the maximum allowed value length is 256 characters, and the maximum number of entries is 10. |
| description | string | The description of the object. |
| displayName | string, minLength: 1 | The display name of the object. |
| kind | DatasetKind | |
| lastActionDateTime | string (date-time) | The timestamp when the current status was entered. The timestamp is encoded in ISO 8601 date and time format ("YYYY-MM-DDThh:mm:ssZ", see https://en.wikipedia.org/wiki/ISO_8601#Combined_date_and_time_representations). |
| links | DatasetLinks | |
| locale | string, minLength: 1 | The locale of the contained data. |
| project | EntityReference | |
| properties | DatasetProperties | |
| self | string (uri) | The location of this entity. |
| status | Status | |
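The limits stated for `customProperties` (keys up to 64 characters, values up to 256 characters, at most 10 entries) can be checked on the client. The sketch below is one way to do that; the function name is hypothetical, and it assumes the service counts characters the way Python's `len` does, which the reference does not confirm.

```python
from typing import Dict, List

MAX_ENTRIES = 10
MAX_KEY_LENGTH = 64
MAX_VALUE_LENGTH = 256


def custom_property_problems(props: Dict[str, str]) -> List[str]:
    """Return a list of limit violations; an empty list means all limits are met."""
    problems = []
    if len(props) > MAX_ENTRIES:
        problems.append(f"{len(props)} entries exceed the limit of {MAX_ENTRIES}")
    for key, value in props.items():
        label = key[:16]  # shorten long keys so messages stay readable
        if len(key) > MAX_KEY_LENGTH:
            problems.append(f"key starting with {label!r} is longer than {MAX_KEY_LENGTH} characters")
        if len(value) > MAX_VALUE_LENGTH:
            problems.append(f"value for key starting with {label!r} is longer than {MAX_VALUE_LENGTH} characters")
    return problems


print(custom_property_problems({"team": "speech"}))
```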
DatasetKind
| Value | Description |
|---|---|
| Language | A language dataset. |
| Acoustic | An acoustic dataset. |
| Pronunciation | A pronunciation dataset. |
| AudioFiles | An audio files dataset. |
| LanguageMarkdown | A language markdown dataset. |
| OutputFormatting | A dataset that contains rules to customize inverse text normalization, capitalization, reformulation, and profanity, and that also defines tests for dataset validation. |
DatasetLinks
| Name | Type | Description |
|---|---|---|
| commitBlocks | string (uri) | The location to commit the list of blocks when uploading a dataset using blocks. See operation "Datasets_CommitBlocks" for more details. |
| files | string (uri) | The location to get all files of this entity. See operation "Datasets_ListFiles" for more details. |
| listBlocks | string (uri) | The location to list the already uploaded blocks of this entity when uploading a dataset using blocks. See operation "Datasets_GetBlocks" for more details. |
| uploadBlocks | string (uri) | The location to upload blocks to when uploading a dataset using blocks. See operation "Datasets_UploadBlock" for more details. |
DatasetProperties
| Name | Type | Default value | Description |
|---|---|---|---|
| acceptedLineCount | integer (int32) | | The number of lines accepted for this dataset. |
| durationMilliseconds | integer (int64) | 0 | The total duration in milliseconds of the dataset if it contains audio files. Durations larger than 2^53-1 are not supported to ensure compatibility with JavaScript integers. |
| error | EntityError | | |
| rejectedLineCount | integer (int32) | | The number of lines rejected for this dataset. |
| textNormalizationKind | TextNormalizationKind | | |
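The 2^53-1 bound on `durationMilliseconds` is the largest integer that a JavaScript number represents exactly. The sketch below checks a value against that bound and converts it to a `timedelta`. The function name `audio_duration` is hypothetical, and rejecting negative values is an added assumption.

```python
from datetime import timedelta

MAX_SAFE_INTEGER = 2**53 - 1  # 9007199254740991, JavaScript's Number.MAX_SAFE_INTEGER


def audio_duration(duration_ms: int) -> timedelta:
    """Convert durationMilliseconds to a timedelta, enforcing the documented upper bound."""
    if not 0 <= duration_ms <= MAX_SAFE_INTEGER:
        raise ValueError(f"durationMilliseconds out of range: {duration_ms}")
    return timedelta(milliseconds=duration_ms)


# 252000 ms is the value from the sample response: 4 minutes 12 seconds.
print(audio_duration(252000))
```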
DetailedErrorCode
| Value | Description |
|---|---|
| InvalidParameterValue | Invalid parameter value. |
| InvalidRequestBodyFormat | Invalid request body format. |
| EmptyRequest | Empty Request. |
| MissingInputRecords | Missing Input Records. |
| InvalidDocument | Invalid Document. |
| ModelVersionIncorrect | Model Version Incorrect. |
| InvalidDocumentBatch | Invalid Document Batch. |
| UnsupportedLanguageCode | Unsupported language code. |
| DataImportFailed | Data import failed. |
| InUseViolation | In use violation. |
| InvalidLocale | Invalid locale. |
| InvalidBaseModel | Invalid base model. |
| InvalidAdaptationMapping | Invalid adaptation mapping. |
| InvalidDataset | Invalid dataset. |
| InvalidTest | Invalid test. |
| FailedDataset | Failed dataset. |
| InvalidModel | Invalid model. |
| InvalidTranscription | Invalid transcription. |
| InvalidPayload | Invalid payload. |
| InvalidParameter | Invalid parameter. |
| EndpointWithoutLogging | Endpoint without logging. |
| InvalidPermissions | Invalid permissions. |
| InvalidPrerequisite | Invalid prerequisite. |
| InvalidProductId | Invalid product id. |
| InvalidSubscription | Invalid subscription. |
| InvalidProject | Invalid project. |
| InvalidProjectKind | Invalid project kind. |
| InvalidRecordingsUri | Invalid recordings uri. |
| OnlyOneOfUrlsOrContainerOrDataset | Only one of urls or container or dataset. |
| ExceededNumberOfRecordingsUris | Exceeded number of recordings uris. |
| InvalidChannels | Invalid channels. |
| ModelMismatch | Model mismatch. |
| ProjectGenderMismatch | Project gender mismatch. |
| ModelDeprecated | Model deprecated. |
| ModelExists | Model exists. |
| ModelNotDeployable | Model not deployable. |
| EndpointNotUpdatable | Endpoint not updatable. |
| SingleDefaultEndpoint | Single default endpoint. |
| EndpointCannotBeDefault | Endpoint cannot be default. |
| InvalidModelUri | Invalid model uri. |
| SubscriptionNotFound | Subscription not found. |
| QuotaViolation | Quota violation. |
| UnsupportedDelta | Unsupported delta. |
| UnsupportedFilter | Unsupported filter. |
| UnsupportedPagination | Unsupported pagination. |
| UnsupportedDynamicConfiguration | Unsupported dynamic configuration. |
| UnsupportedOrderBy | Unsupported order by. |
| NoUtf8WithBom | No utf8 with bom. |
| ModelDeploymentNotCompleteState | Model deployment not complete state. |
| SkuLimitsExist | Sku limits exist. |
| DeployingFailedModel | Deploying failed model. |
| UnsupportedTimeRange | Unsupported time range. |
| InvalidLogDate | Invalid log date. |
| InvalidLogId | Invalid log id. |
| InvalidLogStartTime | Invalid log start time. |
| InvalidLogEndTime | Invalid log end time. |
| InvalidTopForLogs | Invalid top for logs. |
| InvalidSkipTokenForLogs | Invalid skip token for logs. |
| DeleteNotAllowed | Delete not allowed. |
| Forbidden | Forbidden. |
| DeployNotAllowed | Deploy not allowed. |
| UnexpectedError | Unexpected error. |
| InvalidCollection | Invalid collection. |
| InvalidCallbackUri | Invalid callback uri. |
| InvalidSasValidityDuration | Invalid sas validity duration. |
| InaccessibleCustomerStorage | Inaccessible customer storage. |
| UnsupportedClassBasedAdaptation | Unsupported class based adaptation. |
| InvalidWebHookEventKind | Invalid web hook event kind. |
| InvalidTimeToLive | Invalid time to live. |
| InvalidSourceAzureResourceId | Invalid source Azure resource ID. |
| ModelCopyAuthorizationExpired | Expired ModelCopyAuthorization. |
| EndpointLoggingNotSupported | Endpoint logging not supported. |
| NoLanguageIdentified | Language Identification did not recognize any language. |
| MultipleLanguagesIdentified | Language Identification recognized multiple languages. No dominant language could be determined. |
| InvalidAudioFormat | The format of input audio is not supported. |
| BadChannelConfiguration | There is a mismatch between audio channels in the data, in the configuration, or the requirements of the application. |
| InvalidChannelSpecification | The selection of channels in the transcription request is not supported (e.g., neither 0 nor 1 have been selected.) |
| AudioLengthLimitExceeded | The audio file is longer than the maximum allowed duration. |
| EmptyAudioFile | The audio file is empty. |
EntityError
| Name | Type | Description |
|---|---|---|
| code | string | The code of this error. |
| message | string | The message for this error. |
EntityReference
| Name | Type | Description |
|---|---|---|
| self | string (uri) | The location of the referenced entity. |
Error
| Name | Type | Description |
|---|---|---|
| code | ErrorCode | |
| details | Error[] | Additional supportive details regarding the error and/or expected policies. |
| innerError | InnerError | |
| message | string | High level error message. |
| target | string | The source of the error. For example it would be "documents" or "document id" in case of invalid document. |
ErrorCode
| Value | Description |
|---|---|
| InvalidRequest | Representing the invalid request error code. |
| InvalidArgument | Representing the invalid argument error code. |
| InternalServerError | Representing the internal server error error code. |
| ServiceUnavailable | Representing the service unavailable error code. |
| NotFound | Representing the not found error code. |
| PipelineError | Representing the pipeline error error code. |
| Conflict | Representing the conflict error code. |
| InternalCommunicationFailed | Representing the internal communication failed error code. |
| Forbidden | Representing the forbidden error code. |
| NotAllowed | Representing the not allowed error code. |
| Unauthorized | Representing the unauthorized error code. |
| UnsupportedMediaType | Representing the unsupported media type error code. |
| TooManyRequests | Representing the too many requests error code. |
| UnprocessableEntity | Representing the unprocessable entity error code. |
InnerError
| Name | Type | Description |
|---|---|---|
| code | DetailedErrorCode | |
| details | object | Additional supportive details regarding the error and/or expected policies. |
| innerError | InnerError | |
| message | string | High level error message. |
| target | string | The source of the error. For example it would be "documents" or "document id" in case of invalid document. |
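Because `InnerError` can itself contain an `innerError`, the most specific code sits at the end of a chain. The sketch below collects the codes from the outermost error inward. The payload is a hypothetical example shaped after the Error and InnerError tables, not a captured service response, and it assumes the error object is the response body itself.

```python
from typing import Any, Dict, List


def error_code_chain(error: Dict[str, Any]) -> List[str]:
    """Collect `code` values from the top-level error down through nested `innerError` objects."""
    codes = []
    current = error
    while isinstance(current, dict):
        code = current.get("code")
        if code is not None:
            codes.append(code)
        current = current.get("innerError")  # None ends the loop
    return codes


# Hypothetical error payload for illustration only.
example = {
    "code": "NotFound",
    "message": "The specified entity cannot be found.",
    "innerError": {"code": "InvalidDataset", "message": "Invalid dataset."},
}
print(error_code_chain(example))
```

The last element of the returned list is the most detailed code available, which is usually the most useful one to log or branch on.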
Status
| Value | Description |
|---|---|
| NotStarted | The long running operation has not yet started. |
| Running | The long running operation is currently processing. |
| Succeeded | The long running operation has successfully completed. |
| Failed | The long running operation has failed. |
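Of the four status values, `Succeeded` and `Failed` describe a finished operation, while `NotStarted` and `Running` mean the dataset is still being processed. A client that polls this operation could use that split to decide when to stop. The enum and the helper `should_poll_again` below are a hypothetical sketch, not part of any SDK.

```python
from enum import Enum


class Status(str, Enum):
    NOT_STARTED = "NotStarted"
    RUNNING = "Running"
    SUCCEEDED = "Succeeded"
    FAILED = "Failed"


# Statuses after which the dataset will not change state again.
TERMINAL_STATUSES = {Status.SUCCEEDED, Status.FAILED}


def should_poll_again(status_value: str) -> bool:
    """Return True while the dataset is still NotStarted or Running.

    Raises ValueError for a status string that is not one of the four listed values.
    """
    return Status(status_value) not in TERMINAL_STATUSES


print(should_poll_again("Running"), should_poll_again("Succeeded"))
```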
TextNormalizationKind
| Value | Description |
|---|---|
| Default | Default text normalization (e.g. '2 to 3' is replaced by 'two to three' in en-US). |
| None | No text normalization will be applied to the input text. This is an override option that should only be used when text is normalized before the upload. |