Important
Support for indexer connections to the model catalog is in public preview under supplemental terms of use. Preview REST APIs support this capability.
Use the AML skill to extend AI enrichment with a deployed base embedding model from the Microsoft Foundry model catalog or a custom Azure Machine Learning (AML) model. Your data is processed in the Geo where your model is deployed.
You specify the AML skill in a skillset, which then integrates your deployed model into an AI enrichment pipeline. The AML skill is useful for performing processing or inference not supported by built-in skills. Examples include generating embeddings with your own model and applying custom machine learning logic to enriched content.
For AML online endpoints, use a stable API version or an equivalent Azure SDK to call the AML skill. For connections to the model catalog, use a preview API version.
AML skill usage
Like other skills, the AML skill has inputs and outputs. The inputs are sent as a JSON object to a serverless deployment from the Foundry model catalog or an AML online endpoint. The output should include a success status code, JSON payload, and the parameters specified by your AML skill definition. Any other response is considered an error, and no enrichments are performed.
The indexer retries two times for the following HTTP status codes:

- 503 Service Unavailable
- 429 Too Many Requests
AML skill for models in Foundry
Azure AI Search provides the Microsoft Foundry model catalog vectorizer, which is also available in the Import data (new) wizard, for query-time connections to the model catalog. If you want to use this vectorizer for queries, the AML skill is the indexing counterpart for generating embeddings using a model from the model catalog.
During indexing, the AML skill can connect to the model catalog to generate vectors for the index. At query time, queries can use a vectorizer to connect to the same model to vectorize text strings. You should use the AML skill and the Microsoft Foundry model catalog vectorizer together so that the same embedding model is used for indexing and queries. For more information, see Use embedding models from the Foundry model catalog.
We recommend using the Import data (new) wizard to generate a skillset that includes an AML skill for deployed embedding models in Foundry. The wizard generates the AML skill definition for inputs, outputs, and mappings, providing an easy way to test a model before writing any code.
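For reference, a skill definition generated for a catalog embedding model often resembles the following sketch. The endpoint URI, key, context, and input/output names below are illustrative placeholders, not values from this article; the exact names depend on the scoring contract of the model you deploy:

```json
{
  "@odata.type": "#Microsoft.Skills.Custom.AmlSkill",
  "description": "Generates embeddings by using a model deployed from the Foundry model catalog.",
  "uri": "https://my-serverless-deployment.eastus.models.ai.azure.com/score",
  "key": "<deployment-api-key>",
  "context": "/document/pages/*",
  "inputs": [
    {
      "name": "text",
      "source": "/document/pages/*"
    }
  ],
  "outputs": [
    {
      "name": "response",
      "targetName": "text_vector"
    }
  ]
}
```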
Prerequisites
- A Foundry hub-based project or an AML workspace for a custom model that you create.
- For hub-based projects only, a serverless deployment of a supported model from the Foundry model catalog.
@odata.type
`Microsoft.Skills.Custom.AmlSkill`
Skill parameters
Parameters are case sensitive. The parameters you use depend on what authentication your model provider requires, if any.
| Parameter name | Description |
|---|---|
| `uri` | (Required for key authentication) The target URI of the serverless deployment from the Foundry model catalog or the scoring URI of the AML online endpoint. Only the HTTPS URI scheme is allowed. Supported models from the model catalog are: |
| `key` | (Required for key authentication) The API key of the model provider. |
| `resourceId` | (Required for token authentication) The Azure Resource Manager resource ID of the model provider. For an AML online endpoint, use the format `subscriptions/{guid}/resourceGroups/{resource-group-name}/Microsoft.MachineLearningServices/workspaces/{workspace-name}/onlineendpoints/{endpoint_name}`. |
| `region` | (Optional for token authentication) The region in which the model provider is deployed. Required if that region differs from the region of the search service. |
| `timeout` | (Optional) The timeout for the HTTP client making the API call, formatted as an XSD `dayTimeDuration` value, which is a restricted subset of an ISO 8601 duration value. For example, `PT60S` for 60 seconds. If not set, a default of 30 seconds is used. The minimum is 1 second and the maximum is 230 seconds. |
| `degreeOfParallelism` | (Optional) The number of calls the indexer makes in parallel to your endpoint. Decrease this value if the endpoint fails under a high request load. Increase it if the endpoint can accept more requests and you want better indexer performance. If not set, a default of 5 is used. The minimum is 1 and the maximum is 10. |
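For illustration, a key-authenticated skill that tunes both throughput parameters might look like the following sketch. The URI and key are placeholders, and the timeout and parallelism values are examples only:

```json
{
  "@odata.type": "#Microsoft.Skills.Custom.AmlSkill",
  "uri": "https://contoso-model.eastus.inference.ml.azure.com/score",
  "key": "<scoring-key>",
  "timeout": "PT60S",
  "degreeOfParallelism": 2,
  "context": "/document",
  "inputs": [
    {
      "name": "text",
      "source": "/document/content"
    }
  ],
  "outputs": [
    {
      "name": "detected_language_code"
    }
  ]
}
```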
Authentication
The AML skill provides two authentication options:
- **Key-based authentication.** You provide a static key to authenticate scoring requests from the AML skill. Set the `uri` and `key` parameters for this connection.
- **Token-based authentication.** The Foundry hub-based project or AML online endpoint is deployed using token-based authentication. The Azure AI Search service must have a managed identity and a role assignment on the model provider. The AML skill then uses the search service identity to authenticate against the model provider, with no static keys required. The search service identity must have the Owner or Contributor role. Set the `resourceId` parameter, and if the search service is in a different region from the model provider, set the `region` parameter.
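As an illustrative sketch of the token-based option, the following definition uses the search service identity instead of a key. The resource ID segments and region are placeholders that you replace with your own values:

```json
{
  "@odata.type": "#Microsoft.Skills.Custom.AmlSkill",
  "resourceId": "subscriptions/{guid}/resourceGroups/{resource-group-name}/Microsoft.MachineLearningServices/workspaces/{workspace-name}/onlineendpoints/{endpoint_name}",
  "region": "westus2",
  "context": "/document",
  "inputs": [
    {
      "name": "text",
      "source": "/document/content"
    }
  ],
  "outputs": [
    {
      "name": "detected_language_code"
    }
  ]
}
```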
Skill inputs
Skill inputs are a node of the enriched document created during document cracking. For example, it might be the root document, a normalized image, or the content of a blob. There are no predefined inputs for this skill. For inputs, you should specify one or more nodes that are populated at the time of the AML skill's execution.
Skill outputs
Skill outputs are new nodes of an enriched document created by the skill. There are no predefined outputs for this skill. For outputs, you should provide nodes that can be populated from the JSON response of your AML skill.
Sample definition
```json
{
  "@odata.type": "#Microsoft.Skills.Custom.AmlSkill",
  "description": "A custom model that detects the language in a document.",
  "uri": "https://language-model.models.contoso.com/score",
  "context": "/document",
  "inputs": [
    {
      "name": "text",
      "source": "/document/content"
    }
  ],
  "outputs": [
    {
      "name": "detected_language_code"
    }
  ]
}
```
Sample input JSON structure
This JSON structure represents the payload sent to your Foundry hub-based project or AML online endpoint. The top-level fields of the structure correspond to the "names" specified in the inputs section of the skill definition. The value of each field comes from that field's "source", which can be a field in the document or the output of another skill.
```json
{
  "text": "Este es un contrato en Inglés"
}
```
Sample output JSON structure
The output corresponds to the response from your Foundry hub-based project or AML online endpoint. The model provider should return only a JSON payload (verified by checking the Content-Type response header). The payload should be an object whose fields match the "names" in the outputs section of the skill definition, with each field's value treated as the enrichment.
```json
{
  "detected_language_code": "es"
}
```
Inline shaping sample definition
```json
{
  "@odata.type": "#Microsoft.Skills.Custom.AmlSkill",
  "description": "A sample model that detects the language of a sentence.",
  "uri": "https://language-model.models.contoso.com/score",
  "context": "/document",
  "inputs": [
    {
      "name": "shapedText",
      "sourceContext": "/document",
      "inputs": [
        {
          "name": "content",
          "source": "/document/content"
        }
      ]
    }
  ],
  "outputs": [
    {
      "name": "detected_language_code"
    }
  ]
}
```
Inline shaping input JSON structure
```json
{
  "shapedText": { "content": "Este es un contrato en Inglés" }
}
```
Inline shaping sample output JSON structure
```json
{
  "detected_language_code": "es"
}
```
Error cases
In addition to your Foundry hub-based project or AML online endpoint being unavailable or sending nonsuccessful status codes, the following cases are considered errors:

- The model provider returns a success status code, but the response indicates that it isn't `application/json`. The response is invalid, and no enrichments are performed.
- The model provider returns invalid JSON.
If the model provider is unavailable or returns an HTTP error, a friendly error with any available details about the HTTP error is added to the indexer execution history.