Use embedding models from the Microsoft Foundry model catalog for integrated vectorization

Important

This feature is in public preview under Supplemental Terms of Use. The latest preview version of Skillsets - Create Or Update (REST API) supports this feature.

In this article, you learn how to access embedding models from the Microsoft Foundry model catalog for vector conversions during indexing and query execution in Azure AI Search.

The workflow requires that you deploy a model from the catalog, which includes embedding models from Microsoft and other companies. Deploying a model is billable according to the billing structure of each provider.

After the model is deployed, you can use it with the AML skill for integrated vectorization during indexing or with the Microsoft Foundry model catalog vectorizer for queries.

Tip

Use the Import data (new) wizard to generate a skillset that includes an AML skill for embedding models deployed on Foundry. The wizard generates the AML skill definition, including its inputs, outputs, and mappings, which gives you an easy way to test a model before writing any code.

Prerequisites

  • An Azure AI Search service.
  • A Microsoft Foundry project in which you can deploy an embedding model from the model catalog.

Supported embedding models

Supported embedding models from the model catalog vary by usage method, that is, whether you use them during indexing through the AML skill or at query time through the Microsoft Foundry model catalog vectorizer. This article uses the Cohere embedding models as examples.

Deploy an embedding model from the model catalog

  1. Follow these instructions to deploy a supported model to your project.

  2. Make a note of the target URI, key, and model name. You need these values for the vectorizer definition in a search index and for the skillset that calls the model endpoints during indexing.

    If you prefer token authentication to key-based authentication, you only need to copy the URI and model name. However, make a note of the region to which the model is deployed.

  3. Configure a search index and indexer to use the deployed model. A sketch of the relevant index pieces follows this procedure.
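
For orientation, here's a minimal sketch of the index pieces that step 3 refers to. The field, profile, and algorithm names are placeholders, and the 1024 dimensions value assumes a Cohere Embed v3 model; use the output dimensions of the model you actually deployed.

"fields": [
  {
    "name": "aml_vector",
    "type": "Collection(Edm.Single)",
    "searchable": true,
    "dimensions": 1024,
    "vectorSearchProfile": "aml-profile"
  }
],
"vectorSearch": {
  "algorithms": [
    { "name": "hnsw-1", "kind": "hnsw" }
  ],
  "profiles": [
    {
      "name": "aml-profile",
      "algorithm": "hnsw-1",
      "vectorizer": "<YOUR_VECTORIZER_NAME_HERE>"
    }
  ]
}

The vectorizer that the profile references is defined in the Sample vectorizer payload section later in this article.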

Sample AML skill payload

When you deploy embedding models from the model catalog, you connect to them using the AML skill in Azure AI Search for indexing workloads.

This section describes the AML skill definition and index mappings. It includes a sample payload that's already configured to work with its corresponding deployed endpoint. For more information, see Skill context and input annotation language.

Cohere embedding models

This AML skill payload works with the following embedding models:

  • Cohere-embed-v3-english
  • Cohere-embed-v3-multilingual
  • Cohere-embed-v4

It assumes that you're chunking your content with the Text Split skill, so the text to be vectorized is in the /document/pages/* path (a minimal Text Split skill definition follows). If your text comes from a different path, update all references to /document/pages/* accordingly.
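
For reference, here's a minimal Text Split skill definition that produces the /document/pages/* path. The maximumPageLength value is an assumption; tune it to your content and your embedding model's input limits.

{
  "@odata.type": "#Microsoft.Skills.Text.SplitSkill",
  "context": "/document",
  "textSplitMode": "pages",
  "maximumPageLength": 2000,
  "inputs": [
    {
      "name": "text",
      "source": "/document/content"
    }
  ],
  "outputs": [
    {
      "name": "textItems",
      "targetName": "pages"
    }
  ]
}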

You must add the /v1/embed path onto the end of the URL that you copied from your Foundry deployment. You might also change the values for the input_type, truncate, and embedding_types inputs to better fit your use case. For more information on the available options, review the Cohere Embed API reference.

The URI and key are generated when you deploy the model from the catalog. For more information about these values, see How to deploy Cohere Embed models with Foundry.

{
  "@odata.type": "#Microsoft.Skills.Custom.AmlSkill",
  "context": "/document/pages/*",
  "uri": "https://cohere-embed-v3-multilingual-hin.eastus.models.ai.azure.com/v1/embed",
  "key": "aaaaaaaa-0b0b-1c1c-2d2d-333333333333",
  "inputs": [
    {
      "name": "texts",
      "source": "=[$(/document/pages/*)]"
    },
    {
      "name": "input_type",
      "source": "='search_document'"
    },
    {
      "name": "truncate",
      "source": "='NONE'"
    },
    {
      "name": "embedding_types",
      "source": "=['float']"
    }
  ],
  "outputs": [
    {
      "name": "embeddings",
      "targetName": "aml_vector_data"
    }
  ]
}

In addition, the output of the Cohere model isn't the embeddings array directly, but rather a JSON object that contains it. You need to select the array appropriately when mapping it to the index definition via indexProjections or outputFieldMappings. Here's a sample indexProjections payload that implements this mapping.

If you selected a different embedding_types value in your skill definition, change float in the source path to the type you selected.

"indexProjections": {
  "selectors": [
    {
      "targetIndexName": "<YOUR_TARGET_INDEX_NAME_HERE>",
      "parentKeyFieldName": "ParentKey", // Change this to the name of the field in your index definition where the parent key will be stored
      "sourceContext": "/document/pages/*",
      "mappings": [
        {
          "name": "aml_vector", // Change this to the name of the field in your index definition where the Cohere embedding will be stored
          "source": "/document/pages/*/aml_vector_data/float/0"
        }
      ]
    }
  ],
  "parameters": {}
}
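
If you're not chunking and the skill runs once per document (that is, its context is /document rather than /document/pages/*), you can use the indexer's outputFieldMappings instead of index projections. Here's a sketch under that assumption; it reuses the aml_vector_data target name and float embedding type from the skill above, and aml_vector is a placeholder field name.

"outputFieldMappings": [
  {
    "sourceFieldName": "/document/aml_vector_data/float/0",
    "targetFieldName": "aml_vector"
  }
]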

Sample vectorizer payload

The Microsoft Foundry model catalog vectorizer, unlike the AML skill, is tailored to work only with embedding models that are deployable via the model catalog. The main difference is that you don't have to construct the request and response payloads yourself. However, you must provide the modelName, which corresponds to the "Model ID" that you copied after deploying the model.

Here's a sample payload showing how to configure the vectorizer in your index definition, using the properties you copied from Foundry.

For Cohere models, don't add the /v1/embed path to the end of your URL as you did for the skill.

"vectorizers": [
    {
        "name": "<YOUR_VECTORIZER_NAME_HERE>",
        "kind": "aml",
        "amlParameters": {
            "uri": "<YOUR_URL_HERE>",
            "key": "<YOUR_PRIMARY_KEY_HERE>",
            "modelName": "<YOUR_MODEL_ID_HERE>"
        },
    }
]
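
After you assign the vectorizer to a vector field through a vector search profile, text queries against that field are vectorized at query time. Here's a sketch of such a query; the service name, index name, field name, and API version are placeholders to adapt to your setup.

POST https://<YOUR_SEARCH_SERVICE>.search.windows.net/indexes/<YOUR_TARGET_INDEX_NAME_HERE>/docs/search?api-version=2024-05-01-preview
{
  "vectorQueries": [
    {
      "kind": "text",
      "text": "sample query text",
      "fields": "aml_vector",
      "k": 5
    }
  ]
}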

Connect using token authentication

If you can't use key-based authentication, you can configure the AML skill and Microsoft Foundry model catalog vectorizer connection for token authentication via role-based access control on Azure.

Your search service must have a system or user-assigned managed identity, and the identity must have Owner or Contributor permissions for your project. You can then remove the key field from your skill and vectorizer definition, replacing it with resourceId. If your project and search service are in different regions, also provide the region field.

"uri": "<YOUR_URL_HERE>",
"resourceId": "subscriptions/<YOUR_SUBSCRIPTION_ID_HERE>/resourceGroups/<YOUR_RESOURCE_GROUP_NAME_HERE>/providers/Microsoft.MachineLearningServices/workspaces/<YOUR_AML_WORKSPACE_NAME_HERE>/onlineendpoints/<YOUR_AML_ENDPOINT_NAME_HERE>",
"region": "westus", // Only needed if project is in different region from search service

Note

This integration doesn't currently support token authentication for Cohere models. You must use key-based authentication.