Create and improve your custom analyzer in Content Understanding Studio

Content Understanding Studio lets you build content analyzers that extract the content and fields your scenario needs. This guide walks through creating a custom analyzer in the Studio.

Prerequisites

To get started, make sure you have the following resources and permissions:

  • An Azure subscription. If you don't have an Azure subscription, create a free account.
  • Once you have your Azure subscription, create a Microsoft Foundry resource in the Azure portal. Be sure to create it in a supported region.
    • This resource is listed under Foundry > Foundry in the portal.
  • Set up default model deployments for your Content Understanding resource. Setting defaults creates a connection to the Foundry models you use for Content Understanding requests. To set the defaults:
    1. Go to the Content Understanding settings page.
    2. Select the "+ Add resource" button in the upper left.
    3. Select the Foundry resource that you want to use, select Next, and then select Save.
      • Make sure to leave "Enable autodeployment for required models if no defaults are available." checked. This ensures your resource is fully set up with the required GPT-4.1, GPT-4.1-mini, and text-embedding-3-large models. Different prebuilt analyzers require different models.
    By taking these steps, you set up a connection between Content Understanding and Foundry models in your Foundry resource.
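
To confirm that a resource has the three models named above, a minimal sketch like the following can help. It assumes you already have the list of model names deployed in your Foundry resource from somewhere else. The lowercase identifiers and the helper name `missing_models` are illustrative assumptions and are not part of the service.

```python
# Models the article lists as required. The lowercase spellings are an
# assumption about how deployments are named in your resource.
REQUIRED_MODELS = frozenset({"gpt-4.1", "gpt-4.1-mini", "text-embedding-3-large"})


def missing_models(deployed_models):
    """Return the required model names that are absent from deployed_models."""
    # Normalize case and whitespace so "GPT-4.1 " matches "gpt-4.1".
    deployed = {name.strip().lower() for name in deployed_models}
    return sorted(REQUIRED_MODELS - deployed)
```

An empty list means all three required models are present. Any names returned are the ones that autodeployment, or a manual deployment, still needs to provide.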

Log in to Content Understanding Studio

Go to the Content Understanding Studio portal and sign in with your credentials. You might recognize the classic Azure Document Intelligence in Foundry Tools Studio experience. Content Understanding extends the content and field extraction that you know from Document Intelligence across all modalities: document, image, video, and audio. Select the option to try out the new Content Understanding experience to get all of the multimodal capabilities of the service.

Create your custom analyzer

  1. Start with a new project: To get started with creating your custom analyzer, select Create project on the home page.

  2. Select your project type: This guide uses the option Extract content and fields with a custom schema. To learn more about classifying and routing your data, see How to classify and route data with Content Understanding.

  3. Create your project: Give your project a friendly name and select Create.

  4. Upload sample data: Now that your project is configured, you can get started with building your custom analyzer. Upload a sample of your data to the tool, and Content Understanding will classify your data and recommend analyzer templates to give you a starting point.

Screenshot of suggested Content Understanding templates.

  5. Select a scenario template: Select the template that best fits your scenario. You can customize all schema fields in the next step.

  6. Use suggested fields: If your scenario requires custom fields, you can use the AI suggestion feature to analyze your data and propose a full schema of fields you might want to extract. Keep the suggestions that fit and discard the ones that don't.

Screenshot of suggested schemas using AI suggestion tool.

  7. Define your schema: Review the schema fields that were suggested or that came with the template. Use the edit features to add or change fields. You can return to refine the schema after testing and after you build your initial analyzer. When you finish your changes, select Save.

  8. Test your schema: When your schema is ready for testing, select Run analysis to see its output on your data. You can optionally upload more sample data to see how the schema performs.

  9. Iterate on your schema: Repeat steps 6-8 as needed to improve the output of your schema.

  10. Optional: In-context learning (documents only): To further improve the output quality of your schema, you can enable in-context learning, which lets you bring in a knowledge base of data for the model to reference and learn from.

To get started, upload your training data to a blob storage account. Select the “Knowledge” tab, and then select the blob storage container that holds the training dataset of sample documents. Based on the analyzer you just defined, the model assigns labels to your documents. Validate the training data by reviewing the labels, correcting any that are wrong, and adding any that are missing.

  11. Build your analyzer: Once you’re satisfied with the output from your analyzer, select the Build analyzer button at the top of the page. Give the analyzer a name and select Build.

  12. Use your analyzer: After the build succeeds, select Jump to analyzer list to view all built analyzers. Select the analyzer you just created to see a code sample prefilled with your key and endpoint. You can now call the analyzer from your own application through the REST API.
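
To illustrate the Use your analyzer step, the sketch below shows one possible way to call a built analyzer from Python using only the standard library. Several details are assumptions drawn from the preview REST API and may not match your resource: the `/contentunderstanding/analyzers/{id}:analyze` route, the `api-version` value, the `Ocp-Apim-Subscription-Key` header, the `Operation-Location` polling pattern, and the status strings. The function names are hypothetical. Treat the code sample shown in the Studio as the authoritative version.

```python
import json
import time
import urllib.parse
import urllib.request

# Assumption: a preview API version; use the one shown in your Studio sample.
API_VERSION = "2025-05-01-preview"


def build_analyze_request(endpoint, analyzer_id, key, document_url,
                          api_version=API_VERSION):
    """Build (but do not send) the POST request that starts an analysis."""
    path = f"/contentunderstanding/analyzers/{urllib.parse.quote(analyzer_id, safe='')}:analyze"
    query = urllib.parse.urlencode({"api-version": api_version})
    url = f"{endpoint.rstrip('/')}{path}?{query}"
    body = json.dumps({"url": document_url}).encode("utf-8")
    headers = {
        "Ocp-Apim-Subscription-Key": key,  # key from the Studio code sample
        "Content-Type": "application/json",
    }
    return urllib.request.Request(url, data=body, headers=headers, method="POST")


def analyze(endpoint, analyzer_id, key, document_url,
            poll_seconds=2.0, timeout_seconds=120.0):
    """Start an analysis and poll until it finishes, fails, or times out."""
    request = build_analyze_request(endpoint, analyzer_id, key, document_url)
    with urllib.request.urlopen(request) as response:
        # Assumption: the service returns a polling URL in this header.
        operation_url = response.headers["Operation-Location"]

    deadline = time.monotonic() + timeout_seconds
    while time.monotonic() < deadline:
        poll = urllib.request.Request(
            operation_url, headers={"Ocp-Apim-Subscription-Key": key}
        )
        with urllib.request.urlopen(poll) as response:
            result = json.load(response)
        status = str(result.get("status", "")).lower()
        if status == "succeeded":
            return result
        if status == "failed":
            raise RuntimeError(f"Analysis failed: {result}")
        time.sleep(poll_seconds)
    raise TimeoutError("Analysis did not finish before the timeout")
```

Calling `analyze(endpoint, "my-analyzer", key, "https://example.com/sample.pdf")` with the endpoint and key from the Studio would return the parsed JSON result. The analyzer name and document URL here are placeholders. Separating `build_analyze_request` from the network call makes the request easy to inspect before anything is sent.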

Next steps