Find answers to commonly asked questions about Azure Content Understanding.
What is Content Understanding?
Content Understanding is a new Foundry Tool designed to generate structured insights from unstructured content using artificial intelligence. It provides a consistent experience for extracting content or a structured schema from audio, video, images, documents, or text inputs.
How does Content Understanding work?
Content Understanding uses generative AI models to analyze and interpret various forms of unstructured content. It integrates data from different modalities (for example, text, images, and audio) to generate a cohesive, structured output. The service combines machine learning models trained on diverse datasets with generative AI models to produce accurate and relevant insights.
What types of unstructured content can Content Understanding process?
Content Understanding can process a wide range of unstructured content, including but not limited to:
- Audio recordings
- Video content
- Documents
- Text content
- Images
What are the key benefits of using Content Understanding?
The key benefits of using Content Understanding include:
- Confidence scores: Ensure the accuracy of extracted values while minimizing the cost of human review.
- Defined schema: Define a schema to ensure the extracted values align with intended use.
- Grounding: Trace every extracted or generated field to its source location in the document.
- In-context learning: Improve extraction quality on new templates by providing a few labeled examples without retraining.
- Quality improvements over time: The service provides capabilities to improve the quality of schema extraction.
- Improved decision-making: Structured insights help organizations make informed decisions quickly and effectively.
- Increased efficiency: Automating the analysis of unstructured content saves time and reduces the manual effort required.
- Scalability: The service can handle large volumes of data, making it suitable for organizations of all sizes.
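As an illustration of how confidence scores and grounding can reduce human review, the following minimal sketch routes extracted fields into "auto-accept" and "needs review" groups. The `ExtractedField` shape, the 0.8 threshold, and the sample values are hypothetical assumptions made for this example; they are not the service's actual response format.

```python
from dataclasses import dataclass


# Hypothetical shape of one extracted field; the real response format may differ.
@dataclass
class ExtractedField:
    name: str
    value: str
    confidence: float  # assumed to be in the range 0.0 to 1.0
    source: str        # assumed grounding reference, such as a page label


def split_by_confidence(fields, threshold=0.8):
    """Return (auto_accepted, needs_review) using an assumed threshold."""
    accepted = [f for f in fields if f.confidence >= threshold]
    review = [f for f in fields if f.confidence < threshold]
    return accepted, review


# Sample values invented for illustration only.
fields = [
    ExtractedField("invoice_number", "INV-1001", 0.97, "page 1"),
    ExtractedField("total_amount", "1,250.00", 0.62, "page 2"),
]

accepted, review = split_by_confidence(fields)
print([f.name for f in accepted])  # ['invoice_number']
print([f.name for f in review])    # ['total_amount']
```

In practice, the threshold would be tuned per field against the cost of a wrong value, and the grounding reference would let a reviewer jump straight to the relevant location in the source content.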
How can businesses use Content Understanding?
Businesses can use Content Understanding in various ways, such as:
- Automation: Automate content processing to extract a defined schema, for example in call center, document processing, and similar scenarios.
- Content cataloging: Manage a large corpus of digital assets.
- Customer sentiment analysis: Understand customer feedback from reviews, social media, and support interactions.
- Market research: Analyze trends and patterns from diverse data sources to inform business strategies.
- Operational insights: Gain insights from internal documents, emails, and other unstructured data to improve operations.
Is Content Understanding easy to integrate with existing systems?
Yes, Content Understanding easily integrates with existing systems and workflows. For example:
- Azure AI Search
- Microsoft Fabric
- Foundry Agent Service
- Azure Logic Apps
The service offers a set of easy-to-use APIs that can be integrated into any application. See code samples on GitHub.
What security measures are in place to protect data processed by Content Understanding?
Foundry Tools, including Content Understanding, adhere to strict security and compliance standards to ensure data protection. These measures include data encryption, secure access controls, and compliance with industry regulations such as GDPR and HIPAA. The service also adheres to Microsoft’s principles for the responsible use of AI.
What base models does Azure Content Understanding use?
Content Understanding uses a combination of models to process your content:
- Foundry models: You bring your own deployments of large language models (LLMs) and embedding models from Foundry. Content Understanding supports the GPT-4o and GPT-4.1 model families and OpenAI embedding models. See the Model deployments article for the complete list of supported models.
- Other base models: Content Understanding also uses various capabilities including Speech, Vision, and Language services to support content extraction and processing across different modalities.
What are the pricing tier options for Content Understanding?
Content Understanding uses a transparent, usage-based pricing model with two main charge categories:
- Content extraction: Charges per unit of input processed (per 1,000 pages for documents, per minute for audio/video).
- Generative features: When using AI-powered features, you incur contextualization charges (fixed rate per content unit) plus token-based charges from your Microsoft Foundry model deployments (input/output tokens and embeddings).
For detailed pricing information, examples, and cost optimization tips, see the Pricing explainer and Content Understanding pricing page.
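To show how the two charge categories combine, here is a rough sketch of a cost estimate for a document workload. All rates below are placeholder numbers invented for this example, not actual prices, and the sketch assumes the contextualization "content unit" for documents is 1,000 pages. The function name `estimate_cost` and the rate keys are likewise hypothetical.

```python
def estimate_cost(pages, input_tokens, output_tokens, embedding_tokens, rates):
    """Estimate charges for a document workload from placeholder rates."""
    # Content extraction: charged per 1,000 pages for documents.
    extraction = pages / 1000 * rates["extraction_per_1000_pages"]
    # Contextualization: fixed rate per content unit (assumed: 1,000 pages).
    contextualization = pages / 1000 * rates["contextualization_per_1000_pages"]
    # Token charges come from your own model deployments.
    tokens = (
        input_tokens / 1_000_000 * rates["input_per_million_tokens"]
        + output_tokens / 1_000_000 * rates["output_per_million_tokens"]
        + embedding_tokens / 1_000_000 * rates["embedding_per_million_tokens"]
    )
    return {
        "extraction": extraction,
        "contextualization": contextualization,
        "tokens": tokens,
        "total": extraction + contextualization + tokens,
    }


# Placeholder rates for illustration only; check the pricing page for real values.
PLACEHOLDER_RATES = {
    "extraction_per_1000_pages": 5.0,
    "contextualization_per_1000_pages": 1.0,
    "input_per_million_tokens": 2.0,
    "output_per_million_tokens": 8.0,
    "embedding_per_million_tokens": 0.5,
}

estimate = estimate_cost(
    pages=2000,
    input_tokens=3_000_000,
    output_tokens=500_000,
    embedding_tokens=2_000_000,
    rates=PLACEHOLDER_RATES,
)
print(estimate)
```

If you use content extraction only, without generative features, the contextualization and token terms would drop out of the estimate.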
How do the face capabilities in Content Understanding differ from the Azure AI Face service?
In the GA API version (2025-11-01), Content Understanding provides face-related capabilities focused on privacy and description rather than identification:
- Face blurring: Automatically blurs faces in video and image content by default to protect privacy.
- Face description: Uses generative models to produce textual descriptions of faces in your content, including attributes, characteristics, and celebrity identification.
Content Understanding doesn't include the full Azure AI Face service features such as face recognition, verification, identification, or person directory capabilities in this API version.