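Tutorial: Verbalize images using generative AI

Azure AI Search can extract and index both text and images from PDF documents stored in Azure Blob Storage. This tutorial shows you how to build a multimodal indexing pipeline that chunks data using the built-in Text Split skill and uses image verbalization to describe images. Cropped images are stored in a knowledge store, and visual content is described in natural language and ingested alongside text in a searchable index.

To get image verbalizations, each extracted image is passed to the GenAI Prompt skill (preview), which calls a chat completion model to generate a concise textual description. These descriptions, along with the original document text, are then embedded into vector representations using the Azure OpenAI text-embedding-3-large model. The result is a single index containing searchable content from both modalities: text and verbalized images.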

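In this tutorial, you use:

A 36-page PDF document that combines rich visual content, such as charts, infographics, and scanned pages, with traditional text.
An indexer and skillset to create an indexing pipeline that includes AI enrichment through skills.
The Document Extraction skill for extracting normalized images and text. The Text Split skill chunks the data.
The GenAI Prompt skill (preview), which calls a chat completion model to create descriptions of visual content.
A search index configured to store text and image verbalizations. Some content is vectorized for vector-based similarity search.

This tutorial demonstrates a lower-cost approach for indexing multimodal content using the Document Extraction skill and image captioning. It enables extraction and search over both text and images from documents in Azure Blob Storage. However, it doesn't include location metadata for text, such as page numbers or bounding regions. For a more comprehensive solution that includes structured text layout and spatial metadata, see "Tutorial: Verbalize images from a structured document layout".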

Note

Image extraction by the Document Extraction skill isn't free. Setting imageAction to generateNormalizedImages in the skillset triggers image extraction, which is an extra charge. For billing information, see the Azure AI Search pricing page.

Prerequisites

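Azure AI Search. Configure your search service for role-based access control and a managed identity. Your service must be on the Basic tier or higher. This tutorial isn't supported on the Free tier.
Azure Storage, used for storing sample data and for creating a knowledge store.
Azure OpenAI with the following deployments:
- A chat completion model hosted in Microsoft Foundry or another source. The model is used to verbalize image content. You provide the URI to the hosted model in the GenAI Prompt skill definition. You can use any chat completion model.
- A text embedding model deployed in Foundry. The model is used to vectorize text content pulled from source documents and the image descriptions generated by the chat completion model. For integrated vectorization, the embedding model must be located in Foundry, and it must be text-embedding-ada-002, text-embedding-3-large, or text-embedding-3-small. If you want to use an external embedding model, use a custom skill instead of the Azure OpenAI Embedding skill.
Visual Studio Code with a REST client.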

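Prepare data

The following instructions apply to Azure Storage, which provides the sample data and also hosts the knowledge store. The search service identity needs read access to Azure Storage to retrieve the sample data, and write access to create the knowledge store. The search service creates the container for cropped images during skillset processing, using the name you provide in the @imageProjectionContainer variable.

Download the following sample PDF: sustainable-ai-pdf
In Azure Storage, create a new container named sustainable-ai-pdf.
Upload the sample data file.
Create role assignments and specify a managed identity in a connection string:
1. Assign Storage Blob Data Reader for data retrieval by the indexer. Assign Storage Blob Data Contributor and Storage Table Data Contributor to create and load the knowledge store. You can use either a system-assigned or a user-assigned managed identity for your search service role assignment.
2. For connections made using a system-assigned managed identity, get a connection string that contains a ResourceId, with no account key or password. The ResourceId must include the subscription ID of the storage account, the resource group of the storage account, and the storage account name. The connection string resembles the following example: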
```
"credentials" : { 
    "connectionString" : "ResourceId=/subscriptions/00000000-0000-0000-0000-00000000/resourceGroups/MY-DEMO-RESOURCE-GROUP/providers/Microsoft.Storage/storageAccounts/MY-DEMO-STORAGE-ACCOUNT/;" 
}
```
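3. For connections made using a user-assigned managed identity, get a connection string that contains a ResourceId, with no account key or password. The ResourceId must include the subscription ID of the storage account, the resource group of the storage account, and the storage account name. Specify the identity using the syntax shown in the following example, setting userAssignedIdentity to the user-assigned managed identity. The connection string resembles the following example: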
```
"credentials" : { 
    "connectionString" : "ResourceId=/subscriptions/00000000-0000-0000-0000-00000000/resourceGroups/MY-DEMO-RESOURCE-GROUP/providers/Microsoft.Storage/storageAccounts/MY-DEMO-STORAGE-ACCOUNT/;" 
},
"identity" : { 
    "@odata.type": "#Microsoft.Azure.Search.DataUserAssignedIdentity",
    "userAssignedIdentity" : "/subscriptions/00000000-0000-0000-0000-00000000/resourcegroups/MY-DEMO-RESOURCE-GROUP/providers/Microsoft.ManagedIdentity/userAssignedIdentities/MY-DEMO-USER-MANAGED-IDENTITY" 
}
```
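The two connection examples above follow a fixed pattern, so they can be assembled from their three parts. The following Python sketch is illustrative only: the function names `storage_connection_string` and `data_source_auth` are hypothetical helpers, not part of any Azure SDK, and the output simply mirrors the JSON fragments shown above.

```python
from typing import Optional


def storage_connection_string(subscription_id: str, resource_group: str,
                              account_name: str) -> str:
    """Build a key-less ResourceId connection string like the examples above."""
    return (
        f"ResourceId=/subscriptions/{subscription_id}"
        f"/resourceGroups/{resource_group}"
        f"/providers/Microsoft.Storage/storageAccounts/{account_name}/;"
    )


def data_source_auth(connection_string: str,
                     user_assigned_identity: Optional[str] = None) -> dict:
    """Return the credentials block; add an identity block only for a
    user-assigned managed identity (system-assigned needs none)."""
    auth = {"credentials": {"connectionString": connection_string}}
    if user_assigned_identity is not None:
        auth["identity"] = {
            "@odata.type": "#Microsoft.Azure.Search.DataUserAssignedIdentity",
            "userAssignedIdentity": user_assigned_identity,
        }
    return auth
```

Passing only the connection string yields the system-assigned form; passing the resource ID of a user-assigned identity as the second argument yields the second form.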

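Prepare models

This tutorial assumes you have an existing Azure OpenAI resource through which the skills call the text embedding model and the chat completion model. The search service connects to the models during skillset processing and during query execution using its managed identity. This section gives you guidance and links for assigning roles for authorized access.

Sign in to the Azure portal (not the Foundry portal) and find the Azure OpenAI resource.
Select [Access control (IAM)].
Select [Add], and then select [Add role assignment].
Search for Cognitive Services OpenAI User, and then select it.
Select [Managed identity], and then assign your search service managed identity.

For more information, see "Role-based access control for Azure OpenAI in Foundry Models".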

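Set up your REST file

For this tutorial, your local REST client connection to Azure AI Search requires an endpoint and an API key. You can get these values from the Azure portal. For alternative connection methods, see "Connect to a search service".

For the authenticated connections that occur during indexer and skillset processing, the search service uses the role assignments you previously defined.

Start Visual Studio Code and create a new file.

Provide values for the variables used in the requests. For @storageConnection, make sure your connection string doesn't have a trailing semicolon or quotation marks. For @imageProjectionContainer, provide a container name that's unique in blob storage. Azure AI Search creates this container for you during skills processing.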

@searchUrl = PUT-YOUR-SEARCH-SERVICE-ENDPOINT-HERE
@searchApiKey = PUT-YOUR-ADMIN-API-KEY-HERE
@storageConnection = PUT-YOUR-STORAGE-CONNECTION-STRING-HERE
@openAIResourceUri = PUT-YOUR-OPENAI-URI-HERE
@openAIKey = PUT-YOUR-OPENAI-KEY-HERE
@chatCompletionResourceUri = PUT-YOUR-CHAT-COMPLETION-URI-HERE
@chatCompletionKey = PUT-YOUR-CHAT-COMPLETION-KEY-HERE
@imageProjectionContainer=sustainable-ai-pdf-images

Save the file using a .rest or .http file extension. For help with the REST client, see "Quickstart: Full-text search using REST".

To get the Azure AI Search endpoint and API key:

Sign in to the Azure portal, go to the search service [Overview] page, and copy the URL. An example endpoint might look like https://mydemo.search.windows.net.
Under [Settings] > [Keys], copy an admin key. Admin keys are used to add, modify, and delete objects. There are two interchangeable admin keys. Copy either one.
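If you prefer to script the requests instead of using a REST client, the variable values need the same care. The sketch below is a minimal Python illustration, assuming the same endpoint, admin key, and api-version used in this tutorial; `clean_connection_string` and `search_request` are hypothetical helper names.

```python
def clean_connection_string(value: str) -> str:
    """Strip surrounding quotation marks and a trailing semicolon, as the
    guidance for @storageConnection asks."""
    return value.strip().strip("\"'").rstrip(";")


def search_request(search_url: str, api_key: str, path: str,
                   api_version: str = "2025-11-01-preview"):
    """Return the URL and headers shared by the requests in this tutorial."""
    url = f"{search_url.rstrip('/')}/{path.lstrip('/')}?api-version={api_version}"
    headers = {"Content-Type": "application/json", "api-key": api_key}
    return url, headers
```

For example, `search_request("https://mydemo.search.windows.net", key, "datasources")` produces the URL that the data source request targets.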

Create a data source

Create Data Source (REST) creates a data source connection that specifies what data to index.

### Create a data source
POST {{searchUrl}}/datasources?api-version=2025-11-01-preview   HTTP/1.1
  Content-Type: application/json
  api-key: {{searchApiKey}}

  {
    "name": "doc-extraction-image-verbalization-ds",
    "description": null,
    "type": "azureblob",
    "subtype": null,
    "credentials":{
      "connectionString":"{{storageConnection}}"
    },
    "container": {
      "name": "sustainable-ai-pdf",
      "query": null
    },
    "dataChangeDetectionPolicy": null,
    "dataDeletionDetectionPolicy": null,
    "encryptionKey": null,
    "identity": null
  }

Send the request. The response should look like:

HTTP/1.1 201 Created
Transfer-Encoding: chunked
Content-Type: application/json; odata.metadata=minimal; odata.streaming=true; charset=utf-8
Location: https://<YOUR-SEARCH-SERVICE-NAME>.search.windows-int.net:443/datasources('doc-extraction-image-verbalization-ds')?api-version=2025-11-01-preview
Server: Microsoft-IIS/10.0
Strict-Transport-Security: max-age=2592000, max-age=15724800; includeSubDomains
Preference-Applied: odata.include-annotations="*"
OData-Version: 4.0
request-id: 4eb8bcc3-27b5-44af-834e-295ed078e8ed
elapsed-time: 346
Date: Sat, 26 Apr 2025 21:25:24 GMT
Connection: close

{
  "name": "doc-extraction-image-verbalization-ds",
  "description": null,
  "type": "azureblob",
  "subtype": null,
  "indexerPermissionOptions": [],
  "credentials": {
    "connectionString": null
  },
  "container": {
    "name": "sustainable-ai-pdf",
    "query": null
  },
  "dataChangeDetectionPolicy": null,
  "dataDeletionDetectionPolicy": null,
  "encryptionKey": null,
  "identity": null
}
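The same data source request can be issued from a script. The following Python sketch uses only the standard library; it is an illustration, not the tutorial's method. The function names are hypothetical, and it assumes that the optional properties the tutorial sets to null can simply be left out of the body.

```python
import json
import urllib.request


def build_data_source_request(search_url: str, api_key: str,
                              storage_connection: str,
                              api_version: str = "2025-11-01-preview"):
    """Build (but do not send) the POST request that creates the data source."""
    body = {
        "name": "doc-extraction-image-verbalization-ds",
        "type": "azureblob",
        "credentials": {"connectionString": storage_connection},
        "container": {"name": "sustainable-ai-pdf"},
    }
    return urllib.request.Request(
        url=f"{search_url}/datasources?api-version={api_version}",
        data=json.dumps(body).encode("utf-8"),
        headers={"Content-Type": "application/json", "api-key": api_key},
        method="POST",
    )


def send(request: urllib.request.Request):
    """Send the request and return (status code, parsed JSON body).
    urlopen raises urllib.error.HTTPError for 4xx and 5xx responses."""
    with urllib.request.urlopen(request) as response:
        return response.status, json.loads(response.read().decode("utf-8"))
```

Calling `send(build_data_source_request(...))` should return status 201 and a body echoing the data source definition, as in the response above.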

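Create an index

Create Index (REST) creates a search index on your search service. An index specifies all the fields and their attributes.

For nested JSON, the index fields must be identical to the source fields. Currently, Azure AI Search doesn't support field mappings to nested JSON, so field names and data types must match completely. The following index aligns to the JSON elements in the raw content.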

### Create an index
POST {{searchUrl}}/indexes?api-version=2025-11-01-preview   HTTP/1.1
  Content-Type: application/json
  api-key: {{searchApiKey}}

{
    "name": "doc-extraction-image-verbalization-index",
    "fields": [
        {
            "name": "content_id",
            "type": "Edm.String",
            "retrievable": true,
            "key": true,
            "analyzer": "keyword"
        },
        {
            "name": "text_document_id",
            "type": "Edm.String",
            "searchable": false,
            "filterable": true,
            "retrievable": true,
            "stored": true,
            "sortable": false,
            "facetable": false
        },          
        {
            "name": "document_title",
            "type": "Edm.String",
            "searchable": true
        },
        {
            "name": "image_document_id",
            "type": "Edm.String",
            "filterable": true,
            "retrievable": true
        },
        {
            "name": "content_text",
            "type": "Edm.String",
            "searchable": true,
            "retrievable": true
        },
        {
            "name": "content_embedding",
            "type": "Collection(Edm.Single)",
            "dimensions": 3072,
            "searchable": true,
            "retrievable": true,
            "vectorSearchProfile": "hnsw"
        },
        {
            "name": "content_path",
            "type": "Edm.String",
            "searchable": false,
            "retrievable": true
        },
        {
            "name": "offset",
            "type": "Edm.String",
            "searchable": false,
            "retrievable": true
        },
        {
            "name": "location_metadata",
            "type": "Edm.ComplexType",
            "fields": [
                {
                "name": "page_number",
                "type": "Edm.Int32",
                "searchable": false,
                "retrievable": true
                },
                {
                "name": "bounding_polygons",
                "type": "Edm.String",
                "searchable": false,
                "retrievable": true,
                "filterable": false,
                "sortable": false,
                "facetable": false
                }
            ]
        }         
    ],
    "vectorSearch": {
        "profiles": [
            {
                "name": "hnsw",
                "algorithm": "defaulthnsw",
                "vectorizer": "demo-vectorizer"
            }
        ],
        "algorithms": [
            {
                "name": "defaulthnsw",
                "kind": "hnsw",
                "hnswParameters": {
                    "m": 4,
                    "efConstruction": 400,
                    "metric": "cosine"
                }
            }
        ],
        "vectorizers": [
            {
              "name": "demo-vectorizer",
              "kind": "azureOpenAI",
              "azureOpenAIParameters": {
                "resourceUri": "{{openAIResourceUri}}",
                "deploymentId": "text-embedding-3-large",
                "apiKey": "{{openAIKey}}",
                "modelName": "text-embedding-3-large"
              }
            }
        ]
    },
    "semantic": {
        "defaultConfiguration": "semanticconfig",
        "configurations": [
            {
                "name": "semanticconfig",
                "prioritizedFields": {
                    "titleField": {
                        "fieldName": "document_title"
                    },
                    "prioritizedContentFields": [
                    ],
                    "prioritizedKeywordsFields": []
                }
            }
        ]
    }
}

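Key points:

Text and image embeddings are stored in the content_embedding field, which must be configured with appropriate dimensions, such as 3072, and a vector search profile.
location_metadata captures the bounding polygon and page number metadata for each normalized image, enabling precise spatial searches or UI overlays. In this scenario, location_metadata exists only for images. If you also want to capture location metadata for text, consider using the Document Layout skill, as described in "Tutorial: Verbalize images from a structured document layout".
For more information about vector search, see "Vectors in Azure AI Search".
For more information about semantic ranking, see "Semantic ranking in Azure AI Search".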
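The first key point is easy to get wrong when editing the index definition by hand. The following Python sketch checks an index definition (loaded as a dictionary) for consistent vector settings before you send it. The function name `check_vector_fields` is hypothetical, and the default of 3072 dimensions is an assumption taken from the values used in this tutorial.

```python
def check_vector_fields(index: dict, expected_dimensions: int = 3072) -> list:
    """Return a list of problems; an empty list means the checks passed."""
    vector_search = index.get("vectorSearch", {})
    algorithms = {a["name"] for a in vector_search.get("algorithms", [])}
    vectorizers = {v["name"] for v in vector_search.get("vectorizers", [])}
    profiles = {p["name"]: p for p in vector_search.get("profiles", [])}

    problems = []
    # Every profile must point at an algorithm (and vectorizer) that exists.
    for name, profile in profiles.items():
        if profile.get("algorithm") not in algorithms:
            problems.append(f"profile '{name}' references an unknown algorithm")
        if "vectorizer" in profile and profile["vectorizer"] not in vectorizers:
            problems.append(f"profile '{name}' references an unknown vectorizer")

    # Every vector field needs a known profile and the expected dimensions.
    for field in index.get("fields", []):
        if field.get("type") != "Collection(Edm.Single)":
            continue  # not a vector field
        if field.get("vectorSearchProfile") not in profiles:
            problems.append(f"field '{field['name']}' references an unknown profile")
        if field.get("dimensions") != expected_dimensions:
            problems.append(
                f"field '{field['name']}' has {field.get('dimensions')} "
                f"dimensions, expected {expected_dimensions}"
            )
    return problems
```

Run against the index definition above, the function should return an empty list, because content_embedding uses 3072 dimensions and the hnsw profile resolves to defaulthnsw and demo-vectorizer.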

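Create a skillset

Create Skillset (REST) creates a skillset on your search service. A skillset defines the operations that chunk and embed content prior to indexing. This skillset uses the built-in Document Extraction skill to extract text and images, the Text Split skill to chunk large text, and the Azure OpenAI Embedding skill to vectorize text content.

The skillset also performs actions specific to images. It uses the GenAI Prompt skill to generate image descriptions. It also creates a knowledge store that stores intact images so that you can return them in a query.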
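To build intuition for the chunking settings used by the Text Split skill in the skillset (maximumPageLength of 2000 and pageOverlapLength of 200, measured in characters), here is a deliberately simplified Python sketch. It cuts fixed-size character windows that overlap; it is an assumption-laden approximation and not the skill's actual algorithm, which may choose different break points.

```python
def split_pages(text: str, maximum_page_length: int = 2000,
                page_overlap_length: int = 200) -> list:
    """Cut text into fixed-size windows where each window repeats the last
    page_overlap_length characters of the previous one."""
    if page_overlap_length >= maximum_page_length:
        raise ValueError("overlap must be smaller than the page length")
    step = maximum_page_length - page_overlap_length
    pages = []
    start = 0
    while start < len(text):
        pages.append(text[start:start + maximum_page_length])
        if start + maximum_page_length >= len(text):
            break  # this window already reached the end of the text
        start += step
    return pages
```

With these defaults, a 5,000-character input yields three chunks, and each chunk after the first begins with the final 200 characters of the previous chunk. The overlap helps preserve context that would otherwise be cut at a chunk boundary.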

### Create a skillset
POST {{searchUrl}}/skillsets?api-version=2025-11-01-preview   HTTP/1.1
  Content-Type: application/json
  api-key: {{searchApiKey}}

{
  "name": "doc-extraction-image-verbalization-skillset",
  "description": "A test skillset",
  "skills": [
    {
      "@odata.type": "#Microsoft.Skills.Util.DocumentExtractionSkill",
      "name": "document-extraction-skill",
      "description": "Document extraction skill to extract text and images from documents",
      "parsingMode": "default",
      "dataToExtract": "contentAndMetadata",
      "configuration": {
          "imageAction": "generateNormalizedImages",
          "normalizedImageMaxWidth": 2000,
          "normalizedImageMaxHeight": 2000
      },
      "context": "/document",
      "inputs": [
        {
          "name": "file_data",
          "source": "/document/file_data"
        }
      ],
      "outputs": [
        {
          "name": "content",
          "targetName": "extracted_content"
        },
        {
          "name": "normalized_images",
          "targetName": "normalized_images"
        }
      ]
    },
    {
      "@odata.type": "#Microsoft.Skills.Text.SplitSkill",
      "name": "split-skill",
      "description": "Split skill to chunk documents",
      "context": "/document",
      "defaultLanguageCode": "en",
      "textSplitMode": "pages",
      "maximumPageLength": 2000,
      "pageOverlapLength": 200,
      "unit": "characters",
      "inputs": [
        {
          "name": "text",
          "source": "/document/extracted_content",
          "inputs": []
        }
      ],
      "outputs": [
        {
          "name": "textItems",
          "targetName": "pages"
        }
      ]
    }, 
    {
    "@odata.type": "#Microsoft.Skills.Text.AzureOpenAIEmbeddingSkill",
    "name": "text-embedding-skill",
    "description": "Embedding skill for text",
    "context": "/document/pages/*",
    "inputs": [
        {
        "name": "text",
        "source": "/document/pages/*"
        }
    ],
    "outputs": [
        {
        "name": "embedding",
        "targetName": "text_vector"
        }
    ],
    "resourceUri": "{{openAIResourceUri}}",
    "deploymentId": "text-embedding-3-large",
    "apiKey": "{{openAIKey}}",
    "dimensions": 3072,
    "modelName": "text-embedding-3-large"
    },
    {
    "@odata.type": "#Microsoft.Skills.Custom.ChatCompletionSkill",
    "name": "genAI-prompt-skill",
    "description": "GenAI Prompt skill for image verbalization",
    "uri": "{{chatCompletionResourceUri}}",
    "timeout": "PT1M",
    "apiKey": "{{chatCompletionKey}}",
    "context": "/document/normalized_images/*",
    "inputs": [
        {
        "name": "systemMessage",
        "source": "='You are tasked with generating concise, accurate descriptions of images, figures, diagrams, or charts in documents. The goal is to capture the key information and meaning conveyed by the image without including extraneous details like style, colors, visual aesthetics, or size.\n\nInstructions:\nContent Focus: Describe the core content and relationships depicted in the image.\n\nFor diagrams, specify the main elements and how they are connected or interact.\nFor charts, highlight key data points, trends, comparisons, or conclusions.\nFor figures or technical illustrations, identify the components and their significance.\nClarity & Precision: Use concise language to ensure clarity and technical accuracy. Avoid subjective or interpretive statements.\n\nAvoid Visual Descriptors: Exclude details about:\n\nColors, shading, and visual styles.\nImage size, layout, or decorative elements.\nFonts, borders, and stylistic embellishments.\nContext: If relevant, relate the image to the broader content of the technical document or the topic it supports.\n\nExample Descriptions:\nDiagram: \"A flowchart showing the four stages of a machine learning pipeline: data collection, preprocessing, model training, and evaluation, with arrows indicating the sequential flow of tasks.\"\n\nChart: \"A bar chart comparing the performance of four algorithms on three datasets, showing that Algorithm A consistently outperforms the others on Dataset 1.\"\n\nFigure: \"A labeled diagram illustrating the components of a transformer model, including the encoder, decoder, self-attention mechanism, and feedforward layers.\"'"
        },
        {
        "name": "userMessage",
        "source": "='Please describe this image.'"
        },
        {
        "name": "image",
        "source": "/document/normalized_images/*/data"
        }
        ],
        "outputs": [
            {
            "name": "response",
            "targetName": "verbalizedImage"
            }
        ]
    },    
    {
    "@odata.type": "#Microsoft.Skills.Text.AzureOpenAIEmbeddingSkill",
    "name": "verbalized-image-embedding-skill",
    "description": "Embedding skill for verbalized images",
    "context": "/document/normalized_images/*",
    "inputs": [
        {
        "name": "text",
        "source": "/document/normalized_images/*/verbalizedImage",
        "inputs": []
        }
    ],
    "outputs": [
        {
        "name": "embedding",
        "targetName": "verbalizedImage_vector"
        }
    ],
    "resourceUri": "{{openAIResourceUri}}",
    "deploymentId": "text-embedding-3-large",
    "apiKey": "{{openAIKey}}",
    "dimensions": 3072,
    "modelName": "text-embedding-3-large"
    },
    {
      "@odata.type": "#Microsoft.Skills.Util.ShaperSkill",
      "name": "shaper-skill",
      "description": "Shaper skill to reshape the data to fit the index schema",
      "context": "/document/normalized_images/*",
      "inputs": [
        {
          "name": "normalized_images",
          "source": "/document/normalized_images/*",
          "inputs": []
        },
        {
          "name": "imagePath",
          "source": "='{{imageProjectionContainer}}/'+$(/document/normalized_images/*/imagePath)",
          "inputs": []
        },
        {
          "name": "location_metadata",
          "sourceContext": "/document/normalized_images/*",
          "inputs": [
            {
              "name": "page_number",
              "source": "/document/normalized_images/*/pageNumber"
            },
            {
              "name": "bounding_polygons",
              "source": "/document/normalized_images/*/boundingPolygon"
            }              
          ]
        }        
      ],
      "outputs": [
        {
          "name": "output",
          "targetName": "new_normalized_images"
        }
      ]
    }      
  ], 
   "indexProjections": {
      "selectors": [
        {
          "targetIndexName": "doc-extraction-image-verbalization-index",
          "parentKeyFieldName": "text_document_id",
          "sourceContext": "/document/pages/*",
          "mappings": [    
            {
            "name": "content_embedding",
            "source": "/document/pages/*/text_vector"
            },                      
            {
              "name": "content_text",
              "source": "/document/pages/*"
            },             
            {
              "name": "document_title",
              "source": "/document/document_title"
            }   
          ]
        },        
        {
          "targetIndexName": "doc-extraction-image-verbalization-index",
          "parentKeyFieldName": "image_document_id",
          "sourceContext": "/document/normalized_images/*",
          "mappings": [    
            {
            "name": "content_text",
            "source": "/document/normalized_images/*/verbalizedImage"
            },  
            {
            "name": "content_embedding",
            "source": "/document/normalized_images/*/verbalizedImage_vector"
            },                                           
            {
              "name": "content_path",
              "source": "/document/normalized_images/*/new_normalized_images/imagePath"
            },                    
            {
              "name": "document_title",
              "source": "/document/document_title"
            },
            {
              "name": "location_metadata",
              "source": "/document/normalized_images/*/new_normalized_images/location_metadata"
            }            
          ]
        }
      ],
      "parameters": {
        "projectionMode": "skipIndexingParentDocuments"
      }
  },  
  "knowledgeStore": {
    "storageConnectionString": "{{storageConnection}}",
    "identity": null,
    "projections": [
      {
        "files": [
          {
            "storageContainer": "{{imageProjectionContainer}}",
            "source": "/document/normalized_images/*"
          }
        ]
      }
    ]
  }
}

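This skillset extracts text and images, vectorizes both, and shapes the image metadata for projection into the index.

Key points:

The content_text field is populated in two ways:
- From document text extracted by the Document Extraction skill and chunked by the Text Split skill
- From image content, using the GenAI Prompt skill to generate a descriptive caption for each normalized image
The content_embedding field contains 3072-dimensional embeddings for both page text and verbalized image descriptions. These are generated using the Azure OpenAI text-embedding-3-large model.
content_path contains the relative path to the image file within the designated image projection container. This field is generated only for images extracted from PDFs when imageAction is set to generateNormalizedImages, and it can be mapped from the enriched document using the source field /document/normalized_images/*/imagePath.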
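The two selectors under indexProjections mean that one source PDF fans out into many search documents: one per text chunk and one per image. The following Python sketch models that fan-out with plain dictionaries. It is a conceptual illustration only: the function name and the input shapes are hypothetical, and the service-generated content_id key is not modeled.

```python
def project_to_index(parent_key: str, title: str, pages: list, images: list) -> list:
    """Mimic the two index projection selectors defined in the skillset."""
    documents = []
    # First selector: one search document per text chunk.
    for page in pages:
        documents.append({
            "text_document_id": parent_key,
            "document_title": title,
            "content_text": page["text"],
            "content_embedding": page["text_vector"],
        })
    # Second selector: one search document per normalized image.
    for image in images:
        documents.append({
            "image_document_id": parent_key,
            "document_title": title,
            "content_text": image["verbalizedImage"],
            "content_embedding": image["verbalizedImage_vector"],
            "content_path": image["imagePath"],
            "location_metadata": image["location_metadata"],
        })
    return documents
```

Both kinds of document share content_text and content_embedding, which is why a single query can match text chunks and image descriptions alike. Only image documents carry content_path and location_metadata.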

Create and run an indexer

Create Indexer (REST) creates an indexer on your search service. An indexer connects to the data source, loads data, runs a skillset, and indexes the enriched data.

### Create and run an indexer
POST {{searchUrl}}/indexers?api-version=2025-11-01-preview   HTTP/1.1
  Content-Type: application/json
  api-key: {{searchApiKey}}

{
  "name": "doc-extraction-image-verbalization-indexer",
  "dataSourceName": "doc-extraction-image-verbalization-ds",
  "targetIndexName": "doc-extraction-image-verbalization-index",
  "skillsetName": "doc-extraction-image-verbalization-skillset",
  "parameters": {
    "maxFailedItems": -1,
    "maxFailedItemsPerBatch": 0,
    "batchSize": 1,
    "configuration": {
      "allowSkillsetToReadFileData": true
    }
  },
  "fieldMappings": [
    {
      "sourceFieldName": "metadata_storage_name",
      "targetFieldName": "document_title"
    }
  ],
  "outputFieldMappings": []
}

Run queries

You can start searching as soon as the first document is loaded.

### Query the index
POST {{searchUrl}}/indexes/doc-extraction-image-verbalization-index/docs/search?api-version=2025-11-01-preview   HTTP/1.1
  Content-Type: application/json
  api-key: {{searchApiKey}}
  
  {
    "search": "*",
    "count": true
  }

Send the request. This is an unspecified full-text search query that returns all of the fields marked as retrievable in the index, along with a document count. The response should look like:

HTTP/1.1 200 OK
Transfer-Encoding: chunked
Content-Type: application/json; odata.metadata=minimal; odata.streaming=true; charset=utf-8
Content-Encoding: gzip
Vary: Accept-Encoding
Server: Microsoft-IIS/10.0
Strict-Transport-Security: max-age=2592000, max-age=15724800; includeSubDomains
Preference-Applied: odata.include-annotations="*"
OData-Version: 4.0
request-id: 712ca003-9493-40f8-a15e-cf719734a805
elapsed-time: 198
Date: Wed, 30 Apr 2025 23:20:53 GMT
Connection: close

{
  "@odata.count": 100,
  "@search.nextPageParameters": {
    "search": "*",
    "count": true,
    "skip": 50
  },
  "value": [
  ],
  "@odata.nextLink": "https://<YOUR-SEARCH-SERVICE-NAME>.search.windows.net/indexes/doc-extraction-image-verbalization-index/docs/search?api-version=2025-11-01-preview "
}

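The count shows that 100 documents match the query. Because the response also includes @search.nextPageParameters with skip set to 50, the results are paged, and this response holds only the first page.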
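Paged responses like the one above can be followed in a loop. The following Python sketch assumes a caller-supplied `post_search` function (hypothetical) that sends a request body to the docs/search endpoint and returns the parsed JSON response. It keeps requesting pages until the response no longer includes @search.nextPageParameters.

```python
def fetch_all(post_search, body: dict) -> list:
    """Collect results across pages by resubmitting the continuation body."""
    results = []
    while body is not None:
        response = post_search(body)
        results.extend(response.get("value", []))
        # The service returns the body to use for the next page, if any.
        body = response.get("@search.nextPageParameters")
    return results
```

Passing the transport in as a function keeps the paging logic independent of the HTTP client you use.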

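For filters, you can also use logical operators (and, or, not) and comparison operators (eq, ne, gt, lt, ge, le). String comparisons are case sensitive. For more information and examples, see "Examples of simple search queries".

Note

The $filter parameter only works on fields that were marked filterable during index creation.

Here are some examples of other queries: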

### Query for only images
POST {{searchUrl}}/indexes/doc-extraction-image-verbalization-index/docs/search?api-version=2025-11-01-preview   HTTP/1.1
  Content-Type: application/json
  api-key: {{searchApiKey}}
  
  {
    "search": "*",
    "count": true,
    "filter": "image_document_id ne null"
  }

### Query for text or images with content related to energy, returning the id, parent document, and text (extracted text for text chunks and verbalized image text for images), and the content path where the image is saved in the knowledge store (only populated for images)
POST {{searchUrl}}/indexes/doc-extraction-image-verbalization-index/docs/search?api-version=2025-11-01-preview   HTTP/1.1
  Content-Type: application/json
  api-key: {{searchApiKey}}
  
  {
    "search": "energy",
    "count": true,
    "select": "content_id, document_title, content_text, content_path"
  }
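Results from the last query mix text chunks and image descriptions, and content_path is populated only for images. The following Python sketch separates the two and derives a blob URL for each image. The helper names are hypothetical, and it assumes content_path holds "<container>/<blob path>", as produced by the Shaper skill in the skillset. The URL identifies the blob only; reading it still requires authorized access to the storage account.

```python
from urllib.parse import quote


def split_results(results: list):
    """Separate text chunk hits from image hits using content_path."""
    text_hits = [d for d in results if d.get("content_path") is None]
    image_hits = [d for d in results if d.get("content_path") is not None]
    return text_hits, image_hits


def image_blob_url(storage_account: str, content_path: str) -> str:
    """Build the blob URL for an image stored in the knowledge store."""
    return f"https://{storage_account}.blob.core.windows.net/{quote(content_path)}"
```

A UI could use the text hits for snippets and the image hits for thumbnails fetched from the knowledge store.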

Reset and rerun

Indexers can be reset to clear the high-water mark, which allows a full rerun. The following POST requests reset the indexer and then run it again.

### Reset the indexer
POST {{searchUrl}}/indexers/doc-extraction-image-verbalization-indexer/reset?api-version=2025-11-01-preview   HTTP/1.1
  api-key: {{searchApiKey}}

### Run the indexer
POST {{searchUrl}}/indexers/doc-extraction-image-verbalization-indexer/run?api-version=2025-11-01-preview   HTTP/1.1
  api-key: {{searchApiKey}}

### Check indexer status 
GET {{searchUrl}}/indexers/doc-extraction-image-verbalization-indexer/status?api-version=2025-11-01-preview   HTTP/1.1
  api-key: {{searchApiKey}}
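The status request returns a JSON document that you can inspect to decide whether the run has finished. The following Python sketch summarizes such a response. The field names it reads (lastResult, status, itemsProcessed, itemsFailed, errors, errorMessage) are assumptions based on the Get Indexer Status response shape, so verify them against the response you actually receive.

```python
# States of lastResult.status that are treated here as "the run is over".
FINISHED_STATES = {"success", "transientFailure", "persistentFailure"}


def summarize_indexer_status(status: dict) -> dict:
    """Reduce an indexer status response to the fields worth polling on."""
    last = status.get("lastResult") or {}  # lastResult can be null before a run
    state = last.get("status", "unknown")
    return {
        "state": state,
        "finished": state in FINISHED_STATES,
        "processed": last.get("itemsProcessed", 0),
        "failed": last.get("itemsFailed", 0),
        "errors": [e.get("errorMessage") for e in last.get("errors", [])],
    }
```

A polling loop would call the status endpoint every few seconds and stop when `finished` is true.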

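Clean up resources

When you're working in your own subscription, it's a good idea to remove the resources you no longer need at the end of a project. Resources left running can cost you money. You can delete resources individually, or delete the resource group to delete the entire set of resources.

You can use the Azure portal to delete indexes, indexers, and data sources.

See also

Now that you're familiar with a sample implementation of a multimodal indexing scenario, check out: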

Last updated on 2025-08-27