Make predictions with ONNX on computer vision models from AutoML (v1)

APPLIES TO: Azure Machine Learning SDK v1 for Python

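Important

Some of the Azure CLI commands in this article use the azure-cli-ml, or v1, extension for Azure Machine Learning. Support for CLI v1 ended on September 30, 2025. Microsoft no longer provides technical support or updates for this service. Your existing workflows that use CLI v1 will continue to operate after the end-of-support date. However, they could be exposed to security risks or breaking changes if the product's architecture changes.

We recommend that you transition to the ml, or v2, extension as soon as possible. For more information about the v2 extension, see Azure Machine Learning CLI extension and Python SDK v2.

Important

This article provides information on using the Azure Machine Learning SDK v1. SDK v1 is deprecated as of March 31, 2025. Support for it will end on June 30, 2026. You can install and use SDK v1 until that date. Your existing workflows that use SDK v1 will continue to operate after the end-of-support date. However, they could be exposed to security risks or breaking changes if the product's architecture changes.

We recommend that you transition to SDK v2 before June 30, 2026. For more information on SDK v2, see What is Azure Machine Learning CLI and Python SDK v2? and the SDK v2 reference.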

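In this article, you learn how to use Open Neural Network Exchange (ONNX) to make predictions on computer vision models generated by automated machine learning (AutoML) in Azure Machine Learning.

To use ONNX for predictions, you need to:

Download ONNX model files from an AutoML training run.
Understand the inputs and outputs of an ONNX model.
Preprocess your data so that it's in the required format for input images.
Perform inference with ONNX Runtime for Python.
Visualize predictions for object detection and instance segmentation tasks.

ONNX is an open standard for machine learning and deep learning models. It enables model import and export (interoperability) across popular AI frameworks. For more details, explore the ONNX GitHub project.

ONNX Runtime is an open-source project that supports cross-platform inference. ONNX Runtime provides APIs across programming languages (including Python, C++, C#, C, Java, and JavaScript). You can use these APIs to perform inference on input images. After you have a model that's been exported to ONNX format, you can use these APIs in any programming language that your project needs.

In this guide, you learn how to use the Python APIs for ONNX Runtime to make predictions on images for popular vision tasks. You can use these ONNX exported models across languages.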

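Prerequisites

Get an AutoML-trained computer vision model for any of the supported image tasks: classification, object detection, or instance segmentation. Learn more about AutoML support for computer vision tasks.
Install the onnxruntime package. The methods in this article have been tested with versions 1.3.0 to 1.8.0.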
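
Because the methods here were tested only with onnxruntime 1.3.0 to 1.8.0, it can help to check the installed version before you continue. The following is a minimal sketch, not part of the original article. The helper names `parse_version` and `is_tested_version` are hypothetical, and the parser assumes a plain numeric version string such as `1.8.0` with no pre-release suffix.

```python
def parse_version(version):
    """Turn a version string such as '1.8.0' into a tuple of ints such as (1, 8, 0)."""
    return tuple(int(part) for part in version.split(".")[:3])

# Version range that the article reports as tested (inclusive).
TESTED_MIN = (1, 3, 0)
TESTED_MAX = (1, 8, 0)

def is_tested_version(version):
    """Return True if `version` falls inside the tested range."""
    return TESTED_MIN <= parse_version(version) <= TESTED_MAX

# Typical use, once onnxruntime is installed:
#   import onnxruntime
#   print(is_tested_version(onnxruntime.__version__))
print(is_tested_version("1.8.0"))
print(is_tested_version("1.15.1"))
```

A newer version might still work. The check only tells you whether you're inside the range that was tested.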

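Download ONNX model files

You can download ONNX model files from AutoML runs by using the Azure Machine Learning studio UI or the Azure Machine Learning Python SDK. We recommend downloading through the SDK, with the experiment name and parent run ID.

Azure Machine Learning studio

In Azure Machine Learning studio, go to your experiment by using the experiment hyperlink generated in the training notebook, or by selecting the experiment name on the Experiments tab under Assets. Then select the best child run.

Within the best child run, go to Outputs+logs > train_artifacts. Use the Download button to manually download the following files:

labels.json: File that contains all the classes or labels in the training dataset.
model.onnx: Model in ONNX format.

[Screenshot: selections for downloading the ONNX model files.]

Save the downloaded model files in a directory. The example in this article uses the ./automl_models directory.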
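
After a manual download, a quick check that both files are in the expected directory can save a confusing error later. This is a small sketch that uses only the standard library. The function name `missing_model_files` is hypothetical and not part of any Azure SDK.

```python
from pathlib import Path

def missing_model_files(model_dir, required=("labels.json", "model.onnx")):
    """Return the required file names that are not present in model_dir."""
    model_dir = Path(model_dir)
    return [name for name in required if not (model_dir / name).is_file()]

# Directory used by the examples in this article.
missing = missing_model_files("./automl_models")
if missing:
    print("Missing files:", missing)
else:
    print("All model files found.")
```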

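Azure Machine Learning Python SDK

With the SDK, you can select the best child run (by primary metric) by using the experiment name and parent run ID. Then you can download the labels.json and model.onnx files.

The following code returns the best child run based on the relevant primary metric.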

from azureml.train.automl.run import AutoMLRun

# Select the best child run
# `experiment` is the azureml.core.Experiment object that contains the AutoML run
run_id = '' # Specify the run ID
automl_image_run = AutoMLRun(experiment=experiment, run_id=run_id)
best_child_run = automl_image_run.get_best_child()

Download the labels.json file, which contains all the classes and labels in the training dataset.

labels_file = 'automl_models/labels.json'
best_child_run.download_file(name='train_artifacts/labels.json', output_file_path=labels_file)

Download the model.onnx file.

onnx_model_path = 'automl_models/model.onnx'
best_child_run.download_file(name='train_artifacts/model.onnx', output_file_path=onnx_model_path)

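Model generation for batch scoring

By default, AutoML for Images supports batch scoring for classification. However, object detection and instance segmentation ONNX models don't support batch inference. For batch inference with object detection and instance segmentation, use the following procedure to generate an ONNX model for the required batch size. A model generated for a specific batch size doesn't work for other batch sizes.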

from azureml.core.script_run_config import ScriptRunConfig
from azureml.train.automl.run import AutoMLRun
from azureml.core.workspace import Workspace
from azureml.core import Experiment

# specify experiment name
experiment_name = ''
# specify workspace parameters
subscription_id = ''
resource_group = ''
workspace_name = ''
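# load the workspace and compute target
ws = Workspace(subscription_id=subscription_id, resource_group=resource_group, workspace_name=workspace_name)
compute_target = ''  # specify the compute target, for example ws.compute_targets['<compute-name>']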
experiment = Experiment(ws, name=experiment_name)

# specify the run id of the automl run
run_id = ''
automl_image_run = AutoMLRun(experiment=experiment, run_id=run_id)
best_child_run = automl_image_run.get_best_child()

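Use the following model-specific arguments to submit the script. For more details on the arguments, see the model-specific hyperparameters. For the supported object detection model names, see the supported model algorithms section.

To get the argument values needed to create the batch scoring model, refer to the scoring scripts generated under the outputs folder of the AutoML training runs. Use the hyperparameter values available in the model settings variable inside the scoring file for the best child run.

For multi-class image classification, the ONNX model generated for the best child run supports batch scoring by default. Therefore, no model-specific arguments are needed for this task type, and you can skip to the Load the labels and ONNX model files section.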

arguments = ['--model_name', 'fasterrcnn_resnet34_fpn',  # enter the faster rcnn or retinanet model name
             '--batch_size', 8,  # enter the batch size of your choice
             '--height_onnx', 600,  # enter the height of input to ONNX model
             '--width_onnx', 800,  # enter the width of input to ONNX model
             '--experiment_name', experiment_name,
             '--subscription_id', subscription_id,
             '--resource_group', resource_group,
             '--workspace_name', workspace_name,
             '--run_id', run_id,
             '--task_type', 'image-object-detection',
             '--min_size', 600,  # minimum size of the image to be rescaled before feeding it to the backbone
             '--max_size', 1333,  # maximum size of the image to be rescaled before feeding it to the backbone
             '--box_score_thresh', 0.3,  # threshold to return proposals with a classification score > box_score_thresh
             '--box_nms_thresh', 0.5,  # NMS threshold for the prediction head
             '--box_detections_per_img', 100  # maximum number of detections per image, for all classes
             ]

arguments = ['--model_name', 'yolov5',  # enter the yolo model name
             '--batch_size', 8,  # enter the batch size of your choice
             '--height_onnx', 640,  # enter the height of input to ONNX model
             '--width_onnx', 640,  # enter the width of input to ONNX model
             '--experiment_name', experiment_name,
             '--subscription_id', subscription_id,
             '--resource_group', resource_group,
             '--workspace_name', workspace_name,
             '--run_id', run_id,
             '--task_type', 'image-object-detection',
             '--img_size', 640,  # image size for inference
             '--model_size', 'medium',  # size of the yolo model
             '--box_score_thresh', 0.1,  # threshold to return proposals with a classification score > box_score_thresh
             '--box_iou_thresh', 0.5  # IOU threshold used during inference in nms post processing
             ]

arguments = ['--model_name', 'maskrcnn_resnet50_fpn',  # enter the maskrcnn model name
             '--batch_size', 8,  # enter the batch size of your choice
             '--height_onnx', 600,  # enter the height of input to ONNX model
             '--width_onnx', 800,  # enter the width of input to ONNX model
             '--experiment_name', experiment_name,
             '--subscription_id', subscription_id,
             '--resource_group', resource_group,
             '--workspace_name', workspace_name,
             '--run_id', run_id,
             '--task_type', 'image-instance-segmentation',
             '--min_size', 600,  # minimum size of the image to be rescaled before feeding it to the backbone
             '--max_size', 1333,  # maximum size of the image to be rescaled before feeding it to the backbone
             '--box_score_thresh', 0.3,  # threshold to return proposals with a classification score > box_score_thresh
             '--box_nms_thresh', 0.5,  # NMS threshold for the prediction head
             '--box_detections_per_img', 100  # maximum number of detections per image, for all classes
             ]

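Download the ONNX_batch_model_generator_automl_for_images.py file, keep it in the current directory, and then submit the script. Use ScriptRunConfig to submit the script in ONNX_batch_model_generator_automl_for_images.py to generate an ONNX model for a specific batch size. In the following code, the trained model environment is used to submit this script, which generates the ONNX model and saves it to the outputs directory.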

script_run_config = ScriptRunConfig(source_directory='.',
                                    script='ONNX_batch_model_generator_automl_for_images.py',
                                    arguments=arguments,
                                    compute_target=compute_target,
                                    environment=best_child_run.get_environment())

remote_run = experiment.submit(script_run_config)
remote_run.wait_for_completion(wait_post_processing=True)

After the batch model is generated, either download it manually from Outputs+logs > outputs, or use the following method:

batch_size = 8  # use the batch size used to generate the model
onnx_model_path = 'automl_models/model.onnx'  # local path to save the model
remote_run.download_file(name='outputs/model_'+str(batch_size)+'.onnx', output_file_path=onnx_model_path)
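
Because a model generated for one batch size doesn't work with any other, every batch you send to it has to contain exactly that many images. The following sketch shows one way to split a list of image files into fixed-size batches, padding the last batch by repeating its final item. The function name `make_fixed_size_batches` and the padding strategy are assumptions for illustration; the article doesn't prescribe how to handle a partial last batch.

```python
def make_fixed_size_batches(items, batch_size):
    """Split items into batches of exactly batch_size.

    The last batch is padded by repeating its final item. The second return
    value gives the number of real (unpadded) items in each batch.
    """
    if batch_size < 1:
        raise ValueError("batch_size must be at least 1")
    batches, real_counts = [], []
    for start in range(0, len(items), batch_size):
        batch = list(items[start:start + batch_size])
        real_counts.append(len(batch))
        # Pad so the batch matches the size the ONNX model was generated for.
        batch += [batch[-1]] * (batch_size - len(batch))
        batches.append(batch)
    return batches, real_counts

files = [f"image_{i}.jpg" for i in range(5)]  # hypothetical file names
batches, real_counts = make_fixed_size_batches(files, batch_size=2)
print(batches)
print(real_counts)
```

When you read the predictions back, use `real_counts` to discard the results for padded items.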

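After the model download step, you use the ONNX Runtime Python package to perform inference with the model.onnx file. For demonstration purposes, this article uses the datasets from How to prepare image datasets for each vision task.

We trained the models for all vision tasks with their respective datasets to demonstrate ONNX model inference.

Load the labels and ONNX model files

The following code snippet loads labels.json, where class names are ordered. That is, if the ONNX model predicts a label ID of 2, it corresponds to the label name at index 2 (the third entry) in the labels.json file.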

import json
import onnxruntime

labels_file = "automl_models/labels.json"
with open(labels_file) as f:
    classes = json.load(f)
print(classes)
try:
    session = onnxruntime.InferenceSession(onnx_model_path)
    print("ONNX model loaded...")
except Exception as e: 
    print("Error loading ONNX file: ",str(e))
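
The mapping from a predicted label ID to a class name can be sketched as follows. The label names are hypothetical stand-ins for the contents of a real labels.json file, and `label_name` is an illustrative helper, not part of ONNX Runtime.

```python
import json

# Hypothetical labels.json content; the real file comes from the training run.
labels_json = '["can", "carton", "milk_bottle", "water_bottle"]'
classes = json.loads(labels_json)

def label_name(label_id, classes):
    """Map a predicted label ID to its class name (IDs are zero-based list indexes)."""
    if not 0 <= label_id < len(classes):
        raise IndexError(f"label ID {label_id} is outside 0..{len(classes) - 1}")
    return classes[label_id]

print(label_name(2, classes))  # third entry in the list
```

The explicit range check avoids silently accepting negative IDs, which plain list indexing would wrap around to the end of the list.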

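Get expected input and output details for an ONNX model

When you have the model, it's important to know some model-specific and task-specific details. These details include the number of inputs and outputs, the expected input shape or format for preprocessing the image, and the output shape, so that you know how to interpret the outputs for a specific model or task.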

sess_input = session.get_inputs()
sess_output = session.get_outputs()
print(f"No. of inputs : {len(sess_input)}, No. of outputs : {len(sess_output)}")

# iterate over the input and output metadata objects directly
for idx, input_ in enumerate(sess_input):
    input_name = input_.name
    input_shape = input_.shape
    input_type = input_.type
    print(f"{idx} Input name : {input_name}, Input shape : {input_shape}, "
          f"Input type : {input_type}")

for idx, output in enumerate(sess_output):
    output_name = output.name
    output_shape = output.shape
    output_type = output.type
    print(f"{idx} Output name : {output_name}, Output shape : {output_shape}, "
          f"Output type : {output_type}")

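Expected input and output formats for the ONNX model

Every ONNX model has a predefined set of input and output formats.

This multi-class image classification example applies a model trained on the fridgeObjects dataset, which has 134 images and 4 classes/labels, to explain ONNX model inference. For more information on training an image classification task, see the multi-class image classification notebook.

Input format

The input is a preprocessed image.

Input name	Input shape	Input type	Description
input1	`(batch_size, num_channels, height, width)`	ndarray(float)	Input is a preprocessed image, with the shape `(1, 3, 224, 224)` for a batch size of 1 and a height and width of 224. These numbers correspond to the values used for `crop_size` in the training example.

Output format

The output is an array of logits for all the classes/labels.

Output name	Output shape	Output type	Description
output1	`(batch_size, num_classes)`	ndarray(float)	Model returns logits (without `softmax`). For instance, for a batch size of 1 and 4 classes, it returns `(1, 4)`.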

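This multi-label image classification example uses a model trained on the multi-label fridgeObjects dataset, which has 128 images and 4 classes/labels, to explain ONNX model inference. For more information on model training for multi-label image classification, see the multi-label image classification notebook.

Input format

The input is a preprocessed image.

Input name	Input shape	Input type	Description
input1	`(batch_size, num_channels, height, width)`	ndarray(float)	Input is a preprocessed image, with the shape `(1, 3, 224, 224)` for a batch size of 1 and a height and width of 224. These numbers correspond to the values used for `crop_size` in the training example.

Output format

The output is an array of logits for all the classes/labels.

Output name	Output shape	Output type	Description
output1	`(batch_size, num_classes)`	ndarray(float)	Model returns logits (without `sigmoid`). For instance, for a batch size of 1 and 4 classes, it returns `(1, 4)`.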

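This object detection example uses a model trained on the fridgeObjects detection dataset, which has 128 images and 4 classes/labels, to explain ONNX model inference. This example trains Faster R-CNN models to demonstrate the inference steps. For more information on training object detection models, see the object detection notebook.

Input format

The input is a preprocessed image.

Input name	Input shape	Input type	Description
input	`(batch_size, num_channels, height, width)`	ndarray(float)	Input is a preprocessed image, with the shape `(1, 3, 600, 800)` for a batch size of 1, a height of 600, and a width of 800.

Output format

The output is a tuple of output_names and predictions. Here, output_names and predictions are each lists of length 3*batch_size. For Faster R-CNN, the order of outputs is boxes, labels, and scores, whereas for RetinaNet the order is boxes, scores, and labels.

Output name	Output shape	Output type	Description
`output_names`	`(3*batch_size)`	List of keys	For a batch size of 2, `output_names` is `['boxes_0', 'labels_0', 'scores_0', 'boxes_1', 'labels_1', 'scores_1']`
`predictions`	`(3*batch_size)`	List of ndarray(float)	For a batch size of 2, `predictions` takes the shape `[(n1_boxes, 4), (n1_boxes), (n1_boxes), (n2_boxes, 4), (n2_boxes), (n2_boxes)]`. Here, the value at each index corresponds to the same index in `output_names`.

The following table describes the boxes, labels, and scores returned for each sample in the batch of images.

Name	Shape	Type	Description
Boxes	`(n_boxes, 4)`, where each box has `x_min, y_min, x_max, y_max`	ndarray(float)	Model returns n boxes with their top-left and bottom-right coordinates.
Labels	`(n_boxes)`	ndarray(float)	Label or class ID of an object in each box.
Scores	`(n_boxes)`	ndarray(float)	Confidence score of an object in each box.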
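
Because `output_names` and `predictions` are flat lists with matching indexes, it's often easier to regroup them into one dictionary per image. The following sketch shows one way to do that. The function name `group_predictions_by_image` is hypothetical, and plain Python lists stand in for the ndarrays that ONNX Runtime returns.

```python
def group_predictions_by_image(output_names, predictions):
    """Group flat ONNX outputs such as 'boxes_0', 'labels_0' into one dict per image."""
    grouped = {}
    for name, value in zip(output_names, predictions):
        # Split 'boxes_1' into the key 'boxes' and the image index 1.
        key, _, image_index = name.rpartition("_")
        grouped.setdefault(int(image_index), {})[key] = value
    # Return a list ordered by image index.
    return [grouped[i] for i in sorted(grouped)]

# Toy values for a batch size of 2.
output_names = ["boxes_0", "labels_0", "scores_0", "boxes_1", "labels_1", "scores_1"]
predictions = [
    [[10.0, 20.0, 110.0, 220.0]], [1], [0.92],
    [[5.0, 5.0, 50.0, 60.0], [30.0, 40.0, 90.0, 120.0]], [0, 3], [0.81, 0.47],
]
per_image = group_predictions_by_image(output_names, predictions)
print(per_image[1]["labels"])
```

Grouping by name rather than by position means the same helper handles both the Faster R-CNN order (boxes, labels, scores) and the RetinaNet order (boxes, scores, labels).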

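This object detection example uses a model trained on the fridgeObjects detection dataset, which has 128 images and 4 classes/labels, to explain ONNX model inference. This example trains YOLO models to demonstrate the inference steps. For more information on training object detection models, see the object detection notebook.

Input format

The input is a preprocessed image, with the shape (1, 3, 640, 640) for a batch size of 1 and a height and width of 640. These numbers correspond to the values used in the training example.

Input name	Input shape	Input type	Description
input	`(batch_size, num_channels, height, width)`	ndarray(float)	Input is a preprocessed image, with the shape `(1, 3, 640, 640)` for a batch size of 1, a height of 640, and a width of 640.

Output format

ONNX model predictions contain multiple outputs. The first output is needed to perform non-max suppression (NMS) for detections. For ease of use, automated ML displays the output format after the NMS postprocessing step. The output after NMS is a list of boxes, labels, and scores for each sample in the batch.

Output name	Output shape	Output type	Description
output	`(batch_size)`	List of ndarray(float)	Model returns box detections for each sample in the batch.

Each cell in the list holds the box detections for one sample, with shape (n_boxes, 6), where each box has x_min, y_min, x_max, y_max, confidence_score, class_id.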
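
To make the six-column layout concrete, the following sketch unpacks the detections for one image into dictionaries and optionally filters by confidence. The function name `parse_yolo_detections` and the toy values are illustrative assumptions. The column order follows the description above.

```python
def parse_yolo_detections(detections, score_threshold=0.0):
    """Convert rows of [x_min, y_min, x_max, y_max, confidence_score, class_id] into dicts."""
    parsed = []
    for x_min, y_min, x_max, y_max, score, class_id in detections:
        if score < score_threshold:
            continue  # drop low-confidence detections
        parsed.append({
            "box": (x_min, y_min, x_max, y_max),
            "score": score,
            "class_id": int(class_id),  # class IDs arrive as floats
        })
    return parsed

# Toy detections for one image, in place of an ndarray of shape (n_boxes, 6).
detections = [
    [12.0, 30.0, 200.0, 310.0, 0.88, 2.0],
    [50.0, 60.0, 120.0, 140.0, 0.15, 0.0],
]
print(parse_yolo_detections(detections, score_threshold=0.3))
```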

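This instance segmentation example uses a Mask R-CNN model trained on the fridgeObjects dataset, which has 128 images and 4 classes/labels, to explain ONNX model inference. For more information on training an instance segmentation model, see the instance segmentation notebook.

Important

Only Mask R-CNN is supported for instance segmentation tasks. The input and output formats are based on Mask R-CNN only.

Input format

The input is a preprocessed image. The ONNX model for Mask R-CNN has been exported to work with images of different shapes. We recommend that you resize them to a fixed size that's consistent with the training image sizes, for better performance.

Input name	Input shape	Input type	Description
input	`(batch_size, num_channels, height, width)`	ndarray(float)	Input is a preprocessed image, with the shape `(1, 3, input_image_height, input_image_width)` for a batch size of 1 and a height and width similar to those of the input image.

Output format

The output is a tuple of output_names and predictions. Here, output_names and predictions are each lists of length 4*batch_size.

Output name	Output shape	Output type	Description
`output_names`	`(4*batch_size)`	List of keys	For a batch size of 2, `output_names` is `['boxes_0', 'labels_0', 'scores_0', 'masks_0', 'boxes_1', 'labels_1', 'scores_1', 'masks_1']`
`predictions`	`(4*batch_size)`	List of ndarray(float)	For a batch size of 2, `predictions` takes the shape `[(n1_boxes, 4), (n1_boxes), (n1_boxes), (n1_boxes, 1, height_onnx, width_onnx), (n2_boxes, 4), (n2_boxes), (n2_boxes), (n2_boxes, 1, height_onnx, width_onnx)]`. Here, the value at each index corresponds to the same index in `output_names`.

Name	Shape	Type	Description
Boxes	`(n_boxes, 4)`, where each box has `x_min, y_min, x_max, y_max`	ndarray(float)	Model returns n boxes with their top-left and bottom-right coordinates.
Labels	`(n_boxes)`	ndarray(float)	Label or class ID of an object in each box.
Scores	`(n_boxes)`	ndarray(float)	Confidence score of an object in each box.
Masks	`(n_boxes, 1, height_onnx, width_onnx)`	ndarray(float)	Masks (polygons) of detected objects, with the height and width of the input image.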

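Preprocessing

Perform the following preprocessing steps for ONNX model inference:

Convert the image to RGB.
Resize the image to valid_resize_size x valid_resize_size, the value used to transform the validation dataset during training. The default value for valid_resize_size is 256.
Center crop the image to height_onnx_crop_size and width_onnx_crop_size. These correspond to valid_crop_size, which has a default value of 224.
Change HxWxC to CxHxW.
Convert to float type.
Normalize with ImageNet's mean = [0.485, 0.456, 0.406] and std = [0.229, 0.224, 0.225].

If you chose different values for the hyperparameters valid_resize_size and valid_crop_size during training, use those values instead.

Get the input shape needed for the ONNX model.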

batch, channel, height_onnx_crop_size, width_onnx_crop_size = session.get_inputs()[0].shape
batch, channel, height_onnx_crop_size, width_onnx_crop_size

Without PyTorch

import glob
import numpy as np
from PIL import Image

def preprocess(image, resize_size, crop_size_onnx):
    """Perform pre-processing on raw input image
    
    :param image: raw input image
    :type image: PIL image
    :param resize_size: value to resize the image
    :type resize_size: Int
    :param crop_size_onnx: expected height of an input image in onnx model
    :type crop_size_onnx: Int
    :return: pre-processed image in numpy format
    :rtype: ndarray 1xCxHxW
    """

    image = image.convert('RGB')
    # resize
    image = image.resize((resize_size, resize_size))
    #  center  crop
    left = (resize_size - crop_size_onnx)/2
    top = (resize_size - crop_size_onnx)/2
    right = (resize_size + crop_size_onnx)/2
    bottom = (resize_size + crop_size_onnx)/2
    image = image.crop((left, top, right, bottom))

    np_image = np.array(image)
    # HWC -> CHW
    np_image = np_image.transpose(2, 0, 1) # CxHxW
    # normalize the image
    mean_vec = np.array([0.485, 0.456, 0.406])
    std_vec = np.array([0.229, 0.224, 0.225])
    norm_img_data = np.zeros(np_image.shape).astype('float32')
    for i in range(np_image.shape[0]):
        norm_img_data[i,:,:] = (np_image[i,:,:]/255 - mean_vec[i])/std_vec[i]
             
    np_image = np.expand_dims(norm_img_data, axis=0) # 1xCxHxW
    return np_image

# following code loads only batch_size number of images for demonstrating ONNX inference
# make sure that the data directory has at least batch_size number of images

test_images_path = "automl_models_multi_cls/test_images_dir/*" # replace with path to images
# Select batch size needed
batch_size = 8
# you can modify resize_size based on your trained model
resize_size = 256
# height and width will be the same for classification
crop_size_onnx = height_onnx_crop_size 

image_files = glob.glob(test_images_path)
img_processed_list = []
for i in range(batch_size):
    img = Image.open(image_files[i])
    img_processed_list.append(preprocess(img, resize_size, crop_size_onnx))
    
if len(img_processed_list) > 1:
    img_data = np.concatenate(img_processed_list)
elif len(img_processed_list) == 1:
    img_data = img_processed_list[0]
else:
    img_data = None

assert batch_size == img_data.shape[0]

With PyTorch

import glob
import torch
import numpy as np
from PIL import Image
from torchvision import transforms

def _make_3d_tensor(x) -> torch.Tensor:
    """Expand images that have fewer than 3 channels to 3 channels.

    :param x: input tensor
    :type x: torch.Tensor
    :return: return a tensor with the correct number of channels
    :rtype: torch.Tensor
    """
    return x if x.shape[0] == 3 else x.expand((3, x.shape[1], x.shape[2]))

def preprocess(image, resize_size, crop_size_onnx):
    transform = transforms.Compose([
        transforms.Resize(resize_size),
        transforms.CenterCrop(crop_size_onnx),
        transforms.ToTensor(),
        transforms.Lambda(_make_3d_tensor),
        transforms.Normalize([0.485, 0.456, 0.406], [0.229, 0.224, 0.225])])
    
    img_data = transform(image)
    img_data = img_data.numpy()
    img_data = np.expand_dims(img_data, axis=0)
    return img_data

# following code loads only batch_size number of images for demonstrating ONNX inference
# make sure that the data directory has at least batch_size number of images

test_images_path = "automl_models_multi_cls/test_images_dir/*" # replace with path to images
# Select batch size needed
batch_size = 8
# you can modify resize_size based on your trained model
resize_size = 256
# height and width will be the same for classification
crop_size_onnx = height_onnx_crop_size 

image_files = glob.glob(test_images_path)
img_processed_list = []
for i in range(batch_size):
    img = Image.open(image_files[i])
    img_processed_list.append(preprocess(img, resize_size, crop_size_onnx))
    
if len(img_processed_list) > 1:
    img_data = np.concatenate(img_processed_list)
elif len(img_processed_list) == 1:
    img_data = img_processed_list[0]
else:
    img_data = None

assert batch_size == img_data.shape[0]

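Perform the following preprocessing steps for ONNX model inference. These steps are the same as for multi-class image classification.

Convert the image to RGB.
Resize the image to valid_resize_size x valid_resize_size, the value used to transform the validation dataset during training. The default value for valid_resize_size is 256.
Center crop the image to height_onnx_crop_size and width_onnx_crop_size. These correspond to valid_crop_size, which has a default value of 224.
Change HxWxC to CxHxW.
Convert to float type.
Normalize with ImageNet's mean = [0.485, 0.456, 0.406] and std = [0.229, 0.224, 0.225].

If you chose different values for the hyperparameters valid_resize_size and valid_crop_size during training, use those values instead.

Get the input shape needed for the ONNX model.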

batch, channel, height_onnx_crop_size, width_onnx_crop_size = session.get_inputs()[0].shape
batch, channel, height_onnx_crop_size, width_onnx_crop_size

Without PyTorch

import glob
import numpy as np
from PIL import Image

def preprocess(image, resize_size, crop_size_onnx):
    """Perform pre-processing on raw input image
    
    :param image: raw input image
    :type image: PIL image
    :param resize_size: value to resize the image
    :type resize_size: Int
    :param crop_size_onnx: expected height of an input image in onnx model
    :type crop_size_onnx: Int
    :return: pre-processed image in numpy format
    :rtype: ndarray 1xCxHxW
    """

    image = image.convert('RGB')
    # resize
    image = image.resize((resize_size, resize_size))
    # center  crop
    left = (resize_size - crop_size_onnx)/2
    top = (resize_size - crop_size_onnx)/2
    right = (resize_size + crop_size_onnx)/2
    bottom = (resize_size + crop_size_onnx)/2
    image = image.crop((left, top, right, bottom))

    np_image = np.array(image)
    # HWC -> CHW
    np_image = np_image.transpose(2, 0, 1) # CxHxW

    # normalize the image
    mean_vec = np.array([0.485, 0.456, 0.406])
    std_vec = np.array([0.229, 0.224, 0.225])
    norm_img_data = np.zeros(np_image.shape).astype('float32')
    for i in range(np_image.shape[0]):
        norm_img_data[i,:,:] = (np_image[i,:,:] / 255 - mean_vec[i]) / std_vec[i]    
    np_image = np.expand_dims(norm_img_data, axis=0) # 1xCxHxW
    return np_image

# following code loads only batch_size number of images for demonstrating ONNX inference
# make sure that the data directory has at least batch_size number of images

test_images_path = "automl_models_multi_label/test_images_dir/*" # replace with path to images
# Select batch size needed
batch_size = 8
# you can modify resize_size based on your trained model
resize_size = 256
# height and width will be the same for classification
crop_size_onnx = height_onnx_crop_size 

image_files = glob.glob(test_images_path)
img_processed_list = []
for i in range(batch_size):
    img = Image.open(image_files[i])
    img_processed_list.append(preprocess(img, resize_size, crop_size_onnx))
    
if len(img_processed_list) > 1:
    img_data = np.concatenate(img_processed_list)
elif len(img_processed_list) == 1:
    img_data = img_processed_list[0]
else:
    img_data = None

assert batch_size == img_data.shape[0]

With PyTorch

import glob
import torch
import numpy as np
from PIL import Image
from torchvision import transforms

def _make_3d_tensor(x) -> torch.Tensor:
    """Expand images that have fewer than 3 channels to 3 channels.

    :param x: input tensor
    :type x: torch.Tensor
    :return: return a tensor with the correct number of channels
    :rtype: torch.Tensor
    """
    return x if x.shape[0] == 3 else x.expand((3, x.shape[1], x.shape[2]))

def preprocess(image, resize_size, crop_size_onnx):
    """Perform pre-processing on raw input image
    
    :param image: raw input image
    :type image: PIL image
    :param resize_size: value to resize the image
    :type resize_size: Int
    :param crop_size_onnx: expected height of an input image in onnx model
    :type crop_size_onnx: Int
    :return: pre-processed image in numpy format
    :rtype: ndarray 1xCxHxW
    """
    transform = transforms.Compose([
        transforms.Resize(resize_size),
        transforms.CenterCrop(crop_size_onnx),
        transforms.ToTensor(),
        transforms.Lambda(_make_3d_tensor),
        transforms.Normalize([0.485, 0.456, 0.406], [0.229, 0.224, 0.225])])
    
    img_data = transform(image)
    img_data = img_data.numpy()
    img_data = np.expand_dims(img_data, axis=0)
    
    return img_data

# following code loads only batch_size number of images for demonstrating ONNX inference
# make sure that the data directory has at least batch_size number of images

test_images_path = "automl_models_multi_label/test_images_dir/*" # replace with path to images
# Select batch size needed
batch_size = 8
# you can modify resize_size based on your trained model
resize_size = 256
# height and width will be the same for classification
crop_size_onnx = height_onnx_crop_size 

image_files = glob.glob(test_images_path)
img_processed_list = []
for i in range(batch_size):
    img = Image.open(image_files[i])
    img_processed_list.append(preprocess(img, resize_size, crop_size_onnx))
    
if len(img_processed_list) > 1:
    img_data = np.concatenate(img_processed_list)
elif len(img_processed_list) == 1:
    img_data = img_processed_list[0]
else:
    img_data = None

assert batch_size == img_data.shape[0]

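For object detection with the Faster R-CNN algorithm, follow the same preprocessing steps as for image classification, except for image cropping. You can resize the image to a height of 600 and a width of 800. You can get the expected input height and width with the following code.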

batch, channel, height_onnx, width_onnx = session.get_inputs()[0].shape
batch, channel, height_onnx, width_onnx

Then, perform the preprocessing steps.

import glob
import numpy as np
from PIL import Image

def preprocess(image, height_onnx, width_onnx):
    """Perform pre-processing on raw input image
    
    :param image: raw input image
    :type image: PIL image
    :param height_onnx: expected height of an input image in onnx model
    :type height_onnx: Int
    :param width_onnx: expected width of an input image in onnx model
    :type width_onnx: Int
    :return: pre-processed image in numpy format
    :rtype: ndarray 1xCxHxW
    """

    image = image.convert('RGB')
    image = image.resize((width_onnx, height_onnx))
    np_image = np.array(image)
    # HWC -> CHW
    np_image = np_image.transpose(2, 0, 1) # CxHxW
    # normalize the image
    mean_vec = np.array([0.485, 0.456, 0.406])
    std_vec = np.array([0.229, 0.224, 0.225])
    norm_img_data = np.zeros(np_image.shape).astype('float32')
    for i in range(np_image.shape[0]):
        norm_img_data[i,:,:] = (np_image[i,:,:] / 255 - mean_vec[i]) / std_vec[i]
    np_image = np.expand_dims(norm_img_data, axis=0) # 1xCxHxW
    return np_image

# following code loads only batch_size number of images for demonstrating ONNX inference
# make sure that the data directory has at least batch_size number of images

test_images_path = "automl_models_od/test_images_dir/*" # replace with path to images
image_files = glob.glob(test_images_path)
img_processed_list = []
for i in range(batch_size):
    img = Image.open(image_files[i])
    img_processed_list.append(preprocess(img, height_onnx, width_onnx))
    
if len(img_processed_list) > 1:
    img_data = np.concatenate(img_processed_list)
elif len(img_processed_list) == 1:
    img_data = img_processed_list[0]
else:
    img_data = None

assert batch_size == img_data.shape[0]

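For object detection with the YOLO algorithm, follow the same preprocessing steps as for image classification, except for image cropping. Resize the image to the input size that the model expects, which is 640 x 640 in this article's YOLO example. You can get the expected input height and width with the following code.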

batch, channel, height_onnx, width_onnx = session.get_inputs()[0].shape
batch, channel, height_onnx, width_onnx

For the preprocessing required for YOLO, refer to yolo_onnx_preprocessing_utils.py.

import glob
import numpy as np
from yolo_onnx_preprocessing_utils import preprocess

# use height and width based on the generated model
test_images_path = "automl_models_od_yolo/test_images_dir/*" # replace with path to images
image_files = glob.glob(test_images_path)
img_processed_list = []
pad_list = []
for i in range(batch_size):
    img_processed, pad = preprocess(image_files[i])
    img_processed_list.append(img_processed)
    pad_list.append(pad)
    
if len(img_processed_list) > 1:
    img_data = np.concatenate(img_processed_list)
elif len(img_processed_list) == 1:
    img_data = img_processed_list[0]
else:
    img_data = None

assert batch_size == img_data.shape[0]

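Important

Only Mask R-CNN is supported for instance segmentation tasks. The preprocessing steps are based on Mask R-CNN only.

Perform the following preprocessing steps for ONNX model inference:

Convert the image to RGB.
Resize the image.
Change HxWxC to CxHxW.
Convert to float type.
Normalize with ImageNet's mean = [0.485, 0.456, 0.406] and std = [0.229, 0.224, 0.225].

For resize_height and resize_width, you can also use the values that you used during training, bounded by the min_size and max_size hyperparameters for Mask R-CNN.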

import glob
import numpy as np
from PIL import Image

def preprocess(image, resize_height, resize_width):
    """Perform pre-processing on raw input image
    
    :param image: raw input image
    :type image: PIL image
    :param resize_height: resize height of an input image
    :type resize_height: Int
    :param resize_width: resize width of an input image
    :type resize_width: Int
    :return: pre-processed image in numpy format
    :rtype: ndarray of shape 1xCxHxW
    """

    image = image.convert('RGB')
    image = image.resize((resize_width, resize_height))
    np_image = np.array(image)
    # HWC -> CHW
    np_image = np_image.transpose(2, 0, 1)  # CxHxW
    # normalize the image
    mean_vec = np.array([0.485, 0.456, 0.406])
    std_vec = np.array([0.229, 0.224, 0.225])
    norm_img_data = np.zeros(np_image.shape).astype('float32')
    for i in range(np_image.shape[0]):
        norm_img_data[i,:,:] = (np_image[i,:,:]/255 - mean_vec[i])/std_vec[i]
    np_image = np.expand_dims(norm_img_data, axis=0)  # 1xCxHxW
    return np_image

# following code loads only batch_size number of images for demonstrating ONNX inference
# make sure that the data directory has at least batch_size number of images
# use height and width based on the trained model
# use height and width based on the generated model
test_images_path = "automl_models_is/test_images_dir/*" # replace with path to images
image_files = glob.glob(test_images_path)
img_processed_list = []
for i in range(batch_size):
    img = Image.open(image_files[i])
    img_processed_list.append(preprocess(img, height_onnx, width_onnx))
    
if len(img_processed_list) > 1:
    img_data = np.concatenate(img_processed_list)
elif len(img_processed_list) == 1:
    img_data = img_processed_list[0]
else:
    img_data = None

assert batch_size == img_data.shape[0]
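
As a rough illustration of how min_size and max_size can bound the resize dimensions mentioned earlier, the following sketch scales the shorter side up to min_size unless that would push the longer side past max_size. This rule is an assumption modeled on the way torchvision-style detection transforms resize images. The article doesn't confirm that AutoML applies exactly this rule. The function name `bounded_resize` is hypothetical, and the defaults of 600 and 1333 are the values shown in the batch-model arguments earlier.

```python
def bounded_resize(height, width, min_size=600, max_size=1333):
    """Scale (height, width) so the shorter side reaches min_size unless that
    would push the longer side past max_size. Aspect ratio is preserved."""
    scale = min(min_size / min(height, width), max_size / max(height, width))
    return round(height * scale), round(width * scale)

print(bounded_resize(480, 640))   # shorter side is the limiting factor
print(bounded_resize(500, 2000))  # longer side is the limiting factor
```

You could pass the resulting values as resize_height and resize_width to the preprocess function.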

Inference with ONNX Runtime

Inference with ONNX Runtime differs for each computer vision task.

def get_predictions_from_ONNX(onnx_session, img_data):
    """Perform predictions with ONNX runtime
    
    :param onnx_session: onnx model session
    :type onnx_session: class InferenceSession
    :param img_data: pre-processed numpy image
    :type img_data: ndarray with shape 1xCxHxW
    :return: scores with shapes
            (1, No. of classes in training dataset) 
    :rtype: numpy array
    """

    sess_input = onnx_session.get_inputs()
    sess_output = onnx_session.get_outputs()
    print(f"No. of inputs : {len(sess_input)}, No. of outputs : {len(sess_output)}")    
    # predict with ONNX Runtime
    output_names = [ output.name for output in sess_output]
    scores = onnx_session.run(output_names=output_names,\
                                               input_feed={sess_input[0].name: img_data})
    
    return scores[0]

scores = get_predictions_from_ONNX(session, img_data)

def get_predictions_from_ONNX(onnx_session,img_data):
    """Perform predictions with ONNX runtime
    
    :param onnx_session: onnx model session
    :type onnx_session: class InferenceSession
    :param img_data: pre-processed numpy image
    :type img_data: ndarray with shape 1xCxHxW
    :return: scores with shapes
            (1, No. of classes in training dataset) 
    :rtype: numpy array
    """
    
    sess_input = onnx_session.get_inputs()
    sess_output = onnx_session.get_outputs()
    print(f"No. of inputs : {len(sess_input)}, No. of outputs : {len(sess_output)}")    
    # predict with ONNX Runtime
    output_names = [ output.name for output in sess_output]
    scores = onnx_session.run(output_names=output_names,\
                                               input_feed={sess_input[0].name: img_data})
    
    return scores[0]

scores = get_predictions_from_ONNX(session, img_data)

def get_predictions_from_ONNX(onnx_session, img_data):
    """perform predictions with ONNX runtime
    
    :param onnx_session: onnx model session
    :type onnx_session: class InferenceSession
    :param img_data: pre-processed numpy image
    :type img_data: ndarray with shape 1xCxHxW
    :return: boxes, labels , scores 
            (No. of boxes, 4) (No. of boxes,) (No. of boxes,)
    :rtype: tuple
    """

    sess_input = onnx_session.get_inputs()
    sess_output = onnx_session.get_outputs()
    
    # predict with ONNX Runtime
    output_names = [output.name for output in sess_output]
    predictions = onnx_session.run(output_names=output_names,\
                                               input_feed={sess_input[0].name: img_data})

    return output_names, predictions

output_names, predictions = get_predictions_from_ONNX(session, img_data)

def get_predictions_from_ONNX(onnx_session,img_data):
    """perform predictions with ONNX Runtime
    
    :param onnx_session: onnx model session
    :type onnx_session: class InferenceSession
    :param img_data: pre-processed numpy image
    :type img_data: ndarray with shape 1xCxHxW
    :return: boxes, labels , scores 
    :rtype: list
    """
    sess_input = onnx_session.get_inputs()
    sess_output = onnx_session.get_outputs()
    # predict with ONNX Runtime
    output_names = [ output.name for output in sess_output]
    pred = onnx_session.run(output_names=output_names,\
                                               input_feed={sess_input[0].name: img_data})
    return pred[0]

result = get_predictions_from_ONNX(session, img_data)

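The instance segmentation model predicts boxes, labels, scores, and masks. ONNX outputs a predicted mask per instance, along with the corresponding bounding box and class confidence score. You might need to convert the binary masks to polygons if necessary.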


def get_predictions_from_ONNX(onnx_session, img_data):
    """Perform predictions with ONNX runtime
    
    :param onnx_session: onnx model session
    :type onnx_session: class InferenceSession
    :param img_data: pre-processed numpy image
    :type img_data: ndarray with shape 1xCxHxW
    :return: boxes, labels , scores , masks with shapes
            (No. of instances, 4) (No. of instances,) (No. of instances,)
            (No. of instances, 1, HEIGHT, WIDTH))  
    :rtype: tuple
    """
    
    sess_input = onnx_session.get_inputs()
    sess_output = onnx_session.get_outputs()
    # predict with ONNX Runtime
    output_names = [ output.name for output in sess_output]
    predictions = onnx_session.run(output_names=output_names,\
                                               input_feed={sess_input[0].name: img_data})
    return output_names, predictions

output_names, predictions = get_predictions_from_ONNX(session, img_data)

Postprocessing

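For multi-class classification, apply softmax() to the predicted values to get the classification confidence scores (probabilities) for each class. The prediction is then the class with the highest probability.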

Without PyTorch

def softmax(x):
    e_x = np.exp(x - np.max(x, axis=1, keepdims=True))
    return e_x / np.sum(e_x, axis=1, keepdims=True)

conf_scores = softmax(scores)
class_preds = np.argmax(conf_scores, axis=1)
print("predicted classes:", ([(class_idx, classes[class_idx]) for class_idx in class_preds]))

With PyTorch

conf_scores = torch.nn.functional.softmax(torch.from_numpy(scores), dim=1)
class_preds = torch.argmax(conf_scores, dim=1)
print("predicted classes:", ([(class_idx.item(), classes[class_idx]) for class_idx in class_preds]))
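
To make the softmax step concrete, here is a small worked example on made-up numbers. The logits and the class list are hypothetical; in the article's flow, `scores` comes from the ONNX session and `classes` from labels.json.

```python
import numpy as np

def softmax(x):
    # Subtract the row-wise maximum first for numerical stability.
    e_x = np.exp(x - np.max(x, axis=1, keepdims=True))
    return e_x / np.sum(e_x, axis=1, keepdims=True)

# Hypothetical logits for a batch of 2 images and 3 classes.
example_scores = np.array([[2.0, 1.0, 0.1],
                           [0.5, 0.5, 3.0]])
example_classes = ["cat", "dog", "bird"]  # hypothetical label list

example_conf = softmax(example_scores)           # each row sums to 1
example_preds = np.argmax(example_conf, axis=1)  # index of the most likely class
example_names = [example_classes[i] for i in example_preds]
print(example_names)
```

Each row of `example_conf` sums to 1, so exactly one class is chosen per image.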

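This step differs from multi-class classification. For multi-label image classification, apply sigmoid to the logits (the ONNX output) to get the confidence scores.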

Without PyTorch

def sigmoid(x):
    return 1 / (1 + np.exp(-x))

# we apply a threshold of 0.5 on confidence scores
score_threshold = 0.5
conf_scores = sigmoid(scores)
image_wise_preds = np.where(conf_scores > score_threshold)
for image_idx, class_idx in zip(image_wise_preds[0], image_wise_preds[1]):
    print('image: {}, class_index: {}, class_name: {}'.format(image_files[image_idx], class_idx, classes[class_idx]))

With PyTorch

# we apply a threshold of 0.5 on confidence scores
score_threshold = 0.5
conf_scores = torch.sigmoid(torch.from_numpy(scores))
image_wise_preds = torch.where(conf_scores > score_threshold)
for image_idx, class_idx in zip(image_wise_preds[0], image_wise_preds[1]):
    print('image: {}, class_index: {}, class_name: {}'.format(image_files[image_idx], class_idx, classes[class_idx]))
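
The following sketch shows on made-up numbers how the sigmoid and the threshold interact in the multi-label case. The logits and class names are hypothetical. Unlike softmax, each class is scored independently, so an image can receive several labels or none.

```python
import numpy as np

def sigmoid(x):
    return 1 / (1 + np.exp(-x))

# Hypothetical logits for 2 images and 3 classes.
example_logits = np.array([[2.0, -1.0, 0.5],
                           [-3.0, 0.0, 4.0]])
example_classes = ["cat", "dog", "bird"]  # hypothetical label list
score_threshold = 0.5

example_conf = sigmoid(example_logits)
# Collect every class whose confidence is strictly above the threshold.
labels_per_image = [
    [example_classes[j] for j in np.where(row > score_threshold)[0]]
    for row in example_conf
]
print(labels_per_image)
```

A logit of exactly 0 maps to a confidence of 0.5, which the strict `>` comparison excludes.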

For multi-class and multi-label classification, you can follow the same steps described earlier for all the supported algorithms in AutoML.

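For object detection, predictions are automatically on the scale of height_onnx, width_onnx. To transform the predicted box coordinates to the original image dimensions, you can implement the following calculations.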

Xmin * original_width/width_onnx
Ymin * original_height/height_onnx
Xmax * original_width/width_onnx
Ymax * original_height/height_onnx
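
The four calculations above can be sketched as a single function. `rescale_box` is a hypothetical helper name, and the example sizes are made up.

```python
def rescale_box(box, original_size, onnx_size):
    """Map a box from ONNX input pixels to original image pixels.

    box is (xmin, ymin, xmax, ymax); both sizes are (height, width).
    """
    original_height, original_width = original_size
    height_onnx, width_onnx = onnx_size
    xmin, ymin, xmax, ymax = box
    x_scale = original_width / width_onnx
    y_scale = original_height / height_onnx
    return (xmin * x_scale, ymin * y_scale, xmax * x_scale, ymax * y_scale)

# Hypothetical example: the model saw 600x800 input, the original image is 1200x1600.
example_box = rescale_box((100, 150, 300, 450),
                          original_size=(1200, 1600),
                          onnx_size=(600, 800))
print(example_box)
```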

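Another option is to use the following code to scale the box dimensions to the range [0, 1]. You can then multiply each box coordinate by the original image's width or height, as appropriate (as described in the Visualize predictions section), to get boxes in the original image dimensions.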

def _get_box_dims(image_shape, box):
    box_keys = ['topX', 'topY', 'bottomX', 'bottomY']
    height, width = image_shape[0], image_shape[1]

    box_dims = dict(zip(box_keys, [coordinate.item() for coordinate in box]))

    box_dims['topX'] = box_dims['topX'] * 1.0 / width
    box_dims['bottomX'] = box_dims['bottomX'] * 1.0 / width
    box_dims['topY'] = box_dims['topY'] * 1.0 / height
    box_dims['bottomY'] = box_dims['bottomY'] * 1.0 / height

    return box_dims

def _get_prediction(boxes, labels, scores, image_shape, classes):
    bounding_boxes = []
    for box, label_index, score in zip(boxes, labels, scores):
        box_dims = _get_box_dims(image_shape, box)

        box_record = {'box': box_dims,
                      'label': classes[label_index],
                      'score': score.item()}

        bounding_boxes.append(box_record)

    return bounding_boxes

# Filter the results with threshold.
# Please replace the threshold for your test scenario.
score_threshold = 0.8
filtered_boxes_batch = []
for batch_sample in range(0, batch_size*3, 3):
    # in case of retinanet change the order of boxes, labels, scores to boxes, scores, labels
    # confirm the same from order of boxes, labels, scores output_names 
    boxes, labels, scores = predictions[batch_sample], predictions[batch_sample + 1], predictions[batch_sample + 2]
    bounding_boxes = _get_prediction(boxes, labels, scores, (height_onnx, width_onnx), classes)
    filtered_bounding_boxes = [box for box in bounding_boxes if box['score'] >= score_threshold]
    filtered_boxes_batch.append(filtered_bounding_boxes)
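
Once the boxes are normalized to [0, 1], getting back to pixel coordinates is a multiplication by the original image size. The following is a minimal sketch of that step; `to_pixel_box` is a hypothetical helper, and the record and image size are made up.

```python
def to_pixel_box(box_dims, original_height, original_width):
    """Convert a normalized box dict (values in [0, 1]) to pixel coordinates."""
    return {
        'topX': box_dims['topX'] * original_width,
        'topY': box_dims['topY'] * original_height,
        'bottomX': box_dims['bottomX'] * original_width,
        'bottomY': box_dims['bottomY'] * original_height,
    }

# Hypothetical record in the same shape as the entries in filtered_boxes_batch.
example_record = {'box': {'topX': 0.25, 'topY': 0.5, 'bottomX': 0.75, 'bottomY': 1.0},
                  'label': 'can',
                  'score': 0.9}

pixel_box = to_pixel_box(example_record['box'], original_height=480, original_width=640)
print(pixel_box)
```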

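The following code creates boxes, labels, and scores. Use these bounding box details to perform the same postprocessing steps as you did for the Faster R-CNN model.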

from yolo_onnx_preprocessing_utils import non_max_suppression, _convert_to_rcnn_output

result_final = non_max_suppression(
    torch.from_numpy(result),
    conf_thres=0.1,
    iou_thres=0.5)

def _get_box_dims(image_shape, box):
    box_keys = ['topX', 'topY', 'bottomX', 'bottomY']
    height, width = image_shape[0], image_shape[1]

    box_dims = dict(zip(box_keys, [coordinate.item() for coordinate in box]))

    box_dims['topX'] = box_dims['topX'] * 1.0 / width
    box_dims['bottomX'] = box_dims['bottomX'] * 1.0 / width
    box_dims['topY'] = box_dims['topY'] * 1.0 / height
    box_dims['bottomY'] = box_dims['bottomY'] * 1.0 / height

    return box_dims

def _get_prediction(label, image_shape, classes):
    
    boxes = np.array(label["boxes"])
    labels = np.array(label["labels"])
    labels = [label[0] for label in labels]
    scores = np.array(label["scores"])
    scores = [score[0] for score in scores]

    bounding_boxes = []
    for box, label_index, score in zip(boxes, labels, scores):
        box_dims = _get_box_dims(image_shape, box)

        box_record = {'box': box_dims,
                      'label': classes[label_index],
                      'score': score.item()}

        bounding_boxes.append(box_record)

    return bounding_boxes

bounding_boxes_batch = []
for result_i, pad in zip(result_final, pad_list):
    label, image_shape = _convert_to_rcnn_output(result_i, height_onnx, width_onnx, pad)
    bounding_boxes_batch.append(_get_prediction(label, image_shape, classes))
print(json.dumps(bounding_boxes_batch, indent=1))
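
The `non_max_suppression` helper takes a confidence threshold (`conf_thres`) and an IoU threshold (`iou_thres`). Its implementation isn't shown in this article. The sketch below is a simplified, class-agnostic greedy version written only to illustrate what an IoU threshold does; `box_iou` and `simple_nms` are hypothetical names and are not the helper's code.

```python
def box_iou(box_a, box_b):
    """Intersection over union of two (xmin, ymin, xmax, ymax) boxes."""
    inter_w = max(0.0, min(box_a[2], box_b[2]) - max(box_a[0], box_b[0]))
    inter_h = max(0.0, min(box_a[3], box_b[3]) - max(box_a[1], box_b[1]))
    inter = inter_w * inter_h
    area_a = (box_a[2] - box_a[0]) * (box_a[3] - box_a[1])
    area_b = (box_b[2] - box_b[0]) * (box_b[3] - box_b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0

def simple_nms(boxes, scores, iou_thres=0.5):
    """Keep the highest-scoring boxes, dropping any that overlap a kept box too much."""
    order = sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)
    keep = []
    for idx in order:
        if all(box_iou(boxes[idx], boxes[k]) <= iou_thres for k in keep):
            keep.append(idx)
    return keep

# Made-up boxes: the first two overlap heavily, the third is far away.
example_boxes = [(0, 0, 10, 10), (1, 1, 11, 11), (20, 20, 30, 30)]
example_scores = [0.9, 0.8, 0.7]
kept = simple_nms(example_boxes, example_scores, iou_thres=0.5)
print(kept)
```

A lower `iou_thres` suppresses more overlapping boxes, and a higher one keeps more of them.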

Visualize predictions

Visualize an input image with labels (multi-class classification)

import matplotlib.image as mpimg
import matplotlib.pyplot as plt
%matplotlib inline

sample_image_index = 0 # change this for an image of interest from image_files list
IMAGE_SIZE = (18, 12)
plt.figure(figsize=IMAGE_SIZE)
img_np = mpimg.imread(image_files[sample_image_index])

img = Image.fromarray(img_np.astype('uint8'), 'RGB')
x, y = img.size

fig,ax = plt.subplots(1, figsize=(15, 15))
# Display the image
ax.imshow(img_np)

label = class_preds[sample_image_index]
if torch.is_tensor(label):
    label = label.item()
    
conf_score = conf_scores[sample_image_index]
if torch.is_tensor(conf_score):
    conf_score = np.max(conf_score.tolist())
else:
    conf_score = np.max(conf_score)

display_text = '{} ({})'.format(label, round(conf_score, 3))
print(display_text)

color = 'red'
plt.text(30, 30, display_text, color=color, fontsize=30)

plt.show()

Visualize an input image with labels (multi-label classification)

import matplotlib.image as mpimg
import matplotlib.pyplot as plt
%matplotlib inline

sample_image_index = 0 # change this for an image of interest from image_files list
IMAGE_SIZE = (18, 12)
plt.figure(figsize=IMAGE_SIZE)
img_np = mpimg.imread(image_files[sample_image_index])
img = Image.fromarray(img_np.astype('uint8'), 'RGB')
x, y = img.size

fig,ax = plt.subplots(1, figsize=(15, 15))
# Display the image
ax.imshow(img_np)
# we apply a threshold of 0.5 on confidence scores
score_threshold = 0.5
label_offset_x = 30
label_offset_y = 30
if torch.is_tensor(conf_scores):
    sample_image_scores = conf_scores[sample_image_index].tolist()
else:
    sample_image_scores = conf_scores[sample_image_index]
    
for index, score in enumerate(sample_image_scores):
    if score > score_threshold:
        label = classes[index]
        display_text = '{} ({})'.format(label, round(score, 3))
        print(display_text)

        color = 'red'
        plt.text(label_offset_x, label_offset_y, display_text, color=color, fontsize=30)
        label_offset_y += 30

plt.show()

Visualize an input image with boxes and labels (Faster R-CNN or RetinaNet)

import matplotlib.image as mpimg
import matplotlib.patches as patches
import matplotlib.pyplot as plt
%matplotlib inline

img_np = mpimg.imread(image_files[1])  # replace with desired image index
image_boxes = filtered_boxes_batch[1]  # replace with desired image index

IMAGE_SIZE = (18, 12)
plt.figure(figsize=IMAGE_SIZE)
img = Image.fromarray(img_np.astype('uint8'), 'RGB')
x, y = img.size
print(img.size)

fig,ax = plt.subplots(1)
# Display the image
ax.imshow(img_np)

# Draw box and label for each detection 
for detect in image_boxes:
    label = detect['label']
    box = detect['box']
    ymin, xmin, ymax, xmax =  box['topY'], box['topX'], box['bottomY'], box['bottomX']
    topleft_x, topleft_y = x * xmin, y * ymin
    width, height = x * (xmax - xmin), y * (ymax - ymin)
    print('{}: {}, {}, {}, {}'.format(detect['label'], topleft_x, topleft_y, width, height))
    rect = patches.Rectangle((topleft_x, topleft_y), width, height, 
                             linewidth=1, edgecolor='green', facecolor='none')

    ax.add_patch(rect)
    color = 'green'
    plt.text(topleft_x, topleft_y, label, color=color)

plt.show()

Visualize an input image with boxes and labels (YOLO)

import matplotlib.image as mpimg
import matplotlib.patches as patches
import matplotlib.pyplot as plt
%matplotlib inline

img_np = mpimg.imread(image_files[1])  # replace with desired image index
image_boxes = bounding_boxes_batch[1]  # replace with desired image index

IMAGE_SIZE = (18, 12)
plt.figure(figsize=IMAGE_SIZE)
img = Image.fromarray(img_np.astype('uint8'), 'RGB')
x, y = img.size
print(img.size)

fig,ax = plt.subplots(1)
# Display the image
ax.imshow(img_np)

# Draw box and label for each detection 
for detect in image_boxes:
    label = detect['label']
    box = detect['box']
    ymin, xmin, ymax, xmax =  box['topY'], box['topX'], box['bottomY'], box['bottomX']
    topleft_x, topleft_y = x * xmin, y * ymin
    width, height = x * (xmax - xmin), y * (ymax - ymin)
    print('{}: {}, {}, {}, {}'.format(detect['label'], topleft_x, topleft_y, width, height))
    rect = patches.Rectangle((topleft_x, topleft_y), width, height, 
                             linewidth=1, edgecolor='green', facecolor='none')

    ax.add_patch(rect)
    color = 'green'
    plt.text(topleft_x, topleft_y, label, color=color)

plt.show()

Visualize a sample input image with masks and labels

import matplotlib.patches as patches
import matplotlib.pyplot as plt
%matplotlib inline

def display_detections(image, boxes, labels, scores, masks, resize_height, 
                       resize_width, classes, score_threshold):
    """Visualize boxes and masks
    
    :param image: raw image
    :type image: PIL image
    :param boxes: box with shape (No. of instances, 4) 
    :type boxes: ndarray 
    :param labels: classes with shape (No. of instances,) 
    :type labels: ndarray
    :param scores: scores with shape (No. of instances,)
    :type scores: ndarray
    :param masks: masks with shape (No. of instances, 1, HEIGHT, WIDTH) 
    :type masks:  ndarray
    :param resize_height: expected height of an input image in onnx model
    :type resize_height: Int
    :param resize_width: expected width of an input image in onnx model
    :type resize_width: Int
    :param classes: classes with shape (No. of classes) 
    :type classes:  list
    :param score_threshold: threshold on scores in the range of 0-1
    :type score_threshold: float
    :return: None
    """

    _, ax = plt.subplots(1, figsize=(12,9))

    image = np.array(image)
    original_height = image.shape[0]
    original_width = image.shape[1]

    for mask, box, label, score in zip(masks, boxes, labels, scores):        
        if score <= score_threshold:
            continue
        mask = mask[0, :, :, None]        
        # resize boxes to original raw input size
        box = [box[0]*original_width/resize_width, 
               box[1]*original_height/resize_height, 
               box[2]*original_width/resize_width, 
               box[3]*original_height/resize_height]
        
        mask = cv2.resize(mask, (image.shape[1], image.shape[0]), 0, 0, interpolation = cv2.INTER_NEAREST)
        # mask is a matrix with values in the range of [0,1]
        # higher values indicate presence of object and vice versa
        # select threshold or cut-off value to get objects present       
        mask = mask > score_threshold
        image_masked = image.copy()
        image_masked[mask] = (0, 255, 255)
        alpha = 0.5  # alpha blending with range 0 to 1
        cv2.addWeighted(image_masked, alpha, image, 1 - alpha,0, image)
        rect = patches.Rectangle((box[0], box[1]), box[2] - box[0], box[3] - box[1],\
                                 linewidth=1, edgecolor='b', facecolor='none')
        ax.annotate(classes[label] + ':' + str(np.round(score, 2)), (box[0], box[1]),\
                    color='w', fontsize=12)
        ax.add_patch(rect)
        
    ax.imshow(image)
    plt.show()

score_threshold = 0.5
img = Image.open(image_files[1])  # replace with desired image index
# Outputs for image index 1: each image contributes four consecutive outputs
# (boxes, labels, scores, masks), so image i uses predictions[4*i : 4*i + 4].
boxes, labels, scores, masks = predictions[4:8]
display_detections(img, boxes.copy(), labels, scores, masks.copy(), 
                   height_onnx, width_onnx, classes, score_threshold)
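
The `predictions[4:8]` slice selects the outputs for the image at index 1. A small helper can compute that slice for any image index. This is a sketch under the assumption, implied by the slice above, that each image contributes four consecutive outputs in the order boxes, labels, scores, masks; confirm the order from `output_names`. `get_instance_outputs` is a hypothetical name.

```python
def get_instance_outputs(predictions, image_index, outputs_per_image=4):
    """Return (boxes, labels, scores, masks) for one image in the batch."""
    start = image_index * outputs_per_image
    boxes, labels, scores, masks = predictions[start:start + outputs_per_image]
    return boxes, labels, scores, masks

# Stand-in for a batch of two images; strings replace the real arrays.
example_predictions = [f"{name}_{i}"
                       for i in range(2)
                       for name in ("boxes", "labels", "scores", "masks")]

print(get_instance_outputs(example_predictions, image_index=1))
```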

Next steps


Last updated on 2025-10-30
