Python

Get started with ONNX Runtime in Python

Below is a quick guide to installing the packages needed to use ONNX for model serialization and ONNX Runtime (ORT) for inference.

Install ONNX Runtime

There are two Python packages for ONNX Runtime. Only one of these packages should be installed at a time in any one environment. The GPU package encompasses most of the CPU functionality.
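Because only one of the two packages should be present in an environment, it can help to check what is already installed before running pip. The sketch below uses only the standard library's importlib.metadata; the helper name installed_ort_packages is hypothetical and is not part of ONNX Runtime.

```python
from importlib import metadata

# Distribution names as used by the pip commands in this guide.
ORT_DISTRIBUTIONS = ("onnxruntime", "onnxruntime-gpu")


def installed_ort_packages():
    """Return {distribution name: version} for each ORT package found."""
    found = {}
    for name in ORT_DISTRIBUTIONS:
        try:
            found[name] = metadata.version(name)
        except metadata.PackageNotFoundError:
            continue  # this distribution is not installed
    return found


found = installed_ort_packages()
if len(found) > 1:
    print("Both packages are installed; uninstall one of them:", found)
elif found:
    print("Installed:", found)
else:
    print("No ONNX Runtime package is installed in this environment.")
```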

Install ONNX Runtime CPU

Use the CPU package if you are running on Arm®-based CPUs or macOS.

pip install onnxruntime

Install ONNX Runtime GPU (CUDA 12.x)

The default CUDA version for ORT is 12.x.

pip install onnxruntime-gpu

Install ONNX Runtime GPU (CUDA 11.8)

For CUDA 11.8, use the following command to install from the ORT Azure DevOps feed.

pip install onnxruntime-gpu --extra-index-url https://aiinfra.pkgs.visualstudio.com/PublicPackages/_packaging/onnxruntime-cuda-11/pypi/simple/
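After installing, it can be useful to confirm which execution providers the installed package exposes. The sketch below is only an illustration: pick_providers is a hypothetical helper, and it assumes the GPU package reports CUDAExecutionProvider when CUDA is usable. The only ORT call used is onnxruntime.get_available_providers().

```python
def pick_providers(available):
    """Prefer CUDA when it is offered, and keep the CPU provider as a fallback."""
    preferred = ["CUDAExecutionProvider", "CPUExecutionProvider"]
    return [p for p in preferred if p in available]


try:
    import onnxruntime as ort
except ImportError:
    print("onnxruntime is not installed")
else:
    available = ort.get_available_providers()
    print("Available providers:", available)
    print("Providers to request:", pick_providers(available))
```

The resulting list can be passed as the providers argument when creating an InferenceSession.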

Install ONNX for model export

## ONNX is built into PyTorch
pip install torch

## tensorflow
pip install tf2onnx

## sklearn
pip install skl2onnx

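Quickstart examples for PyTorch, TensorFlow, and SciKit Learn

Train a model using your favorite framework, export it to ONNX format, and run inference in any supported ONNX Runtime language.

PyTorch CV

This example shows how to export a PyTorch CV model to ONNX format and then run inference with ORT. The code to create the model is from the PyTorch Fundamentals learning path on Microsoft Learn.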

Export the model using torch.onnx.export

import torch

torch.onnx.export(model,                                # model being run
                  torch.randn(1, 28, 28).to(device),    # model input (or a tuple for multiple inputs)
                  "fashion_mnist_model.onnx",           # where to save the model (can be a file or file-like object)
                  input_names = ['input'],              # the model's input names
                  output_names = ['output'])            # the model's output names

Load the ONNX model with onnx.load

import onnx
onnx_model = onnx.load("fashion_mnist_model.onnx")
onnx.checker.check_model(onnx_model)

Create an inference session using ort.InferenceSession

import onnxruntime as ort
import numpy as np
x, y = test_data[0][0], test_data[0][1]
ort_sess = ort.InferenceSession('fashion_mnist_model.onnx')
outputs = ort_sess.run(None, {'input': x.numpy()})

# Print the result
predicted, actual = classes[outputs[0][0].argmax(0)], classes[y]
print(f'Predicted: "{predicted}", Actual: "{actual}"')
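The session output holds one row of raw scores per sample, so argmax picks the winning class. Assuming those scores are unnormalized logits (which depends on how the model's final layer is defined), a minimal sketch of turning one row into ranked class probabilities with NumPy could look like this; top_k is a hypothetical helper, and the scores and the shortened label list are made up for illustration.

```python
import numpy as np


def top_k(logits, labels, k=3):
    """Return the k most likely (label, probability) pairs for one row of scores."""
    logits = np.asarray(logits, dtype=np.float64)
    # Subtract the maximum before exponentiating for numerical stability.
    exp = np.exp(logits - logits.max())
    probs = exp / exp.sum()
    order = np.argsort(probs)[::-1][:k]
    return [(labels[i], float(probs[i])) for i in order]


# Hypothetical scores for a four-class model.
example_labels = ["T-shirt/top", "Trouser", "Pullover", "Dress"]
example_logits = [2.0, 0.5, -1.0, 0.1]
for label, prob in top_k(example_logits, example_labels, k=2):
    print(f"{label}: {prob:.3f}")
```

With the session output above, a call such as top_k(outputs[0][0], classes) would rank the classes for the sample.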

PyTorch NLP

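This example shows how to export a PyTorch NLP model to ONNX format and then run inference with ORT. The code to create the AG News model is from this PyTorch tutorial.

Process the text and create the sample input data and offsets for export.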

import torch
text = "Text from the news article"
text = torch.tensor(text_pipeline(text))
offsets = torch.tensor([0])
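The offsets tensor marks where each sequence starts inside a flat tensor of token ids, which is why a single sample uses offsets of [0]. This assumes the model pools tokens per sequence in the style of nn.EmbeddingBag, as the referenced tutorial does. The sketch below shows that layout for several sequences using plain Python lists; flatten_with_offsets is a hypothetical helper and the token ids are made up.

```python
from itertools import accumulate


def flatten_with_offsets(sequences):
    """Concatenate token-id sequences and record each sequence's start index."""
    flat = [token for seq in sequences for token in seq]
    lengths = [len(seq) for seq in sequences]
    # Each offset is the running total of the lengths of the preceding sequences.
    offsets = [0] + list(accumulate(lengths))[:-1] if sequences else []
    return flat, offsets


flat, offsets = flatten_with_offsets([[4, 8, 15], [16, 23], [42]])
print(flat)     # [4, 8, 15, 16, 23, 42]
print(offsets)  # [0, 3, 5]
```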

Export the model

# Export the model
torch.onnx.export(model,                     # model being run
                  (text, offsets),           # model input (or a tuple for multiple inputs)
                  "ag_news_model.onnx",      # where to save the model (can be a file or file-like object)
                  export_params=True,        # store the trained parameter weights inside the model file
                  opset_version=10,          # the ONNX opset version to export the model to
                  do_constant_folding=True,  # whether to execute constant folding for optimization
                  input_names = ['input', 'offsets'],   # the model's input names
                  output_names = ['output'], # the model's output names
                  dynamic_axes={'input' : {0 : 'batch_size'},    # variable-length axes
                                'output' : {0 : 'batch_size'}})

Load the model using onnx.load

import onnx
onnx_model = onnx.load("ag_news_model.onnx")
onnx.checker.check_model(onnx_model)

Create an inference session with ort.InferenceSession

import onnxruntime as ort
import numpy as np
ort_sess = ort.InferenceSession('ag_news_model.onnx')
outputs = ort_sess.run(None, {'input': text.numpy(),
                              'offsets':  torch.tensor([0]).numpy()})
# Print the result
result = outputs[0].argmax(axis=1)+1
print("This is a %s news" %ag_news_label[result[0]])
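The + 1 in the snippet above shifts the zero-based argmax index to the one-based keys of the label dictionary. The sketch below illustrates this with made-up scores; the ag_news_label mapping shown here is an assumption meant to mirror the one defined in the PyTorch tutorial.

```python
import numpy as np

# Assumed to mirror the tutorial's mapping; note that the keys start at 1.
ag_news_label = {1: "World", 2: "Sports", 3: "Business", 4: "Sci/Tec"}

# Hypothetical model output: one row of scores per input sample.
scores = np.array([[0.1, 2.3, -0.4, 0.7]])

class_index = scores.argmax(axis=1)  # zero-based position of the best score
label_key = class_index + 1          # one-based key used by ag_news_label
print(f"This is a {ag_news_label[int(label_key[0])]} news")
```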

TensorFlow CV

This example shows how to export a TensorFlow CV model to ONNX format and then run inference with ORT. The model used is from this GitHub notebook for Keras resnet50.

Get the pretrained model

import os
import tensorflow as tf
from tensorflow.keras.applications.resnet50 import ResNet50, decode_predictions
import onnxruntime

model = ResNet50(weights='imagenet')

# x is assumed to be a preprocessed image batch of shape (1, 224, 224, 3), prepared as in the referenced notebook
preds = model.predict(x)
print('Keras Predicted:', decode_predictions(preds, top=3)[0])
model.save(os.path.join("/tmp", model.name))

Convert the model to ONNX and export it

import tf2onnx
import onnxruntime as rt

spec = (tf.TensorSpec((None, 224, 224, 3), tf.float32, name="input"),)
output_path = model.name + ".onnx"

model_proto, _ = tf2onnx.convert.from_keras(model, input_signature=spec, opset=13, output_path=output_path)
output_names = [n.name for n in model_proto.graph.output]

Create an inference session with rt.InferenceSession

providers = ['CPUExecutionProvider']
m = rt.InferenceSession(output_path, providers=providers)
onnx_pred = m.run(output_names, {"input": x})

print('ONNX Predicted:', decode_predictions(onnx_pred[0], top=3)[0])
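A common sanity check after conversion is to confirm that the ONNX Runtime output agrees with the original framework's output within a tolerance. The sketch below uses synthetic arrays in place of preds and onnx_pred[0]; the helper name outputs_match and the tolerance values are assumptions to adjust for your model.

```python
import numpy as np


def outputs_match(reference, candidate, rtol=1e-4, atol=1e-5):
    """Return True when both arrays have the same shape and close values."""
    reference = np.asarray(reference)
    candidate = np.asarray(candidate)
    if reference.shape != candidate.shape:
        return False
    return bool(np.allclose(reference, candidate, rtol=rtol, atol=atol))


# Synthetic stand-ins for the Keras and ONNX Runtime predictions.
keras_like = np.array([[0.70, 0.20, 0.10]], dtype=np.float32)
onnx_like = keras_like + np.float32(1e-6)
print(outputs_match(keras_like, onnx_like))          # True
print(outputs_match(keras_like, keras_like[:, :2]))  # False
```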

SciKit Learn CV

This example shows how to export a SciKit Learn CV model to ONNX format and then run inference with ORT. It uses the well-known iris dataset.

from sklearn.datasets import load_iris
from sklearn.model_selection import train_test_split
iris = load_iris()
X, y = iris.data, iris.target
X_train, X_test, y_train, y_test = train_test_split(X, y)

from sklearn.linear_model import LogisticRegression
clr = LogisticRegression()
clr.fit(X_train, y_train)
print(clr)

LogisticRegression()

Convert or export the model to ONNX format

from skl2onnx import convert_sklearn
from skl2onnx.common.data_types import FloatTensorType

initial_type = [('float_input', FloatTensorType([None, 4]))]
onx = convert_sklearn(clr, initial_types=initial_type)
with open("logreg_iris.onnx", "wb") as f:
    f.write(onx.SerializeToString())

Load and run the model using ONNX Runtime

We will use ONNX Runtime to compute the predictions for this machine learning model.

import numpy
import onnxruntime as rt

sess = rt.InferenceSession("logreg_iris.onnx")
input_name = sess.get_inputs()[0].name
pred_onx = sess.run(None, {input_name: X_test.astype(numpy.float32)})[0]
print(pred_onx)

OUTPUT:
 [0 1 0 0 1 2 2 0 0 2 1 0 2 2 1 1 2 2 2 0 2 2 1 2 1 1 1 0 2 1 1 1 1 0 1 0 0
  1]
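The cast to numpy.float32 matters because the model was converted with FloatTensorType([None, 4]), while the iris data loaded by scikit-learn is float64. A minimal sketch of preparing and validating a batch before calling sess.run follows; prepare_input is a hypothetical helper and the sample rows are illustrative.

```python
import numpy as np


def prepare_input(X, n_features=4):
    """Return X as a contiguous float32 array of shape (batch, n_features)."""
    X = np.asarray(X)
    if X.ndim == 1:
        X = X.reshape(1, -1)  # treat a single sample as a batch of one
    if X.ndim != 2 or X.shape[1] != n_features:
        raise ValueError(f"expected shape (batch, {n_features}), got {X.shape}")
    return np.ascontiguousarray(X, dtype=np.float32)


batch = prepare_input([[5.1, 3.5, 1.4, 0.2], [6.2, 2.9, 4.3, 1.3]])
print(batch.dtype, batch.shape)  # float32 (2, 4)
```

The leading None in FloatTensorType([None, 4]) leaves the batch dimension dynamic, so any number of rows with four features is accepted.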

Get the predicted class

The code can be changed to get one specific output by specifying its name in a list.

import numpy
import onnxruntime as rt

sess = rt.InferenceSession("logreg_iris.onnx")
input_name = sess.get_inputs()[0].name
label_name = sess.get_outputs()[0].name
pred_onx = sess.run(
    [label_name], {input_name: X_test.astype(numpy.float32)})[0]
print(pred_onx)

Python API Reference Docs

Go to the ORT Python API Docs

Builds

If using pip, run pip install --upgrade pip prior to downloading.

Artifact	Description	Supported Platforms
onnxruntime	CPU (Release)	Windows (x64), Linux (x64, ARM64), Mac (x64)
nightly	CPU (Dev)	Same as above
onnxruntime-gpu	GPU (Release)	Windows (x64), Linux (x64, ARM64)
onnxruntime-gpu for CUDA 11.*	GPU (Dev)	Windows (x64), Linux (x64, ARM64)
onnxruntime-gpu for CUDA 12.*	GPU (Dev)	Windows (x64), Linux (x64, ARM64)

Example of installing onnxruntime-gpu for CUDA 11.*:

python -m pip install onnxruntime-gpu --extra-index-url=https://aiinfra.pkgs.visualstudio.com/PublicPackages/_packaging/ort-cuda-11-nightly/pypi/simple/

Example of installing onnxruntime-gpu for CUDA 12.*:

python -m pip install onnxruntime-gpu --pre --extra-index-url=https://aiinfra.pkgs.visualstudio.com/PublicPackages/_packaging/ORT-Nightly/pypi/simple/

For notes on Python compiler versions, see this page.

Learn More

Python Tutorials