
Plugin EP libraries

Plugin Execution Provider Libraries


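An ONNX Runtime execution provider (EP) executes model operations on one or more hardware accelerators (e.g., GPU, NPU). ONNX Runtime provides a variety of built-in EPs, such as the default CPU EP. To enable further extensibility, ONNX Runtime supports user-defined plugin EP libraries that an application can register with ONNX Runtime for use in an ONNX Runtime inference session.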

This page provides a reference for the APIs needed to develop and use plugin EP libraries with ONNX Runtime.

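A plugin EP is built as a dynamic/shared library that exports the functions CreateEpFactories() and ReleaseEpFactory(). ONNX Runtime calls CreateEpFactories() to obtain one or more OrtEpFactory instances. An OrtEpFactory creates OrtEp instances and specifies the hardware devices supported by the EPs it creates.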

The ONNX Runtime repository includes a sample plugin EP library, which is referenced in the following sections.

An OrtEp represents an instance of an EP that is used by an ONNX Runtime session to identify and execute the model operations supported by the EP.

The following table lists the required variables and functions that an implementer must define for an OrtEp.

Field Summary Example implementation
ort_version_supported The ONNX Runtime version with which the EP was compiled. Implementation should set this to ORT_API_VERSION. ExampleEp()
GetName Get the execution provider name. ExampleEp::GetNameImpl()
GetCapability Get information about the nodes/subgraphs supported by the OrtEp instance. ExampleEp::GetCapabilityImpl()
Compile Compile the OrtGraph instances assigned to the OrtEp. Implementation must set an OrtNodeComputeInfo instance for each OrtGraph in order to define its computation function.

If the session is configured to generate a pre-compiled model, the execution provider must return count number of EPContext nodes.
ExampleEp::CompileImpl()
ReleaseNodeComputeInfos Release OrtNodeComputeInfo instances. ExampleEp::ReleaseNodeComputeInfosImpl()
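
The `Impl` names in the example column suggest a common pattern for C-style interfaces: the interface is a struct of function pointers, and a C++ class derives from it and points each field at a static member function in its constructor. The following is a minimal sketch of that pattern only. `MockEp`, `SketchEp`, and `kMockApiVersion` are hypothetical stand-ins for OrtEp, the sample EP, and ORT_API_VERSION; the real OrtEp has more fields and different signatures.

```cpp
#include <cassert>
#include <cstdint>
#include <string>
#include <utility>

// Hypothetical stand-ins for ORT_API_VERSION and OrtEp (assumptions for
// illustration; the real definitions live in the ONNX Runtime headers).
constexpr uint32_t kMockApiVersion = 1;

struct MockEp {
  uint32_t ort_version_supported;
  const char* (*GetName)(const MockEp* this_ptr);
};

// The EP derives from the C struct so that a MockEp* handed to the runtime
// can be cast back to the concrete type inside each callback.
class SketchEp : public MockEp {
 public:
  explicit SketchEp(std::string name) : MockEp{}, name_(std::move(name)) {
    ort_version_supported = kMockApiVersion;  // version the EP was built against
    GetName = GetNameImpl;                    // required callback
  }

 private:
  static const char* GetNameImpl(const MockEp* this_ptr) {
    assert(this_ptr != nullptr);
    return static_cast<const SketchEp*>(this_ptr)->name_.c_str();
  }

  std::string name_;
};
```

A caller that only sees the base struct would invoke the callback as `ep->GetName(ep)`, passing the struct pointer back as the first argument.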

The following table lists the optional functions that an implementer may define for an OrtEp. If an optional OrtEp function is not defined, ONNX Runtime uses a default implementation.

Field Summary Example implementation
GetPreferredDataLayout Get the EP's preferred data layout.

If this function is not implemented, ORT assumes that the EP prefers the data layout OrtEpDataLayout::NCHW.
ShouldConvertDataLayoutForOp Given an op with domain domain and type op_type, determine whether an associated node's data layout should be converted to target_data_layout. If the EP prefers a non-default data layout, this function is called during layout transformation with target_data_layout set to the EP's preferred data layout.

Implementation of this function is optional. If an EP prefers a non-default data layout, it may implement this to customize the specific op data layout preferences at a finer granularity.
SetDynamicOptions Set dynamic options on this EP. Dynamic options can be set by the application at any time after session creation with OrtApi::SetEpDynamicOptions().

Implementation of this function is optional. An EP should only implement this function if it needs to handle any dynamic options.
OnRunStart Called by ORT to notify the EP of the start of a run.

Implementation of this function is optional. An EP should only implement this function if it needs to handle application-provided options at the start of a run.
OnRunEnd Called by ORT to notify the EP of the end of a run.

Implementation of this function is optional. An EP should only implement this function if it needs to handle application-provided options at the end of a run.
CreateAllocator Create an OrtAllocator for the given OrtMemoryInfo for an OrtSession.

The OrtMemoryInfo instance will match one of the values set in the OrtEpDevice using EpDevice_AddAllocatorInfo. Any allocator specific options should be read from the session options.

Implementation of this function is optional. If not provided, ORT will use `OrtEpFactory::CreateAllocator()`.
CreateSyncStreamForDevice Create a synchronization stream for the given memory device for an OrtSession.

This is used to create a synchronization stream for the execution provider and is used to synchronize operations on the device during model execution. Any stream specific options should be read from the session options.

Implementation of this function is optional. If not provided, ORT will use `OrtEpFactory::CreateSyncStreamForDevice()`.
GetCompiledModelCompatibilityInfo Get a string with details about the EP stack used to produce a compiled model.

The compatibility information string can be used with OrtEpFactory::ValidateCompiledModelCompatibilityInfo to determine if a compiled model is compatible with the EP.
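
To make the "optional function" idea concrete, the sketch below shows one way a host could treat an unset callback as a request for the default behavior, using the documented NCHW default for GetPreferredDataLayout. The types `MockDataLayout` and `MockLayoutEp` and the helper `PreferredLayoutOrDefault` are hypothetical; this is not ONNX Runtime's actual implementation.

```cpp
#include <cassert>

// Hypothetical stand-ins for OrtEpDataLayout and OrtEp (assumptions).
enum class MockDataLayout { NCHW, NHWC };

struct MockLayoutEp {
  // Optional callback: left as nullptr when the EP does not implement it.
  MockDataLayout (*GetPreferredDataLayout)(const MockLayoutEp* this_ptr) = nullptr;
};

// Returns the EP's preferred layout, or NCHW when the callback is not set.
inline MockDataLayout PreferredLayoutOrDefault(const MockLayoutEp& ep) {
  if (ep.GetPreferredDataLayout == nullptr) {
    return MockDataLayout::NCHW;
  }
  return ep.GetPreferredDataLayout(&ep);
}

// Example callback for an EP that prefers a channel-last layout.
inline MockDataLayout PreferNhwc(const MockLayoutEp* this_ptr) {
  assert(this_ptr != nullptr);
  return MockDataLayout::NHWC;
}
```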

An OrtEpFactory represents an instance of an EP factory that is used by an ONNX Runtime session to query device support, create allocators, create data transfer objects, and create instances of an EP (i.e., an OrtEp).

The following table lists the required variables and functions that an implementer must define for an OrtEpFactory.

Field Summary Example implementation
ort_version_supported The ONNX Runtime version with which the EP was compiled. Implementation should set this to ORT_API_VERSION. ExampleEpFactory()
GetName Get the name of the EP that the factory creates. Must match OrtEp::GetName(). ExampleEpFactory::GetNameImpl()
GetVendor Get the name of the vendor that owns the EP that the factory creates. ExampleEpFactory::GetVendor()
GetVendorId Get the vendor ID of the vendor that owns the EP that the factory creates. This is typically the PCI vendor ID. ExampleEpFactory::GetVendorId()
GetVersion Get the version of the EP that the factory creates. The version string should adhere to the Semantic Versioning 2.0 specification. ExampleEpFactory::GetVersionImpl()
GetSupportedDevices Get information about the OrtHardwareDevice instances supported by an EP created by the factory. ExampleEpFactory::GetSupportedDevicesImpl()
CreateEp Creates an OrtEp instance for use in an ONNX Runtime session. ORT calls OrtEpFactory::ReleaseEp() to release the instance. ExampleEpFactory::CreateEpImpl()
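
Two of the requirements above are easy to get wrong: the factory name must match the name of the EPs it creates, and every EP created by the factory is released through the same factory. The sketch below illustrates both with hypothetical classes (`NamedEp`, `NamedEpFactory`) that only mirror the shape of OrtEp and OrtEpFactory; the real interfaces are C structs with different signatures.

```cpp
#include <cassert>
#include <string>
#include <utility>

// Hypothetical stand-in for OrtEp.
class NamedEp {
 public:
  explicit NamedEp(std::string name) : name_(std::move(name)) {}
  const std::string& GetName() const { return name_; }

 private:
  std::string name_;
};

// Hypothetical stand-in for OrtEpFactory.
class NamedEpFactory {
 public:
  NamedEpFactory(std::string ep_name, std::string version)
      : ep_name_(std::move(ep_name)), version_(std::move(version)) {
    assert(!ep_name_.empty());
  }

  const std::string& GetName() const { return ep_name_; }
  const std::string& GetVersion() const { return version_; }

  // Creates an EP that reports the same name as the factory.
  // Ownership stays with the caller until ReleaseEp() is called.
  NamedEp* CreateEp() const { return new NamedEp(ep_name_); }

  // Releases an EP previously returned by CreateEp().
  void ReleaseEp(NamedEp* ep) const { delete ep; }

 private:
  std::string ep_name_;
  std::string version_;  // expected to follow Semantic Versioning 2.0
};
```

Raw `new`/`delete` is used here on purpose, to mirror the create/release pairing of a C interface.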

The following table lists the optional functions that an implementer may define for an OrtEpFactory.

Field Summary Example implementation
ValidateCompiledModelCompatibilityInfo Validate the compatibility of a compiled model with the EP.

This function validates whether a model produced with the supplied compatibility information string is supported by the underlying EP. The implementation should check whether a compiled model is compatible with the EP and return the appropriate OrtCompiledModelCompatibility value.
CreateAllocator Create an OrtAllocator that can be shared across sessions for the given OrtMemoryInfo.

The factory that creates the EP is responsible for providing the allocators required by the EP. The OrtMemoryInfo instance will match one of the values set in the OrtEpDevice using EpDevice_AddAllocatorInfo.
ExampleEpFactory::CreateAllocatorImpl()
ReleaseAllocator Releases an OrtAllocator instance created by the factory. ExampleEpFactory::ReleaseAllocatorImpl()
CreateDataTransfer Creates an OrtDataTransferImpl instance for the factory.

An OrtDataTransferImpl can be used to copy data between devices that the EP supports.
ExampleEpFactory::CreateDataTransferImpl()
IsStreamAware Returns true if the EPs created by the factory are stream-aware. ExampleEpFactory::IsStreamAwareImpl()
CreateSyncStreamForDevice Creates a synchronization stream for the given OrtMemoryDevice.

This is used to create a synchronization stream for the OrtMemoryDevice that can be used for operations outside of a session.
ExampleEpFactory::CreateSyncStreamForDeviceImpl()
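
GetCompiledModelCompatibilityInfo and ValidateCompiledModelCompatibilityInfo form a round trip: one side writes a string describing the EP stack, the other reads it back and classifies the model. The sketch below shows one possible scheme. Everything in it is an assumption made for illustration: the `name;major.minor` string format, the version rules, and the `MockCompatibility` enum (a stand-in for OrtCompiledModelCompatibility, whose actual values should be taken from the ONNX Runtime headers).

```cpp
#include <cassert>
#include <cstdio>
#include <string>

// Hypothetical stand-in for OrtCompiledModelCompatibility.
enum class MockCompatibility {
  NotApplicable,                 // string was produced by a different EP
  SupportedOptimal,              // same stack version
  SupportedPreferRecompilation,  // older but still loadable
  Unsupported
};

// Hypothetical description of the EP stack that compiled a model.
struct SketchEpStack {
  std::string ep_name;
  int major_version;
  int minor_version;
};

// Producer side: encode the stack as "name;major.minor" (assumed format).
inline std::string MakeCompatibilityInfo(const SketchEpStack& stack) {
  return stack.ep_name + ";" + std::to_string(stack.major_version) + "." +
         std::to_string(stack.minor_version);
}

// Consumer side: classify a compiled model against the current stack.
inline MockCompatibility ValidateCompatibilityInfo(const SketchEpStack& current,
                                                   const std::string& info) {
  const std::string prefix = current.ep_name + ";";
  if (info.compare(0, prefix.size(), prefix) != 0) {
    return MockCompatibility::NotApplicable;
  }
  int model_major = 0;
  int model_minor = 0;
  if (std::sscanf(info.c_str() + prefix.size(), "%d.%d", &model_major, &model_minor) != 2) {
    return MockCompatibility::Unsupported;  // malformed version
  }
  if (model_major != current.major_version || model_minor > current.minor_version) {
    return MockCompatibility::Unsupported;
  }
  return model_minor == current.minor_version
             ? MockCompatibility::SupportedOptimal
             : MockCompatibility::SupportedPreferRecompilation;
}
```

A real EP would likely encode more than a version number (driver, compiler, or hardware generation, for example); the string is opaque to ONNX Runtime, so its format is up to the EP.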

Exporting functions to create and release factories


ONNX Runtime expects a plugin EP library to export certain functions/symbols. The following table lists the functions that have to be exported from the plugin EP library.

Function Description Example implementation
CreateEpFactories ONNX Runtime calls this function to create OrtEpFactory instances. ExampleEp: CreateEpFactories
ReleaseEpFactory ONNX Runtime calls this function to release an OrtEpFactory instance. ExampleEp: ReleaseEpFactory
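
The sketch below illustrates the general shape of such a pair of exported entry points: C linkage, an export attribute, a caller-provided output array with a capacity, and a matching release function. It is deliberately simplified and uses hypothetical names (`SketchCreateEpFactories`, `SketchReleaseEpFactory`, `MockEpFactory`, `MockStatus`). The real CreateEpFactories and ReleaseEpFactory take ONNX Runtime types and have the signatures declared in the ONNX Runtime headers, which should be consulted for the exact parameters.

```cpp
#include <cassert>
#include <cstddef>
#include <string>

// Hypothetical stand-ins for OrtEpFactory and OrtStatus (assumptions).
struct MockEpFactory {
  std::string ep_name;
};
enum class MockStatus { Ok, InvalidArgument };

// Make the symbols visible to whoever loads the shared library.
#if defined(_WIN32)
#define SKETCH_EXPORT __declspec(dllexport)
#else
#define SKETCH_EXPORT __attribute__((visibility("default")))
#endif

extern "C" {

// Fills `factories` with up to `max_factories` instances and reports how many
// were written. This sketch always creates exactly one factory.
SKETCH_EXPORT MockStatus SketchCreateEpFactories(const char* registration_name,
                                                 MockEpFactory** factories,
                                                 std::size_t max_factories,
                                                 std::size_t* num_factories) {
  if (registration_name == nullptr || factories == nullptr ||
      num_factories == nullptr || max_factories < 1) {
    return MockStatus::InvalidArgument;
  }
  factories[0] = new MockEpFactory{registration_name};
  *num_factories = 1;
  return MockStatus::Ok;
}

// Releases a factory previously returned by SketchCreateEpFactories().
SKETCH_EXPORT void SketchReleaseEpFactory(MockEpFactory* factory) {
  delete factory;
}

}  // extern "C"
```

Keeping allocation and release inside the library means the memory is always freed by the same runtime that allocated it, which matters when the host and the plugin are built with different toolchains.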

The sample application code below uses the following API functions to register and unregister a plugin EP library.

const char* lib_registration_name = "ep_lib_name";
Ort::Env env;

// Register plugin EP library with ONNX Runtime.
env.RegisterExecutionProviderLibrary(
    lib_registration_name,   // Registration name can be anything the application chooses.
    ORT_TSTR("ep_path.dll")  // Path to the plugin EP library.
);

{
  Ort::Session session(env, /*...*/);
  // Run a model ...
}

// Unregister the library using the application-specified registration name.
// Must only unregister a library after all sessions that use the library have been released.
env.UnregisterExecutionProviderLibrary(lib_registration_name);
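
The ordering constraint (sessions released first, library unregistered afterwards) can be enforced by scope rather than by convention. The following sketch uses a small, generic scope guard to show the idea; `ScopeExit` and `DemoLifetimeOrder` are hypothetical helpers, and the log strings stand in for the real ONNX Runtime calls so that the example is self-contained.

```cpp
#include <cassert>
#include <functional>
#include <string>
#include <utility>
#include <vector>

// Runs a callback when the guard goes out of scope.
class ScopeExit {
 public:
  explicit ScopeExit(std::function<void()> on_exit) : on_exit_(std::move(on_exit)) {}
  ~ScopeExit() {
    if (on_exit_) on_exit_();
  }
  ScopeExit(const ScopeExit&) = delete;
  ScopeExit& operator=(const ScopeExit&) = delete;

 private:
  std::function<void()> on_exit_;
};

// Records the order of events; each string stands in for a real API call.
inline void DemoLifetimeOrder(std::vector<std::string>& log) {
  log.push_back("register library");
  ScopeExit unregister([&log] { log.push_back("unregister library"); });
  {
    log.push_back("create session");
    ScopeExit release_session([&log] { log.push_back("release session"); });
    log.push_back("run model");
  }  // the session is released here, before the outer guard runs
}
```

Because locals are destroyed in reverse order of construction, a guard declared right after registration always runs after every session created later in the same scope. In real code the callback runs inside a destructor, so it should not let exceptions escape.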

As shown in the following sequence diagram, registering a plugin EP library causes ONNX Runtime to load the library and call the library’s CreateEpFactories() function. During the call to CreateEpFactories(), ONNX Runtime determines the subset of hardware devices supported by each factory by calling OrtEpFactory::GetSupportedDevices() with all hardware devices that ONNX Runtime discovered during initialization.

The factory returns OrtEpDevice instances from OrtEpFactory::GetSupportedDevices(). Each OrtEpDevice instance pairs a factory with a hardware device that the factory supports. For example, if a single factory instance supports both CPU and NPU, then the call to OrtEpFactory::GetSupportedDevices() returns two OrtEpDevice instances:

  • ep_device_0: (factory_0, CPU)
  • ep_device_1: (factory_0, NPU)
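
The pairing described above can be sketched as a filter over the discovered hardware devices. All types below (`MockDeviceType`, `MockHardwareDevice`, `MockEpDevice`, `MockFactory`) are hypothetical stand-ins for the ONNX Runtime types, and the real OrtEpFactory::GetSupportedDevices() uses a C-style signature with output arrays rather than returning a vector.

```cpp
#include <algorithm>
#include <cassert>
#include <string>
#include <vector>

enum class MockDeviceType { CPU, GPU, NPU };

// Hypothetical stand-in for OrtHardwareDevice.
struct MockHardwareDevice {
  MockDeviceType type;
  std::string vendor;
};

// Hypothetical stand-in for OrtEpDevice: pairs a factory with one device.
struct MockEpDevice {
  std::string factory_name;
  const MockHardwareDevice* device;
};

// Hypothetical stand-in for OrtEpFactory.
struct MockFactory {
  std::string name;
  std::vector<MockDeviceType> supported_types;

  // Given every discovered device, return one pairing per supported device.
  std::vector<MockEpDevice> GetSupportedDevices(
      const std::vector<MockHardwareDevice>& all_devices) const {
    std::vector<MockEpDevice> result;
    for (const MockHardwareDevice& device : all_devices) {
      const bool supported = std::find(supported_types.begin(), supported_types.end(),
                                       device.type) != supported_types.end();
      if (supported) {
        result.push_back({name, &device});
      }
    }
    return result;
  }
};
```

With a CPU, a GPU, and an NPU discovered, a factory that supports CPU and NPU yields two pairings, matching the ep_device_0 and ep_device_1 entries in the list above.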

Sequence diagram showing registration and unregistration of a plugin EP library

Session creation with explicit OrtEpDevice(s)


The application code below uses the API function SessionOptionsAppendExecutionProvider_V2 to add an EP from a library to an ONNX Runtime session.

The application first calls GetEpDevices to get a list of OrtEpDevices available to the application. Each OrtEpDevice represents a hardware device supported by an OrtEpFactory. The SessionOptionsAppendExecutionProvider_V2 function takes an array of OrtEpDevice instances as input, where all OrtEpDevice instances refer to the same OrtEpFactory.

Ort::Env env;
env.RegisterExecutionProviderLibrary(/*...*/);

{
  std::vector<Ort::ConstEpDevice> ep_devices = env.GetEpDevices();

  // Find the Ort::EpDevice for "my_ep".
  std::array<Ort::ConstEpDevice, 1> selected_ep_devices = { nullptr };
  for (Ort::ConstEpDevice ep_device : ep_devices) {
    if (std::strcmp(ep_device.GetName(), "my_ep") == 0) {
      selected_ep_devices[0] = ep_device;
      break;
    }
  }

  if (selected_ep_devices[0] == nullptr) {
    // Did not find EP. Report application error ...
  }

  Ort::KeyValuePairs ep_options(/*...*/);  // Optional EP options.
  Ort::SessionOptions session_options;
  session_options.AppendExecutionProvider_V2(env, selected_ep_devices, ep_options);

  Ort::Session session(env, ORT_TSTR("model.onnx"), session_options);
  // Run model ...
}

env.UnregisterExecutionProviderLibrary(/*...*/);

As shown in the following sequence diagram, ONNX Runtime calls OrtEpFactory::CreateEp() during session creation in order to create an instance of the plugin EP.


Sequence diagram showing session creation with explicit ep devices

Session creation with automatic EP selection


The application code below uses the API function SessionOptionsSetEpSelectionPolicy to have ONNX Runtime automatically select an EP based on the user’s policy (e.g., PREFER_NPU). If the plugin EP library registered with ONNX Runtime has a factory that supports NPU, then ONNX Runtime may select an EP from that factory to run the model.

Ort::Env env;
env.RegisterExecutionProviderLibrary(/*...*/);

{
  Ort::SessionOptions session_options;
  session_options.SetEpSelectionPolicy(OrtExecutionProviderDevicePolicy::PREFER_NPU);

  Ort::Session session(env, ORT_TSTR("model.onnx"), session_options);
  // Run model ...
}

env.UnregisterExecutionProviderLibrary(/*...*/);
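
As a rough mental model of what a "prefer" policy does, the sketch below picks the first device of the preferred type and otherwise falls back to a CPU device. This is an assumption made for illustration, not ONNX Runtime's actual selection algorithm, and `MockPolicy`, `MockSelectableDevice`, and `SelectEpDevice` are hypothetical names.

```cpp
#include <cassert>
#include <string>
#include <vector>

enum class MockHwType { CPU, GPU, NPU };
enum class MockPolicy { PreferCpu, PreferGpu, PreferNpu };

// Hypothetical stand-in for an OrtEpDevice: an EP name plus a device type.
struct MockSelectableDevice {
  std::string ep_name;
  MockHwType type;
};

// Returns the first device of the preferred type, else the first CPU device,
// else nullptr when nothing suitable is available.
inline const MockSelectableDevice* SelectEpDevice(
    const std::vector<MockSelectableDevice>& devices, MockPolicy policy) {
  MockHwType preferred = MockHwType::CPU;
  if (policy == MockPolicy::PreferGpu) preferred = MockHwType::GPU;
  if (policy == MockPolicy::PreferNpu) preferred = MockHwType::NPU;

  const MockSelectableDevice* cpu_fallback = nullptr;
  for (const MockSelectableDevice& device : devices) {
    if (device.type == preferred) {
      return &device;
    }
    if (device.type == MockHwType::CPU && cpu_fallback == nullptr) {
      cpu_fallback = &device;
    }
  }
  return cpu_fallback;
}
```

Under this model, registering a plugin library whose factory supports an NPU makes that EP eligible under the NPU-preferring policy, while a system without an NPU still runs the model on the CPU EP.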

Sequence diagram showing session creation with automatic EP selection

API header files:

  • onnxruntime_ep_c_api.h
    • Defines interfaces implemented by plugin EP and EP factory instances.
    • Provides APIs utilized by plugin EP and EP factory instances.
  • onnxruntime_c_api.h
    • Provides APIs used to traverse an input model graph.
Type Description
OrtHardwareDeviceType Enumerates classes of hardware devices:
  • OrtHardwareDeviceType_CPU
  • OrtHardwareDeviceType_GPU
  • OrtHardwareDeviceType_NPU
OrtHardwareDevice Opaque type that represents a physical hardware device.
OrtExecutionProviderDevicePolicy Enumerates the default EP selection policies available to users of ORT's automatic EP selection.
OrtEpDevice Opaque type that represents a pairing of an EP and hardware device that can run a model or model subgraph.
OrtNodeFusionOptions Struct that contains options for fusing nodes supported by an EP.
OrtNodeComputeContext Opaque type that contains a compiled/fused node's name and host memory allocation functions. ONNX Runtime provides an instance of OrtNodeComputeContext as an argument to OrtNodeComputeInfo::CreateState().
OrtNodeComputeInfo Struct that contains the computation function for a compiled OrtGraph instance. Initialized by an OrtEp instance.
OrtEpGraphSupportInfo Opaque type that contains information on the nodes supported by an EP. An instance of OrtEpGraphSupportInfo is passed to OrtEp::GetCapability() and the EP populates the OrtEpGraphSupportInfo instance with information on the nodes that it supports.
OrtEpDataLayout Enumerates the operator data layouts that could be preferred by an EP. By default, ONNX models use a "channel-first" layout (e.g., NCHW) but some EPs may prefer a "channel-last" layout (e.g., NHWC).
OrtMemoryDevice Opaque type that represents a combination of a physical device and memory type. A memory allocation and allocator are associated with a specific OrtMemoryDevice, and this information is used to determine when data transfer is required.
OrtDataTransferImpl Struct of functions that an EP implements to copy data between the devices that the EP uses and CPU.
OrtSyncNotificationImpl Struct of functions that an EP implements for Stream notifications.
OrtSyncStreamImpl Struct of functions that an EP implements if it needs to support Streams.
OrtEpFactory A plugin EP library provides ORT with one or more instances of OrtEpFactory. An OrtEpFactory implements functions that are used by ORT to query device support, create allocators, create data transfer objects, and create instances of an EP (i.e., an OrtEp instance).

An OrtEpFactory may support more than one hardware device (OrtHardwareDevice). If more than one hardware device is supported by the factory, an EP instance created by the factory is expected to internally partition any graph nodes assigned to the EP among its supported hardware devices.

Alternatively, if an EP library author needs ONNX Runtime to partition the graph nodes among different hardware devices supported by the EP library, then the EP library must provide multiple OrtEpFactory instances. Each OrtEpFactory instance must support one hardware device and must create an EP instance with a unique name (e.g., MyEP_CPU, MyEP_GPU, MyEP_NPU).

OrtEp An instance of an EP that can execute model nodes on one or more hardware devices (OrtHardwareDevice). An OrtEp implements functions that are used by ORT to query graph node support, compile supported nodes, query preferred data layout, set run options, etc. An OrtEpFactory creates an OrtEp instance via the OrtEpFactory::CreateEp() function.
OrtRunOptions Opaque object containing options passed to the OrtApi::Run() function, which runs a model.
OrtGraph Opaque type that represents a graph. Provided to OrtEp instances in calls to OrtEp::GetCapability() and OrtEp::Compile().
OrtValueInfo Opaque type that contains information for a value in a graph. A graph value can be a graph input, graph output, graph initializer, node input, or node output. An OrtValueInfo instance has the following information.
  • Type and shape (e.g., OrtTypeInfo)
  • OrtNode consumers
  • OrtNode producer
  • Information that classifies the value as a graph input, graph output, initializer, etc.
OrtExternalInitializerInfo Opaque type that contains information for an initializer stored in an external file. An OrtExternalInitializerInfo instance contains the file path, file offset, and byte size for the initializer. Can be obtained from an OrtValueInfo via the function ValueInfo_GetExternalInitializerInfo().
OrtTypeInfo Opaque type that contains the element type and shape information for ONNX tensors, sequences, maps, sparse tensors, etc.
OrtTensorTypeAndShapeInfo Opaque type that contains the element type and shape information for an ONNX tensor.
OrtNode Opaque type that represents a node in a graph.
OrtOpAttrType Enumerates attribute types.
OrtOpAttr Opaque type that represents an ONNX operator attribute.

The following table lists the API functions used for registration of a plugin EP library.

Function Description
RegisterExecutionProviderLibrary Register an EP library with ORT. The library must export the CreateEpFactories and ReleaseEpFactory functions.
UnregisterExecutionProviderLibrary Unregister an EP library from ORT. The caller MUST ensure that no OrtSession instances are using the EPs created by the library before calling this function.
GetEpDevices Get the list of available OrtEpDevice instances.

Each OrtEpDevice instance contains details of the execution provider and the device it will use.
SessionOptionsAppendExecutionProvider_V2 Append the execution provider that is responsible for the provided OrtEpDevice instances to the session options.
SessionOptionsSetEpSelectionPolicy Set the execution provider selection policy for the session.

Allows users to specify a device selection policy for automatic EP selection. If custom selection is required, use SessionOptionsSetEpSelectionPolicyDelegate instead.
SessionOptionsSetEpSelectionPolicyDelegate Set the execution provider selection policy delegate for the session.

Allows users to provide a custom device selection policy for automatic EP selection.