Skip to content

ORT 1.26.1 Cherry Picks#28510

Closed
adrastogi wants to merge 1 commit into
rel-1.26.1from
adrastogi/1.26.1-cherry-pick
Closed

ORT 1.26.1 Cherry Picks#28510
adrastogi wants to merge 1 commit into
rel-1.26.1from
adrastogi/1.26.1-cherry-pick

Conversation

@adrastogi
Copy link
Copy Markdown
Contributor

This cherry-picks the following commits for the release:

#28396 Fix loader-refcount leak in ProviderLibrary::Load on GetProvider miss

…#28396)

### Description

Fixes a loader-refcount leak in `ProviderLibrary::Load()` that affects
any caller of `OrtApi::RegisterExecutionProviderLibrary` whose target
DLL does not export `GetProvider` (i.e. plugin EPs that are not legacy
provider-bridge libraries).

In `onnxruntime/core/session/utils.cc`,
[`LoadPluginOrProviderBridge`](https://github.com/microsoft/onnxruntime/blob/main/onnxruntime/core/session/utils.cc#L852)
probes whether the EP DLL is a "provider bridge" by constructing a
temporary `ProviderLibrary` and calling `Load()`:

```cpp
auto provider_library = std::make_unique<ProviderLibrary>(...);
bool is_provider_bridge = provider_library->Load() == Status::OK();   // (+1) loader ref
...
auto ep_library_plugin = std::make_unique<EpLibraryPlugin>(...);
ORT_RETURN_IF_ERROR(ep_library_plugin->Load());                        // (+1) loader ref
```

`ProviderLibrary::Load()` calls `LoadDynamicLibrary` (handle_=1), then
`GetSymbolFromLibrary("GetProvider", ...)`. For a plugin EP DLL that
does not export `GetProvider`, the lookup fails and
`ORT_RETURN_IF_ERROR` returns the `Status` early — without calling
`Unload()`. The surrounding `try/catch` only catches `std::exception`,
so the early return is not covered. `~ProviderLibrary()` is empty
(intentionally, to avoid static-destruction-order issues with the global
`s_library_*` instances), so the temporary's `handle_` is silently
leaked.

`EpLibraryPlugin::Load()` then maps the same DLL again (refcount → 2).
Later, `UnregisterExecutionProviderLibrary` releases only the reference
owned by `EpLibraryPlugin`, leaving the EP DLL mapped (refcount = 1)
after the caller explicitly unregistered it. Static / thread-local
destructors that need the EP DLL outlive `Unregister` and run at process
exit in an inconsistent order — on Windows this often manifests as a
`STATUS_STACK_BUFFER_OVERRUN` (`0xC0000409`) fast-fail during teardown.

### Fix

In `ProviderLibrary::Load()`, when `GetSymbolFromLibrary("GetProvider",
...)` fails, call `Unload()` before returning the error:

```cpp
auto get_provider_status = Env::Default().GetSymbolFromLibrary(handle_, "GetProvider", (void**)&PGetProvider);
if (!get_provider_status.IsOK()) {
  Unload();
  return get_provider_status;
}
provider_ = PGetProvider();
```

This keeps the existing `~ProviderLibrary()` contract (callers are still
expected to call `Unload()` themselves on success paths), but plugs the
early-return-after-`LoadDynamicLibrary` leak.

### Verification

Standalone repro that does only `LoadLibraryExW("onnxruntime.dll")` →
`RegisterExecutionProviderLibrary("VitisAIExecutionProvider",
L"onnxruntime_vitisai_ep.dll")` → `UnregisterExecutionProviderLibrary` →
`ReleaseEnv`, then drains `FreeLibrary` on the EP DLL counting
iterations:

https://github.com/BoarQing/ep_double_load_repro

| build | output |
|---|---|
| `main` (unpatched) | `ep dll leaked refcount after
Unregister+ReleaseEnv = 1` |
| `main` + this PR | `ep dll leaked refcount after Unregister+ReleaseEnv
= 0` |

Tracking issue: #28395

Co-authored-by: BoarQing <boarqing@users.noreply.github.com>
@adrastogi adrastogi closed this May 14, 2026
@adrastogi
Copy link
Copy Markdown
Contributor Author

We won't be proceeding with this one-off, scrapping the PR.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants