
Commit 836e79c

Copybara Bot and shelkesagar29 authored and committed
Integrate internal changes
This PR moves the following internal changes to OSS.

commit 2986cac97888b5a9c1cd9064e4728cb38ca9dc45
Author: Sagar Shelke <[email protected]>

[executor] Add complex type support to `ScalarValue`

Previously, `ScalarValue`, which represents a scalar runtime value, did not support complex types. This MR adds support for complex types by making the storage a union of real and complex data instead of just real. MLIR tests are added via constant subgraph execution.

commit cf83a0d318b8035695d0b9fd24d578733632e253
Author: Christopher Bate <[email protected]>

[compiler] Enable more `stablehlo.dot_general` to TensorRT using `tensorrt.einsum`

Previously, we relied on canonicalization of `stablehlo.dot_general` to put all such contraction operations into a form that could be converted to `tensorrt.matrix_multiply`. Based on recent experiments, this can actually produce very inefficient TensorRT programs due to the number of reshapes and transpositions that must be inserted to coerce a general `stablehlo.dot_general` into batched matrix multiplications. This change enables conversion of `stablehlo.dot_general` to `tensorrt.einsum`, and the pass and patterns now contain configurable parameters to control whether `tensorrt.einsum` is used as the primary method or only as a fallback when conversion to `tensorrt.matrix_multiply` is not possible. A follow-on change will revamp the Stablehlo preprocessing performed on `stablehlo.dot_general` to avoid creating inefficient patterns and enable wider use of this pattern.

commit 528651ed1cd36c36376180c1c2232526ce972fef
Author: Christopher Bate <[email protected]>

[compiler] Fix stablehlo-to-scf scalarization heuristics

Fixes an issue where float tensors in the 'before' region of converted while loops were scalarized. The transform should only scalarize operands that are likely to be for-style induction variables.
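The storage change in the `ScalarValue` commit can be sketched as follows (a minimal hypothetical sketch using `std::variant` as the union; the names and element types here are assumptions, not the actual Executor implementation):

```cpp
#include <cassert>
#include <complex>
#include <variant>

// Hypothetical sketch: a scalar runtime value whose storage is a union of
// real and complex data, instead of real data only.
struct ScalarValue {
  std::variant<double, std::complex<double>> storage;

  bool isComplex() const {
    return std::holds_alternative<std::complex<double>>(storage);
  }
  double getReal() const { return std::get<double>(storage); }
  std::complex<double> getComplex() const {
    return std::get<std::complex<double>>(storage);
  }
};
```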
commit 1d52e0a9e30dc104178c4761c1a24153abc7ea90
Author: Christopher Bate <[email protected]>

[compiler] NFC: Drop dead code from StablehloToExecutableTask

commit f1c8d8c7cd860aedfe339d76ef7fb953baf9bd55
Author: Chris Bate <[email protected]>

[compiler] Add `plan-promote-host-tensors-to-host-pinned` pass

Adds a simple pass to promote "host" tensors to "host-pinned" tensors in common cases where we know a tensor will be transferred between host and device spaces. This pass runs after `plan-optimize-memory-spaces`, since the former is sensitive to mismatching host spaces for patterns related to moving transfers out of loops.

commit c27d56ea7a9661395e17fa895c610a79a92fa0c2
Author: Sagar Shelke <[email protected]>

[executor] Handle elided dense resource elements attr during translation

Translation to an executable (which is a flatbuffer) uses MLIR attribute serialization to serialize `ElementsAttr`. However, this does not work when the attribute is an elided dense resource and results in a segfault. This MR handles the situation by replacing the elided resource with a `DenseElementsAttr` of all ones (`true` in the case of booleans). IR with elided resources is usually seen only during testing of passes and is not useful for e2e functional execution. Testing of the `ExecuteConstantFoldableSubgraphs` pass is one such case, so MLIR test cases for this pass are added.

commit 920a84e648833764563d3dc1de544a8f1b9f027e
Author: Chris Bate <[email protected]>

[tensorrt] Fix TRT layer name generation function

The TRT layer naming had some faulty logic that could cause the layer name to grow very large in the process of creating a unique name. Fix the issue and use a static counter to reduce time spent in the loop.

commit ff0c5fa4bf5321ad0ce18579598c49f4b552fb37
Author: Christopher Bate <[email protected]>

Further fixes to LIT configs

Previously, we were setting `lit_config.parallelism_group` instead of `config.parallelism_group`. Apparently, the former does nothing; only `config.parallelism_group` has any effect.
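The static-counter fix for unique TRT layer names can be sketched like this (hypothetical names; the real implementation lives in the TensorRT translation code, and the key point is that the base name stays fixed instead of growing on each collision):

```cpp
#include <cassert>
#include <string>
#include <unordered_set>

// Hypothetical sketch: generate a unique layer name. The faulty version
// appended a suffix to the previous *candidate* on every collision, so the
// name could grow very large; here the base stays fixed and only a static
// counter advances, so each candidate is "base_<N>".
inline std::string makeUniqueLayerName(const std::string &base,
                                       std::unordered_set<std::string> &used) {
  static unsigned counter = 0;
  std::string name = base;
  while (!used.insert(name).second)
    name = base + "_" + std::to_string(counter++);
  return name;
}
```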
commit d65c220b712c262992dbdf5a87fa3220a06bfb21
Author: Chris Bate <[email protected]>

Update LIT test parallelism configs

In more recent versions of TensorRT (10.11+, at least), the builder takes a much larger amount of host memory. This can cause OOM when running the LIT test suites under their existing configurations. This change updates all LIT configs:

- Make sure to use `%pick-one-gpu` in the LIT command line to ensure we stall if there are not enough GPU or host resources available. Add a hard limit that there must be at least 5GB of host memory available.
- Update configurations to reduce the amount of estimated parallelism by increasing host memory requirements and reducing the amount of host memory to 50% for the purposes of the parallelism calculation.
- Force all tests to use a common parallelism group unless otherwise specified in the test config.

commit 1f996f607640d81bf7137a4ed874b20c2a16cca2
Author: Christopher Bate <[email protected]>

[compiler] Fix failure case in stablehlo-to-scf

Fixes a failure case due to one of the recently introduced rewrites in `stablehlo-to-scf`.

commit 2779b632465fc3e840f5ce987f6233e824fe2ed3
Author: Christopher Bate <[email protected]>

[compiler] Further improvements to plan bufferization pipeline

- Split `plan-assign-memory-spaces` into three passes:
  - `plan-assign-memory-spaces`
  - `plan-optimize-memory-spaces`
  - `plan-materialize-explicit-transfers`
- The last one is the only new code: `plan-materialize-explicit-transfers` converts `tensor.cast` ops that change the memory space encoding into explicit `bufferization.alloc_tensor` + `bufferization.materialize_in_destination` operations.
- Improve handling of `bufferization.alloc_tensor` and optimization of `scf.for` iteration args in `plan-assign-memory-spaces`.
- Improve handling of `tensor.reshape` in `plan-assign-memory-spaces`.
- Fix handling of `tensor.reshape` when rewriting functions to be in DPS style in `plan-alloc-tensors`.
This change also updates the LLVM dependencies in order to cherry-pick a fix to the `tensor.reshape` bufferization interface that I merged upstream (llvm/llvm-project#128590). In addition, fix APInt assertions in `plan-execute-constant-foldable-subgraphs`.

commit 312170d8cbcd4c1fcf9cefdd628583e3dbdcc4f5
Author: Chris Bate <[email protected]>

[compiler] Enable While-to-For conversion in Stablehlo-to-Scf pass

Stablehlo only has one type of loop construct, `stablehlo.while`. The `while` loop can represent "for"-style loops as well, but if we only have `scf.while` loops after conversion to SCF, then we miss out on a lot of potential optimizations that are rooted in `scf.for`. Experiments show that complicated JAX programs like the Physical Intelligence Pi0 model can benefit from converting `scf.while` to `scf.for` where possible. This improves opportunities for constant folding and makes it much easier for analyses to gauge the benefit of transforms like unrolling.

This change adds patterns to the Stablehlo-to-Scf pass to enable While-to-For conversion after the Stablehlo-to-Scf conversion. This transformation is combined with the Stablehlo-to-Scf conversion because the While-to-For patterns require first scalarizing block arguments of the While operation. The heuristics for which block arguments should be scalarized are implemented as control callbacks for the scalarization patterns. These callbacks need Stablehlo-specific logic, so it makes sense to test the combined conversion as a single pass. From the pass user's perspective, it gives the appearance of going directly from `stablehlo.while` to `scf.for`. The test cases are updated to cover the new patterns.

commit 425d19e749104354b5ea9e76e7509d029f9eac59
Author: Chris Bate <[email protected]>

[compiler] Fix assign-memory-spaces pass to respect function-level constraints

Fixes an issue where the `plan.memory_space` attribute on a function was not being respected when converting function signatures.
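The benefit of the While-to-For uplift can be illustrated with a plain C++ analogy (conceptual only; the actual transformation operates on `scf.while`/`scf.for` in MLIR):

```cpp
#include <cassert>

// "scf.while"-style loop: the induction variable, its bound, and its step
// are buried in the loop body and condition.
int sumWhile(int n) {
  int i = 0, acc = 0;
  while (i < n) {
    acc += i;
    i += 1;
  }
  return acc;
}

// Uplifted "scf.for"-style loop: bounds and step are structural, which
// makes trip-count analysis, constant folding, and unrolling estimates
// straightforward.
int sumFor(int n) {
  int acc = 0;
  for (int i = 0; i < n; ++i)
    acc += i;
  return acc;
}
```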
MR: initialdl/mlir-tensorrt!2146

commit b612d5a22e7e3c4f08bf80fd504df5193b370bd3
Author: Chris Bate <[email protected]>

[compiler] Update scf.while detensorization to increase flexibility

In order to incorporate the upstream "uplift scf.while to scf.for" transformation as part of the `stablehlo-to-scf` conversion, we need to detensorize the operands of `scf.while` that are likely to correspond to the loop induction variable. This change refactors our existing `scf.while` detensorization transformation to give more flexibility and control. The TensorKindAnalysis is no longer required in order to use the pattern(s). Detensorization of the `after` and `before` arguments of `scf.while` is now controlled separately.

commit 3e21bf465b90e1eaaad872da40c305b70253cce0
Author: Chris Bate <[email protected]>

[compiler] Improve handling of memory space constraints in the Plan dialect

This commit improves the handling of memory space constraints in the Plan dialect. Constraints are now specified using a common attribute, `plan.memory_space`, that can be applied to functions or to individual arguments/results. In addition, patterns in `plan-alloc-tensors` and `plan-assign-memory-spaces` are updated to avoid introducing unnecessary transfers between memory spaces.

commit 36a3b4a77242685e473817cb692a4010f690c0b3
Author: Chris Bate <[email protected]>

[compiler] Add plan-buffer-results-to-out-params pass

This change adds a new Plan dialect pass, `plan-buffer-results-to-out-params`. This pass is based on the upstream Bufferization pass `buffer-results-to-out-params`, but it can handle a wider range of cases (such as promoting dynamic allocations) and uses alias analysis utilities to guard against failure cases that the upstream pass currently cannot handle. These improvements should eventually be upstreamed back to the Bufferization dialect.
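In plain C++ terms, the buffer-results-to-out-params rewrite does the following (a conceptual analogy only, not the MLIR implementation):

```cpp
#include <vector>

// Before: the callee allocates the result buffer and returns it.
std::vector<int> produceBefore() { return {1, 2, 3}; }

// After "buffer results to out-params": the caller owns the output buffer
// and passes it in; the callee writes into it instead of allocating.
void produceAfter(std::vector<int> &out) { out = {1, 2, 3}; }
```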
commit 9e7127ca1e61be72b032a54d270a3da0d75639b2
Author: Chris Bate <[email protected]>

[compiler] Update func conversion in host-to-emitc

In the EmitC conversion/translation process, you can use `func.func` or `emitc.func` to define functions. Previously, we converted all `func.func` to `emitc.func`. However, `emitc.func` does not have a path for supporting multiple return values. Therefore, prefer the use of type conversions on `func.func` instead of converting the entire op to `emitc.func`. Add tests to verify that we can support multiple return values.

commit 934db1f78ef3e7bedb67f1252b41ded7419010f8
Author: Chris Bate <[email protected]>

[compiler] Fix two host-to-emitc bugs

This change fixes two bugs exposed by new `host-to-emitc` conversion testing:

- The `!emitc.size_t` type does not have DataLayout information specified upstream. Therefore, to ensure that the type can be queried using DataLayout, we add a DataLayoutTypeInterface external model to the type. All queries are simply mapped to queries on the `index` type.
- The upstream `func.call` conversion has a bug where it does not correctly convert the result types of the call operation, which can lead to a type mismatch for any type that does not have an identity conversion.

Additional tests are added to `host-to-emitc`. Eventually, the fixes for both these issues should be moved upstream.

commit 9d27f08ee4429f4ffbb72023babc193c7724a700
Author: Chris Bate <[email protected]>

[common] Add Linalg-to-loops (on tensors) implementation and conversion pass

Adds a ToLoopsOpInterface implementation for Linalg operations. In addition, a conversion pass is added that converts ToLoopsOpInterface operations to loops.

commit 3a419f120808eafc31f45516977ed6169b809ab9
Author: Chris Bate <[email protected]>

NFC: Move ToLoopsOpInterface to 'mlir-tensorrt-common'

Moves the ToLoopsOpInterface to the 'mlir-tensorrt-common' project.
This is in preparation for enabling the ToLoopsOpInterface on LinalgOp (lowering while still using tensor types) to replace the `convert-stablehlo-arith-to-scalar` pipeline.

MR: initialdl/mlir-tensorrt!2137

commit 442bea12b763dd36fce864695f63896912438d87
Author: Christopher Bate <[email protected]>

NFC: Fix formatting across several files

commit b2a65bc3e806aaa95d932af512cfa4750a9cbe4e
Author: Chris Bate <[email protected]>

[executor] Introduce RuntimeSession "features" to control loading of runtime modules

Previously, the RuntimeSession would always load all available runtime modules. This causes some inefficiencies. For example, certain integration tests for the Executor runtime don't use CUDA at all; however, because CUDA was still initialized by default, a GPU had to be present just to run the integration test. Furthermore, some experimental modules (e.g. the Lua cublas module) are not ready for "production" use and are only invoked inside special integration tests.

This change introduces a notion of "features" to the RuntimeSession and RuntimeSessionOptions. A feature is just a string that identifies a particular runtime component. The particular semantics of a "feature" depend on the actual runtime implementation. For example, for the LuaRuntimeSession, the feature names correspond to the available Lua "modules" (a module is just a group of C++ Lua extension functions), e.g. "core", "cuda", "tensorrt", etc. The RuntimeSessionOptions gains methods for enabling/disabling features. Certain features cause others to be added to the set automatically; e.g. "tensorrt" and "nccl" both require "cuda" to be added. The API is piped through all the way to the Python bindings to allow control of loaded modules at all levels. To preserve existing behavior, RuntimeSessions created from Python will load all available modules by default, but the `executor-runner|mlir-tensorrt-runner` tools now require features to be explicitly specified.
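The "features" mechanism described above can be sketched as follows (method and class names are assumptions for illustration; only the string-based features and the dependency implication, "tensorrt" and "nccl" pulling in "cuda", are taken from the commit message):

```cpp
#include <cassert>
#include <map>
#include <set>
#include <string>
#include <vector>

// Hypothetical sketch of RuntimeSessionOptions "features": a feature is
// just a string naming a runtime module, and enabling some features
// implies enabling others.
class RuntimeSessionOptions {
public:
  void enableFeature(const std::string &name) {
    static const std::map<std::string, std::vector<std::string>> implied = {
        {"tensorrt", {"cuda"}}, {"nccl", {"cuda"}}};
    if (!features.insert(name).second)
      return; // already enabled
    auto it = implied.find(name);
    if (it != implied.end())
      for (const auto &dep : it->second)
        enableFeature(dep); // transitively enable required features
  }
  bool isEnabled(const std::string &name) const {
    return features.count(name) != 0;
  }

private:
  std::set<std::string> features;
};
```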
commit b90f8f345b2941e958f3a1cc5bcac21daebe783b
Author: Christopher Bate <[email protected]>

NFC: Fix include guard for 'mlir-executor/Support/Status.h'

commit cdbe1f560483047291a30115a043a60bdce34d99
Author: Sagar Shelke <[email protected]>

[compiler/lib] Add stablehlo composite-to-call pass to pre-processing pipeline

This MR adds `StablehloLegalizeCompositeToCallPass` to the pre-processing pipeline. An MLIR test is added.

commit 6ea3ab77aa2909cee11d08aa24543f247e8a24bf
Author: Chris Bate <[email protected]>

[compiler] Add "default memory space" to ClusterKindAttrInterface

Adds a new method to the ClusterKindAttrInterface so that backends can control the default tensor encoding (#plan.memory_space<..>) assigned by the `plan.assign-memory-spaces` pass at function scope. In addition, we also allow an attribute to override the default space at function arguments/results. This override mechanism was previously lacking and will help resolve a long-standing issue where users could not reliably control the memory space of arguments/results.

commit 0ea59238f5c280ab3ffbc340bb9aee7ed7bfbebb
Author: Christopher Bate <[email protected]>

[compiler] Fix some issues related to pipeline extension mechanism

The StablehloToExecutableTensorRTExtension had both a 'disable' and an inherited 'disabled' member variable. Delete the inherited one, since it should not have been introduced and was not bound to any option. Further, remove the unused 'extensions' vector from CompilationTaskOptionsBase.

commit 372476d77fcaa399460965ab7bfc052f0e44c99f
Author: Christopher Bate <[email protected]>

[executor] Fix ptrtoint and inttoptr op translation to Lua

Previously, we could generate conflicting function types (due to pointer address space) when converting `executor.ptrtoint` and `executor.inttoptr` ops to opaque calls. Instead, defer the conversion to a function call until the actual Lua translation point.
At that point, we can generate a function name without having to consider the pointer address space.

commit 75d18534fa67b452dd2253d6981bda6954bf1056
Author: Chris Bate <[email protected]>

Introduce 'MLIRTensorRTCommon' sub-project

Certain targets need to be used across multiple sub-projects. For example, the 'TensorRTDynamicLoader' target is used in all sub-projects. In addition, the sub-projects need to be independently buildable. This change introduces another sub-project under the 'common' directory where shared code can be placed. This allows us to use `find_package` to declare the dependency, and downstream consumers can meet the requirement using any number of techniques to fulfill the `find_package` call.

commit d7d8104087cf272bdd08f6330f27734754f0d71d
Author: Chris Bate <[email protected]>

[compiler] Harden `stablehlo.constant` to `arith.constant` conversion

There is a utility pass that runs in the stablehlo-to-executable pipeline that converts `stablehlo.constant` to `arith.constant`. This pass can temporarily create invalid IR because `arith.constant` does not support signful integer types. If the "verify-each" option is off, the issue will not be caught, since it happens to be self-correcting. However, the issue can still cause verification failures while debugging. This change fixes the issue by adding a `builtin.unrealized_conversion_cast` operation to bridge the type change between signless and signful integer types.

commit a500de82a7bd70d6bfe32234719b4daa7cf32a8a
Author: Chris Bate <[email protected]>

Integrate LLVM at f137c3d592e96330e450a8fd63ef7e8877fc1908

commit cd56aa6a511e2091fcd86106f20d27ff3673db75
Author: Christopher Bate <[email protected]>

Fix build with BUILD_SHARED_LIBS=ON

The new InferTensorValueRangeInterface was used without correctly specifying the library dependency in the PlanIR and StablehloExtIR libraries.
commit cf1aff0ad0997947ab87485cfeec4595cb0285d7
Author: Sagar Shelke <[email protected]>

[compiler] Maintain output order in TensorRT engine

For TensorRT engine conversion, the first step in lowering a cluster containing TensorRT ops is to create an inline group op. The operands of the yield op (i.e. the terminator) of the inline group op are values from the cluster that are used outside the cluster. These values are collected by getting the uses of each op (with `op->getUses()`) and checking whether they are outside the cluster. However, this use order is not deterministic, and sometimes it is desirable to get yield results in a certain order. This MR makes the following changes:

1. Add a function callback option named `ReorderRegionOpYieldValues` to the `mlir::createRegionOpFromCluster` method. This callback has the signature `std::function<void(SetVector<Value> &yieldValues, SmallVectorImpl<Type> &yieldTypes)>`, which takes the cluster values used outside the cluster (in a SetVector) and their types. By default, this is set to nullptr.
2. The TensorRTToExecutable task is used in cases where a single `func.func` represents a single TensorRT engine. In this case, the `ReorderRegionOpYieldValues` callback is implemented to ensure that the inline group op's yield value order is the same as the `func.func` return value order.

A valid MLIR test is added.

GitOrigin-RevId: 630a69d8e14506db43cfefe4be2c790f9352da4f
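The effect of the `ReorderRegionOpYieldValues` callback can be mimicked with standard containers (a sketch only; the real callback operates on `SetVector<Value>` and `SmallVectorImpl<Type>`, with strings standing in for values here):

```cpp
#include <algorithm>
#include <cassert>
#include <string>
#include <vector>

// Hypothetical sketch of the reordering callback's job: the yield values
// are collected in nondeterministic use order, and the callback permutes
// them to match the order of the function's return values.
void reorderYieldValues(std::vector<std::string> &yieldValues,
                        const std::vector<std::string> &funcReturnOrder) {
  auto pos = [&](const std::string &v) {
    return std::find(funcReturnOrder.begin(), funcReturnOrder.end(), v) -
           funcReturnOrder.begin();
  };
  std::stable_sort(yieldValues.begin(), yieldValues.end(),
                   [&](const std::string &a, const std::string &b) {
                     return pos(a) < pos(b);
                   });
}
```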
1 parent 12995a1 commit 836e79c

175 files changed: +5898 −1976 lines changed


mlir-tensorrt/CMakeLists.txt

Lines changed: 3 additions & 0 deletions
@@ -180,6 +180,7 @@ endif()
 
 if(MLIR_TRT_ENABLE_TORCH)
   find_package(torch_mlir REQUIRED)
+  include_directories(${torch_mlir_SOURCE_DIR}/include)
 endif()
 
 if(MLIR_TRT_TARGET_TENSORRT)
@@ -190,6 +191,8 @@ if(MLIR_TRT_ENABLE_PYTHON)
   mlir_tensorrt_find_dlpack()
 endif()
 
+find_package(MLIRTensorRTCommon REQUIRED)
+
 #--------------------------------------------------
 # Diagnostics
 #--------------------------------------------------

mlir-tensorrt/DependencyProvider.cmake

Lines changed: 44 additions & 10 deletions
@@ -17,7 +17,7 @@ if("${MLIR_TRT_USE_LLVM}" STREQUAL "prebuilt")
   set(MTRT_BUILD_LLVM_FROM_SOURCE OFF)
 endif()
 
-set(MLIR_TRT_LLVM_COMMIT "729416e586fba71b4f63d71b1b5c765aefbf200b")
+set(MLIR_TRT_LLVM_COMMIT "f137c3d592e96330e450a8fd63ef7e8877fc1908")
 
 set(mlir_patch_dir "${CMAKE_CURRENT_LIST_DIR}/build_tools/patches/mlir")
 
@@ -43,7 +43,6 @@ else()
     "${mlir_patch_dir}/0005-mlir-memref-Fix-memref.global-overly-constrained-ver.patch"
     "${mlir_patch_dir}/0006-mlir-emitc-Fix-two-EmitC-bugs.patch"
     "${mlir_patch_dir}/0009-mlir-Support-FileLineColRange-in-LLVM-debug-translat.patch"
-    "${mlir_patch_dir}/0010-MLIR-Fix-LLVMIRTransforms-build-failure-125485.patch"
   # Set the CPM cache key to the Git hash for easy navigation.
   PRE_ADD_HOOK [[
     list(APPEND _vap_UNPARSED_ARGUMENTS
@@ -102,14 +101,12 @@ set(stablehlo_patch_dir "${CMAKE_SOURCE_DIR}/build_tools/patches/stablehlo")
 nv_register_package(
   NAME Stablehlo
   VERSION 1.9.3
-  GIT_TAG 459897561d365ef97caba46984847f9184d472ec
+  GIT_TAG 4bf77d23bd9150782a70d85fda9c12a2dec5328c
   GIT_REPOSITORY "https://github.com/openxla/stablehlo.git"
   PATCHES
     "${stablehlo_patch_dir}/0001-Fix-a-couple-missing-checks-for-static-shapes-in-sta.patch"
     "${stablehlo_patch_dir}/0002-cmake-Update-usage-of-HandleLLVMOptions-and-LLVM_DEF.patch"
-    "${stablehlo_patch_dir}/0003-Don-t-insert-unnecessary-arith.index_cast-ops.patch"
     "${stablehlo_patch_dir}/0004-Fix-ZeroExtent-condition-in-simplification-pattern.patch"
-    "${stablehlo_patch_dir}/0005-Fix-crash-on-ComplexType-in-PointwiseToLinalgMapConv.patch"
     "${stablehlo_patch_dir}/0006-Remove-explicit-use-of-LLVMSupport.patch"
     "${stablehlo_patch_dir}/0007-Fix-circular-dependence-between-StablehloPasses-and-.patch"
   OPTIONS
@@ -123,6 +120,42 @@ nv_register_package(
   ]]
 )
 
+#-------------------------------------------------------------------------------------
+# MLIRTensorRTCommon
+#
+# MLIRTensorRTCommon is a sub-project that contains components used across the
+# other sub-projects like MLIRExecutor and MLIRTensorRTDialect.
+#-------------------------------------------------------------------------------------
+
+nv_register_package(
+  NAME MLIRTensorRTCommon
+  SOURCE_DIR "${CMAKE_SOURCE_DIR}/common"
+)
+
+# -----------------------------------------------------------------------------
+# NVTX
+# -----------------------------------------------------------------------------
+
+nv_register_package(
+  NAME NVTX
+  GIT_REPOSITORY https://github.com/NVIDIA/NVTX.git
+  GIT_TAG v3.1.0
+  GIT_SHALLOW TRUE
+  SOURCE_SUBDIR c
+  EXCLUDE_FROM_ALL TRUE
+  DOWNLOAD_ONLY TRUE
+  POST_ADD_HOOK [[
+    if(NOT TARGET nvtx3-cpp)
+      add_library(nvtx3-cpp INTERFACE IMPORTED)
+      target_include_directories(nvtx3-cpp INTERFACE
+        "$<BUILD_INTERFACE:${NVTX_SOURCE_DIR}/c/include>")
+      # Ignore some warnings due to NVTX3 code style.
+      target_compile_options(nvtx3-cpp INTERFACE
+        -Wno-missing-braces)
+    endif()
+  ]]
+)
+
 #-------------------------------------------------------------------------------------
 # MLIR-Executor
 #
@@ -164,10 +197,10 @@ nv_register_package(
   NAME torch_mlir
   GIT_REPOSITORY https://github.com/llvm/torch-mlir.git
   GIT_TAG 0bb263e99415d43255350d29263097b4980303bf
-  PATCHES
-    "build_tools/patches/torch_mlir/0001-cmake-Allow-finding-Stablehlo-via-find_package.patch"
-    "build_tools/patches/torch_mlir/0002-Make-compatible-with-more-recent-Stablehlo-version.patch"
-    "build_tools/patches/torch_mlir/0003-Fix-some-configuration-paths-in-LIT-cfg.patch"
+  PATCHES
+    "${torch_mlir_patch_dir}/0001-cmake-Allow-finding-Stablehlo-via-find_package.patch"
+    "${torch_mlir_patch_dir}/0002-Make-compatible-with-more-recent-Stablehlo-version.patch"
+    "${torch_mlir_patch_dir}/0003-Fix-some-configuration-paths-in-LIT-cfg.patch"
   EXCLUDE_FROM_ALL TRUE
   # We need to specify an existing directory that is not actually a submodule
   # since GIT_SUBMODULES does not except the empty string due to
@@ -202,7 +235,7 @@ macro(mtrt_provide_dependency method dep_name)
   endif()
 
   if("${dep_name}" MATCHES
-      "^(MLIRExecutor|MLIRTensorRTDialect|Stablehlo|torch_mlir)$")
+      "^(MLIRExecutor|MLIRTensorRTDialect|Stablehlo|torch_mlir|NVTX|MLIRTensorRTCommon)$")
     nv_add_package("${dep_name}")
     set("${dep_name}_FOUND" TRUE)
   endif()
@@ -230,6 +263,7 @@ macro(mtrt_provide_dependency method dep_name)
     find_package(LLVM ${ARGN} BYPASS_PROVIDER)
   endif()
 endif()
+
 endmacro()
 
 cmake_language(
mlir-tensorrt/build_tools/cmake/Dependencies.cmake

Lines changed: 11 additions & 35 deletions
@@ -6,10 +6,16 @@ include(${CMAKE_CURRENT_LIST_DIR}/TensorRTDownloadURL.cmake)
 # expected version.
 #-------------------------------------------------------------------------------------
 macro(get_tensorrt_version nvinfer_version_file out_var)
-  file(STRINGS "${nvinfer_version_file}" VERSION_STRINGS REGEX "#define NV_TENSORRT_.*")
+  file(STRINGS "${nvinfer_version_file}" VERSION_STRINGS REGEX "#define (TRT_.+|NV_TENSORRT_.+) [0-9]+")
   foreach(TYPE MAJOR MINOR PATCH BUILD)
-    string(REGEX MATCH "NV_TENSORRT_${TYPE} [0-9]+" TRT_TYPE_STRING ${VERSION_STRINGS})
-    string(REGEX MATCH "[0-9]+" TRT_${TYPE} ${TRT_TYPE_STRING})
+    string(REGEX MATCH "(TRT_${TYPE}_ENTERPRISE|NV_TENSORRT_${TYPE}) [0-9]+" TRT_TYPE_STRING ${VERSION_STRINGS})
+    if("${TRT_TYPE_STRING}" STREQUAL "")
+      message(FATAL_ERROR "Failed to extract TensorRT ${TYPE} version from ${nvinfer_version_file}")
+    endif()
+    string(REGEX MATCH "[0-9]+" "TRT_${TYPE}" "${TRT_TYPE_STRING}")
+    if("TRT_${TYPE}" STREQUAL "")
+      message(FATAL_ERROR "Failed to extract TensorRT ${TYPE} version from ${nvinfer_version_file}")
+    endif()
   endforeach(TYPE)
   set("${out_var}" "${TRT_MAJOR}.${TRT_MINOR}.${TRT_PATCH}.${TRT_BUILD}")
 endmacro()
@@ -50,7 +56,7 @@ macro(configure_tensorrt_python_plugin_header)
   if(ARG_INSTALL_DIR)
     find_file(
       trt_python_plugin_header
-      NAMES plugin.h
+      NAMES NvInferPythonPlugin.h plugin.h
       HINTS ${ARG_INSTALL_DIR} ${ARG_INSTALL_DIR}/python/include/impl
       PATHS ${ARG_INSTALL_DIR} ${ARG_INSTALL_DIR}/python/include/impl
       REQUIRED
@@ -60,7 +66,7 @@ macro(configure_tensorrt_python_plugin_header)
   else()
     find_path(
       trt_python_plugin_header
-      NAMES plugin.h
+      NAMES NvInferPythonPlugin.h plugin.h
       REQUIRED
       NO_CACHE
     )
@@ -173,36 +179,6 @@ function(find_tensorrt)
   )
 endfunction()
 
-macro(configure_tensorrt_python_plugin_header)
-  if(ARG_INSTALL_DIR)
-    find_file(
-      trt_python_plugin_header
-      NAMES plugin.h
-      HINTS ${ARG_INSTALL_DIR} ${ARG_INSTALL_DIR}/python/include/impl
-      PATHS ${ARG_INSTALL_DIR} ${ARG_INSTALL_DIR}/python/include/impl
-      REQUIRED
-      NO_CMAKE_PATH NO_DEFAULT_PATH
-      NO_CACHE
-    )
-  else()
-    find_path(
-      trt_python_plugin_header
-      NAMES plugin.h
-      REQUIRED
-      NO_CACHE
-    )
-  endif()
-  file(MAKE_DIRECTORY "${CMAKE_BINARY_DIR}/include/nvinfer")
-  file(COPY_FILE "${trt_python_plugin_header}"
-    "${CMAKE_BINARY_DIR}/include/nvinfer/trt_plugin_python.h"
-    ONLY_IF_DIFFERENT
-    RESULT copy_result
-  )
-  if(copy_result)
-    message(FATAL_ERROR "failed to copy TensorRT QDP plugin header: ${copy_result}")
-  endif()
-endmacro()
-
 #-------------------------------------------------------------------------------------
 # Download and add DLPack to the build (header only)
 #-------------------------------------------------------------------------------------
mlir-tensorrt/build_tools/patches/mlir/0005-mlir-memref-Fix-memref.global-overly-constrained-ver.patch

Lines changed: 8 additions & 9 deletions

@@ -1,18 +1,17 @@
-From f014186374bb3e71d44648781dc03aaefd29f0d5 Mon Sep 17 00:00:00 2001
-From: Christopher Bate <[email protected]>
-Date: Fri, 10 May 2024 22:39:44 -0600
-Subject: [PATCH 05/10] [mlir][memref] Fix memref.global overly constrained
- verifier check
+From 07f534dac7f915a496265c14745c0bc643185efe Mon Sep 17 00:00:00 2001
+From: Sagar Shelke <[email protected]>
+Date: Tue, 1 Jul 2025 00:17:04 +0000
+Subject: [PATCH] Apply patch 0005
 
 ---
  mlir/lib/Dialect/MemRef/IR/MemRefOps.cpp | 10 ++++++----
  1 file changed, 6 insertions(+), 4 deletions(-)
 
 diff --git a/mlir/lib/Dialect/MemRef/IR/MemRefOps.cpp b/mlir/lib/Dialect/MemRef/IR/MemRefOps.cpp
-index 4f75b7618d63..f12f41437759 100644
+index 11597505e788..66ce4b3638b0 100644
 --- a/mlir/lib/Dialect/MemRef/IR/MemRefOps.cpp
 +++ b/mlir/lib/Dialect/MemRef/IR/MemRefOps.cpp
-@@ -1117,7 +1117,7 @@ struct DimOfMemRefReshape : public OpRewritePattern<DimOp> {
+@@ -1124,7 +1124,7 @@ struct DimOfMemRefReshape : public OpRewritePattern<DimOp> {
   }
   } // else dim.getIndex is a block argument to reshape->getBlock and
   // dominates reshape
@@ -21,7 +20,7 @@ index 4f75b7618d63..f12f41437759 100644
   else if (dim->getBlock() != reshape->getBlock() &&
            !dim.getIndex().getParentRegion()->isProperAncestor(
                reshape->getParentRegion())) {
-@@ -1607,9 +1607,11 @@ LogicalResult GlobalOp::verify() {
+@@ -1614,9 +1614,11 @@ LogicalResult GlobalOp::verify() {
   // Check that the type of the initial value is compatible with the type of
   // the global variable.
   if (auto elementsAttr = llvm::dyn_cast<ElementsAttr>(initValue)) {
@@ -37,5 +36,5 @@ index 4f75b7618d63..f12f41437759 100644
        << tensorType << ", but was of type " << initType;
   }
 --
-2.46.0
+2.48.1
 

mlir-tensorrt/build_tools/patches/mlir/0006-mlir-emitc-Fix-two-EmitC-bugs.patch

Lines changed: 8 additions & 8 deletions
@@ -1,7 +1,7 @@
-From 47c84211f72fb407d72e2c8f87019802cda30432 Mon Sep 17 00:00:00 2001
-From: Christopher Bate <cbate@nvidia.com>
-Date: Mon, 27 Jan 2025 08:28:33 +0000
-Subject: [PATCH 06/10] [mlir][emitc] Fix two EmitC bugs
+From d81aaed8cb0190807dbe378e469fde53101f32eb Mon Sep 17 00:00:00 2001
+From: Sagar Shelke <shelkesagar29@yahoo.com>
+Date: Tue, 1 Jul 2025 00:19:03 +0000
+Subject: [PATCH] Apply patch 0006
 
 ---
  .../mlir/Conversion/FuncToEmitC/FuncToEmitC.h | 4 +-
@@ -112,18 +112,18 @@ index 0b97f2641ad0..d2f368a7148d 100644
   if (failed(
           applyPartialConversion(getOperation(), target, std::move(patterns))))
 diff --git a/mlir/lib/Target/Cpp/TranslateToCpp.cpp b/mlir/lib/Target/Cpp/TranslateToCpp.cpp
-index 01de0e41f203..4f600f92ba6d 100644
+index b00820ffc542..803c58cc35c6 100644
 --- a/mlir/lib/Target/Cpp/TranslateToCpp.cpp
 +++ b/mlir/lib/Target/Cpp/TranslateToCpp.cpp
-@@ -273,6 +273,7 @@ private:
+@@ -282,6 +282,7 @@ private:
   ExpressionOp emittedExpression;
   SmallVector<int> emittedExpressionPrecedence;
 
 +public:
   void pushExpressionPrecedence(int precedence) {
     emittedExpressionPrecedence.push_back(precedence);
   }
-@@ -670,12 +671,14 @@ static LogicalResult printOperation(CppEmitter &emitter,
+@@ -695,12 +696,14 @@ static LogicalResult printOperation(CppEmitter &emitter,
   if (auto t = dyn_cast<IntegerAttr>(attr)) {
     // Index attributes are treated specially as operand index.
     if (t.getType().isIndex()) {
@@ -143,5 +143,5 @@ index 01de0e41f203..4f600f92ba6d 100644
   }
   }
 --
-2.46.0
+2.48.1
 

mlir-tensorrt/build_tools/patches/mlir/0008-MLIR-Remove-unnecessary-include-from-MathToEmitC.h-t.patch

Lines changed: 0 additions & 29 deletions
This file was deleted.

mlir-tensorrt/build_tools/patches/mlir/0009-mlir-Support-FileLineColRange-in-LLVM-debug-translat.patch

Lines changed: 6 additions & 7 deletions
@@ -1,15 +1,14 @@
-From 51c99ccf1a291295aed12a36395760026c268cbb Mon Sep 17 00:00:00 2001
-From: Christopher Bate <[email protected]>
-Date: Tue, 11 Mar 2025 22:34:24 +0000
-Subject: [PATCH 09/10] [mlir] Support FileLineColRange in LLVM debug
- translation
+From d679ee8c7978fe63321e90c5c6583b604ca3d1a5 Mon Sep 17 00:00:00 2001
+From: Sagar Shelke <[email protected]>
+Date: Tue, 1 Jul 2025 00:25:43 +0000
+Subject: [PATCH] Apply patch 0009
 
 ---
  mlir/lib/Target/LLVMIR/DebugTranslation.cpp | 8 ++++++++
  1 file changed, 8 insertions(+)
 
 diff --git a/mlir/lib/Target/LLVMIR/DebugTranslation.cpp b/mlir/lib/Target/LLVMIR/DebugTranslation.cpp
-index cf734de49acd..c55d9a204468 100644
+index 1d3ed6f3262f..93e1d08faf4f 100644
 --- a/mlir/lib/Target/LLVMIR/DebugTranslation.cpp
 +++ b/mlir/lib/Target/LLVMIR/DebugTranslation.cpp
 @@ -547,6 +547,14 @@ llvm::DILocation *DebugTranslation::translateLoc(Location loc,
@@ -28,5 +27,5 @@ index cf734de49acd..c55d9a204468 100644
   ArrayRef<Location> locations = fusedLoc.getLocations();
 
 --
-2.46.0
+2.48.1
 

mlir-tensorrt/build_tools/patches/mlir/0010-MLIR-Fix-LLVMIRTransforms-build-failure-125485.patch

Lines changed: 0 additions & 44 deletions
This file was deleted.

@@ -1,44 +0,0 @@
-From c5bef25c87a0e5a2377e6909b812acc9d026c7a2 Mon Sep 17 00:00:00 2001
-From: Thomas Preud'homme <[email protected]>
-Date: Mon, 10 Feb 2025 19:37:58 +0000
-Subject: [PATCH 10/10] [MLIR] Fix LLVMIRTransforms build failure (#125485)
-
-lib/libMLIRLLVMIRTransforms.a fails to build from scratch with the
-following error:
-In file included from llvm/include/llvm/Frontend/OpenMP/OMPConstants.h:19,
-                 from llvm/include/llvm/Frontend/OpenMP/OMPIRBuilder.h:19,
-                 from mlir/include/mlir/Target/LLVMIR/ModuleTranslation.h:26,
-                 from mlir/include/mlir/Dialect/LLVMIR/NVVMDialect.h:24,
-                 from mlir/lib/Dialect/LLVMIR/Transforms/InlinerInterfaceImpl.cpp:17:
-llvm/include/llvm/Frontend/OpenMP/OMP.h:16:10:
-fatal error: llvm/Frontend/OpenMP/OMP.h.inc: No such file or directory
-
-Use a forward declaration for OpenMPIRBuilder in ModuleTranslation.h to
-avoid pulling OpenMP frontend header that require generated headers.
----
- mlir/include/mlir/Target/LLVMIR/ModuleTranslation.h | 5 +++--
- 1 file changed, 3 insertions(+), 2 deletions(-)
-
-diff --git a/mlir/include/mlir/Target/LLVMIR/ModuleTranslation.h b/mlir/include/mlir/Target/LLVMIR/ModuleTranslation.h
-index 1b62437761ed..6f4a5e1d347a 100644
---- a/mlir/include/mlir/Target/LLVMIR/ModuleTranslation.h
-+++ b/mlir/include/mlir/Target/LLVMIR/ModuleTranslation.h
-@@ -23,12 +23,13 @@
- #include "mlir/Target/LLVMIR/TypeToLLVM.h"
-
- #include "llvm/ADT/SetVector.h"
--#include "llvm/Frontend/OpenMP/OMPIRBuilder.h"
-+#include "llvm/IR/FPEnv.h"
-
- namespace llvm {
- class BasicBlock;
--class IRBuilderBase;
- class Function;
-+class IRBuilderBase;
-+class OpenMPIRBuilder;
- class Value;
- } // namespace llvm
-
---
-2.46.0
-
