Introduction
The TVM community has worked since the last release to deliver the following new exciting improvements!
The main tags are listed below (bold text marks areas with a lot of progress): Relax, etc.
Please visit the full listing of commits for a complete view: v0.24.dev0...v0.24.0.rc0.
Community
None.
RFCs
None.
Adreno
#18867 - Revive and consolidate Adreno features
Arith
#19417 - Expose allow_override parameter in Python Analyzer.bind()
BugFix
#19432 - [Fix][CUDA] Version compatibility of CUDA symbols
#19427 - [FIX] Skip metal target tag registration for unsupported LLVM CPUs
#19390 - [LLVM] Fix insertDeclare API mismatch for ROCm-bundled LLVM 20
#19410 - [Fix][Runtime][RPC] Fix remote tensor handle cleanup for RPC return values
#19385 - [MetaSchedule] Fix compile_relax to apply MetaScheduleApplyDatabase after FuseOps
#19383 - [TIRx] Fix bad-optional-access in BF16/FP8 legalize passes for target-less PrimFuncs
#19382 - [TIRx] Fix VerifyMemory crash for PrimFuncs without target attribute
#19380 - [TOPI] Fix get_const_tuple hanging indefinitely when passed a te.Tensor
#19368 - Align tir.round to ties-to-even across all backends
#19367 - [ONNX] Fix Round op to use ties-to-even
#19362 - [TVMScript] Fix invalid f-string format spec causing TypeError on Python 3.14
#19352 - [TVMScript] Add doc.keyword handling for ExprEvaluator._visit
#18957 - [FIX] Inline ceil_log2 in gpu_2d_continuous_cumsum to fix MakePackedAPI error
#18940 - [Fix] Fix tvm.tir references in Tflite frontend
#18887 - [FIX] Fix cumsum kernel sblock_alloc_buffer for non-sblock buffer
#18881 - [FIX][Adreno] Replace AllocBuffer with Bind in texture alloc injection
#18838 - [TOPI] Fix resize accuracy issue with non-floor rounding
#18782 - [S-TIR][FIX] Remove redundant std::move() to itself
#18742 - [Fix] Handle empty variable name in NameSupply::FreshName
#18694 - [TIR] Fix incorrect optimization when lowering floordiv and f…
#18695 - [FIX] Fix T.sblock due to concurrent merge
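Entries #19368 and #19367 above switch rounding to ties-to-even. As a minimal sketch of what that rule means, the plain-Python snippet below contrasts it with round-half-away-from-zero. It does not call TVM; the helper names `round_ties_to_even` and `round_half_away_from_zero` are hypothetical and exist only for this illustration.

```python
import math


def round_ties_to_even(x: float) -> float:
    # Python's built-in round() sends exact .5 ties to the even neighbor.
    return float(round(x))


def round_half_away_from_zero(x: float) -> float:
    # The alternative rule: exact .5 ties move away from zero.
    return math.copysign(math.floor(abs(x) + 0.5), x)


# The two rules only differ on exact .5 ties.
for value in (0.5, 1.5, 2.5, -2.5, 2.4):
    print(value, round_ties_to_even(value), round_half_away_from_zero(value))
```

The two rules disagree only on exact halves such as 0.5 and 2.5, which is why a mismatch between backends or frontends can go unnoticed until a test hits one of those values.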
CI
#19445 - [REFACTOR] Decouple data.py from Jenkins script and docker images
#18827 - Update images to 20260301-134651-63f099ad
#18863 - [S-TIR][Test] Mark meta_schedule tuning tests as skip
#18851 - Remove stale test scripts (i386, hexagon, mypy)
#18850 - [TEST] Remove stale URL mappings from request_hook
#18848 - Remove legacy lint scripts and Apache RAT
#18817 - [REFACTOR]Further cleanup docker images
#18812 - [REFACTOR]Modernize Python dependency management with uv
#18809 - Add GitHub Actions lint workflow
#18805 - [REFACTOR][TEST] Migrate tir-transform tests from TE to TVMScript
#18804 - [REFACTOR][TEST] Remove unused te imports from test files
#18800 - Update images to 20260219-160550-72f51851
#18796 - Refactor Dockerfiles and installation scripts
#18775 - Update images to 20260214-152058-2a448ce4
#18783 - Update system cuda version 12.4->12.8
#18780 - Remove unity from tvm-bot
#18777 - Update Pillow, pytest-rerunfailures, junitparser, xgboost, onnx and pytorch
#18647 - Upgrade Python to 3.10 in CI
#18749 - Remove i386 and Hexagon from CI pipeline (2)
#18757 - Further cleanup CI after merging unity to main test
#18456 - Move conda config files to tests/conda and remove unused conda build infrastructure
#18755 - [TEST] Cleanup legacy tests and migrate unity tests to main one
#18737 - Remove i386 and Hexagon from CI pipeline (1)
#18748 - Remove i386 and hexagon from .asf.yaml
#18719 - [REFACTOR][TEST] Migrate all codegen test to tvmscript
#18717 - Fix double newlines in nightly docker update
#18711 - [REFACTOR][TEST] Replace CompareBeforeAfter for pytest compact
#18692 - Fix NameError in nightly docker update workflow
Docker
#18854 - Refactor bash.sh: auto-detect rootless, add --shell, TVM_DEV_MOUNTS
#18710 - [ci]Nightly Docker image update
Docs
#19439 - Refactor BYOC example NPU tutorial
#19414 - Fix stale tvm.tirx exclude list and add missing legalize_ops.unary entry
#19409 - Fix outdated source install and API reference docs
#19407 - Fix [Docs] python -c "import tvm; print(tvm.__file__)" fail (#18714)
#19396 - Add code generation architecture documentation
#19398 - Add TVMScript architecture documentation
#19397 - Add PyModule tutorial to How-To toctree
#19399 - Clean up architecture docs: remove duplicates, fix stale content
#19389 - Add Relax VM architecture documentation
#19394 - Add operator fusion architecture documentation
#19395 - Add BYOC external library dispatch architecture documentation
#19387 - Add docstrings for nn.Module classes and core APIs in relax.frontend.nn
#19386 - Add tvm.s_tir.tensor_intrin API reference and remove empty legacy tvm/tir directory
#19379 - Add API reference for tvm.arith, tvm.testing, tvm.exec, tvm.tirx.backend and extend topi/contrib/ir/target docs
#19369 - Add API reference for tvm.s_tir submodules: dlight, meta_schedule, backend
#19366 - Add API reference documentation for tvm.script module
#19356 - Add DLight and MetaSchedule deep-dive instructions
#19364 - TFLite tests requiring Python 3.10 and specific package versions to avoid core dumps
#19354 - Add tutorial for importing models from PyTorch, ONNX, and TFLite
#19358 - Add Dataflow Pattern Language (DPL) documentation for Relax
#19357 - Add Disco distributed runtime architecture overview
#19351 - Fix outdated paths, links, and add missing API references across documentation(3)
#19353 - Add tvm.s_tir.analysis API reference page
#19350 - Add Relax VM architecture overview in documentation
#19344 - Fix outdated code examples, typos, and missing API reference in documentation(2)
#18965 - Fix outdated code examples, types, and missing references across documentation
#18966 - [DOC] Fix various issues
#18953 - Align documentation with tirx/s_tir namespace split
#18947 - Add tutorial for mixing Python/PyTorch with TVM using BasePyModule
#18939 - [DOC] Fix inconsistent code comments
#18941 - Fix duplicate license headers and incorrect module paths after tirx rename
#18908 - Clean up stale references from recent refactors
#18906 - Update outdated references from recent refactors
#18860 - [CI]Update Sphinx dependencies
#18855 - Fix RPC tutorial to use set_input + invoke_stateful API
#18808 - [DOC] Update installation docs with missing dependencies (#18194)
#18799 - [DOC] Fix docstring, unify CMake, nvidia-docker deprecation
#18797 - [DOC] Unify CUDA naming
#18794 - [DOC] Unify GitHub naming
#18770 - [DOC] Fix PYTHONPATH in "Install from Source"
#18732 - [DOC] Fix RST syntax
#18753 - [DOC] Fix the loop length in a loop tiling example
#18731 - [DOC] Fix grammar
#18718 - Clarify trusted usage
Frontend
#19401 - [TFLite] Add test coverage for SHAPE and RANGE operators
#19402 - [Test][TFLite] Add unit tests for PRELU
#19400 - [TFLite] Add TILE operator tests and edge cases
#19388 - [Test][TFLite] Add unit tests for LEAKY_RELU, HARD_SWISH, ReLU_N1_to_1 and LOG_SOFTMAX
#19365 - [Test][TFLite] Add unit tests for RESIZE_BILINEAR and RESIZE_NEAREST_NEIGHBOR ops
#18970 - [TFLite]Add expected IRModule checks for conv2d, pool2d, and batch_matmul tests
#19341 - [ONNX] Fix SplitToSequence keepdims=0 and uneven last chunk
#18969 - [ONNX] Support select_last_index for ArgMax and ArgMin
#18951 - [ONNX] Add MatMulInteger support to Relax ONNX frontend
#18946 - [ONNX] Add If operator support to Relax ONNX frontend
#18929 - [TFLite] Fix undefined symbols and Relay API remnants in TFLite frontend
#18773 - [ONNX] Handle Gelu approximate attribute from Opset 20
LLVM
#18909 - [Target]Fix -mcpu validation compatibility across LLVM versions
#18853 - Bump minimum LLVM version to 15
#18818 - Fix build failures when building with llvm>=22
#18772 - [Codegen] Cast NaN to bool gives true
#18706 - Fix insertDbgValueIntrinsic for Metal backend
MetaSchedule
#19438 - [S-TIR]Make evolutionary search resilient to trace replay failures
Metal
#19493 - Include logging headers for metal
#18877 - Batched command dispatch and staging buffer pool
#18819 - [REFACTOR]Update CHECK_LE to TVM_FFI_ICHECK_LE in Metal runtime
#18811 - [Refactor]Update ICHECK to TVM_FFI_ICHECK in Metal runtime
ROCm
#15518 - Fix some ROCm codegen bugs
Relax
#19492 - [BugFix]Add legalize for isnan, isinf, isfinite
#19489 - [Frontend][TFLite] Add BROADCAST_TO, EMBEDDING_LOOKUP, and SELECT_V2
#19490 - [Frontend][TFLite] Add SCATTER_ND operator for Relax TFLite
#19467 - [ONNX] Fix CumSum axis handling: support runtime axis tensor, error on multi-element axis
#19473 - [Frontend][TFLite] Add RANDOM_UNIFORM, RANDOM_STANDARD_NORMAL, and MULTINOMIAL
#19487 - [Frontend][TFLite] Add BROADCAST_ARGS operator mapping
#19481 - [Frontend][TFLite] Add DILATE operator mapping
#19485 - [Frontend][TFLite] Add ATAN2 op and TFLite mapping
#19480 - [BugFix][ONNX] Fix ConstantOfShape converter when value attr is absent
#19468 - [Frontend][TFLite] Fix STRIDED_SLICE negative stride and add STRIDED_SLICE/SPLIT_V tests
#19421 - [Frontend][TFLite] Add DENSIFY operator test and fix prefetched handling
#19464 - [Frontend][TFLite] Add NON_MAX_SUPPRESSION_V4 converter
#19466 - [Frontend][TFLite] Add BITCAST operator mapping
#19433 - [Frontend][TFLite] Fix dynamic FILL/SPLIT_V partial implementations
#19426 - [Frontend][TFLite] Add soft-NMS support for TFLite NON_MAX_SUPPRESSION_V5
#19450 - [BugFix][ONNX] Honor auto_pad in ConvTranspose converter
#19431 - [Frontend][KVCache] Extend masked sequence prefill to causal left-padding
#19434 - [Frontend][TFLite] Add CUMSUM operator mapping
#19430 - [NN] Use int64 for RoPE apply flag
#19428 - [FRONTEND][ONNX] Support Softmax, LogSoftmax and Hardmax when opset version ≤12
#19425 - [Backend]Add NPU BYOC backend example
#19424 - Fix deprecation warning
#19416 - [TVMScript] Print ExternFunc struct_info when non-default
#19415 - [Frontend][TFLite] Fix bool REDUCE_ANY/REDUCE_ALL compile failure
#19413 - [Frontend][TFLite] Add REDUCE_ANY and REDUCE_ALL
#19411 - fix 'occured' -> 'occurred' in transform.h doc comment
#19408 - [Frontend][TFLite] Fix and test MATRIX_DIAG, MATRIX_SET_DIAG, SPARSE_TO_DENSE
#19405 - [Frontend][KVCache] Restructure kv_cache kernels
#19404 - [tflite] Add PRELU/LRN/SQUARED_DIFFERENCE tests (partial #18971)
#19392 - [Frontend][KVCache] Add masked sequence prefill helper for encoder valid lengths
#19372 - [frontend][tflite] Add tests for fully_connected/depthwise_conv2d/transpose_conv/l2_pool2d
#19391 - [ONNX] Add frontend support for QuantizeLinear, DequantizeLinear, and DynamicQuantizeLinear
#19384 - [BugFix]Select target-specific pipeline in tvm.compile when GPU target is provided
#19371 - [frontend][tflite] Add tests for l2_normalization/slice/reverse_v2
#19345 - [Frontend][TFLite] Implement DETECTION_POSTPROCESS tflite operator
#19381 - [TFLite] Fix and test DEPTH_TO_SPACE/SPACE_TO_DEPTH, SELECT ops
#19373 - [TFLite] Fix MIRROR_PAD/ONE_HOT converters and add tests for PAD, PADV2, MIRROR_PAD, TOPK_V2, ONE_HOT
#19370 - [TFLite] Add test coverage for Reduction operations (#18971)
#19361 - [ONNX] Support ConcatFromSequence/SequenceInsert with new_axis=1
#19349 - [TFLite] Add NON_MAX_SUPPRESSION_V5 support
#18963 - [ONNX] Support Resize dynamic ROI via TOPI
#18955 - [ONNX] Fix shape/dynamic restrictions for Squeeze/Unsqueeze and Slice
#18956 - [ONNX] Complete ShapeExpr reshape handling in ONNX frontend
#18950 - [ONNX] Add Optional and MatMulInteger16 frontend support
#18952 - [ONNX] Add roi_pool op and MaxRoiPool frontend support
#18948 - Add conv3d_transpose and ONNX ConvTranspose 3D support
#18943 - [Vision] Add get_valid_counts and classic NMS
#18942 - [TOPI] Add relax.vision.multibox_transform_loc for SSD/TFLite box decode
#18933 - Add affine_grid operator with PyTorch and ONNX frontend support
#18937 - [PyTorch] Add 3D interpolate support using resize3d
#18936 - [ONNX][Torch] Add roi_align support and frontend integration
#18931 - [ONNX] Add image.resize3d op and wire 5D Resize
#18932 - [ONNX] Add GridSample ONNX frontend integration
#18868 - [TFLite] Introduce TensorFlow Lite frontend
#18869 - [LAYOUT] Support multiple axis packing
#18904 - [PyTorch] Add torch.cond support to ExportedProgram frontend
#18870 - Add input type validation for make_shape and corresponding tests
#18903 - [PyTorch] Fix crash on dynamic shapes with identity slice in ExportedProgram importer
#18878 - [ONNX] Support dynamic repeats for Tile
#18864 - [Refactor] Phase out FewShotTuning
#18814 - Make ShapeType ndim parameter mandatory
#18815 - [PyTorch] Add randn.default and randn_like.default support
#18520 - Fix llama4_rope_with_position_map to support partial rotary factor
#18764 - Add size heuristic to skip folding large creation ops
#18762 - Remove TODO comment for moving code in fuse_tir.cc
#18733 - Migrate NN conv/pooling/grad attrs from Array to Array<int64_t>
#18736 - Support constant folding for call_tir with tuple outputs
#18726 - [PyTorch] Simplify tensor args conversion in Dynamo
#18725 - [PyTorch] Fix scalar parameter inputs in Dynamo
#18670 - [Torch] Avoid decomposition crash with sparse CSR buffers
#18704 - [Onnx][BatchNorm] Pass momentum and training_mode into BatchNorm Operator
#18661 - [Python]Fix YaRN correction dim calculation
#18691 - [Onnx][Resize] Fix ROI values when tensor ROI is Empty pass node Constant
#18673 - [Onnx] Support Multi Input Ops with Multidirectional Broadcasting
#18677 - [NN] Add batch_flatten operator
#18690 - Add NN operator attributes include to TensorRT codegen
Runtime
#19476 - [REFACTOR]Phase out include/tvm/runtime/object.h
#19465 - [REFACTOR][CODEGEN] Backend specific target and runtime to enable cross-compile fallback
#19471 - [REFACTOR]Phase out IntTuple alias; use ffi::Shape directly
#19472 - [REFACTOR]Phase out include/tvm/runtime/builtin_fp16.h
#19469 - [REFACTOR]Phase out include/tvm/runtime/threading_backend.h
#19455 - [REFACTOR]Phase out profiling.h heavy types, rename to timer.h
#19457 - [REFACTOR]Macro cleanup — TVM_DLL alignment, [[maybe_unused]], logging.h legacy macros
#18837 - [Builtin] Handle mismatched type on argument #0 when calling Builtin Runtime Operators
#18813 - [REFACTOR]Phase out legacy contrib runtime backends
#18784 - [REFACTOR]Transition metadata into ffi
#18756 - [COMPACT] Fix 32bit compact in vm
TOPI
#18880 - Reject non-float inputs for inverse unary math ops
TVMScript
#18891 - Remove T.Bind backward-compat alias
#18889 - Normalize T.Bind to T.bind for statement builder convention
#18856 - Fix PEP 563 closure variable resolution
Vulkan
#18914 - Avoid explicit layout decoration on non-interface allocations
web
#18944 - Update includes after FFI JSON refactor
#18893 - [Experimental] Add support for cross-origin storage caching
#18680 - [Version] Fix WebLLM vision model issues
#18687 - Handle LocalSession init in WASM RPC server
Misc
#19483 - [REFACTOR][FFI] Cleanup ffi indirections in tvm headers + switch logging.h to ffi/error.h where only ICHECK/THROW are used
#19484 - [FFI][ABI] Bump tvm-ffi to 0.1.11rc2
#19477 - [REFACTOR] Delete src/support/libinfo.cc; replace with runtime FFI-registry env query
#19479 - [REFACTOR][SCRIPT] TVMScript dialect-friendly refactor: per-dialect restructure + dialect registry
#19475 - [REFACTOR][S-TIR] Move tvm/support/random_engine.h → tvm/s_tir/random_engine.h
#19474 - [REFACTOR][IR] Move tvm/support/with.h → tvm/ir/with_context.h
#19453 - [S-TIR][Dlight] Add layered fall back strategy to handle missing attr max_shared_memory_per_block
#19463 - [REFACTOR][IR] Migrate include/tvm/node into include/tvm/ir
#19462 - [REFACTOR][NODE] Use fn_repr inside kRepr lambdas, not ffi::ReprPrint
#19460 - [REFACTOR][S-TIR] Minimize src/support/ by relocating s_tir-private headers
#19459 - [REFACTOR] Phase out src/support/ffi_testing.cc
#19461 - [REFACTOR][NODE] Migrate ReprPrinter to tvm-ffi ffi_repr mechanism
#19456 - [REFACTOR] Move source_utils.h into runtime/opencl
#19458 - [REFACTOR] Phase out unreachable contrib/rust_extension.cc
#19454 - [REFACTOR][CODEGEN] Phase out tvm_global_barrier_state and tvm_prepare_global_barrier
#19449 - [REFACTOR] Use FFI types in runtime inline module-create wrapper signatures
#18406 - [TIR] Update symbolic index term order in loop fusion
#19447 - [REFACTOR] Isolate backend module creation via ffi.Module.create. registry
#19444 - [CMAKE][REFACTOR] Split libtvm.so into libtvm_runtime.so and libtvm_compiler.so
#19440 - [REFACTOR] Remove runtime/object.py shim and route Object via tvm_ffi
#19442 - [REFACTOR] Remove tvm.runtime.packed_func and container shims; route via tvm_ffi
#19441 - [REFACTOR] Phase out include/tvm/runtime/module.h
#19393 - fix: use is None instead of == None in test files (PEP 8 E711)
#19406 - [S-TIR] Fix cache_read/cache_write region when inner block has T.whe…
#19403 - [S-TIR] Fix Segfault when applying Parallel during TIR schedule rewriting
#18927 - feat(meta_schedule): expand CUDA unroll steps for SM70 optimization
#19347 - fix: TFLite model retrieval with error handling
#19343 - test(relax): cover TFLite LOG and GREATER_EQUAL in test_frontend_tflite
#18938 - [FFI] Bump tvm-ffi to 63224e3 and fix regressions
#18912 - [TIR] Handle Bind in LowerDeviceKernelLaunch
#18926 - Revert "fix: add safety warning to pickle_memoize cache loading"
#18925 - fix: add safety warning to pickle_memoize cache loading
#18913 - [Refactor] Bring up tirx namespace
#18240 - [Optimization][Operator] Implement and enable Conv2d-Reshape-Add-ReLU fusion
#18892 - [Build] Fix version regex to anchor at line start in pyproject.toml
#18879 - [TIR] Reject non-floating inputs for trig unary ops
#18886 - [TIR][REFACTOR] Revamp Common Subexpression Elimination
#18883 - [TARGET] Fix round-trip reconstruction of targets with canonicalizer-generated feature.* attrs
#18876 - [REFACTOR][TIR] Remove body from AllocBuffer and DeclBuffer
#18871 - Batched GPU dispatch and object caching for WebGPU runtime
#18875 - [chore] Update docker/README.md documentation and fix links
#18873 - [TIR] Add VisitBufferDef/VisitBufferUse to base StmtVisitor/StmtMutator
#18865 - [REFACTOR][TIR] Introduce AllocBuffer and phase out Allocate+DeclBuffer
#18862 - [REFACTOR][TIR] Cleanup AttrStmt attributes
#18857 - [TIR][Refactor] Enhance error reporting with structured AssertStmt and TVMFFIABIBuilder
#18861 - fix: Complete CHECK update across contrib runtime
#18859 - fix: Use T.decl_buffer instead of T.Buffer for aliased buffers in LongRoPE
#18858 - fix: Complete ICHECK update across codebase
#18845 - [REFACTOR][CONTRIB] Remove MSC contrib module
#18847 - [PYTHON] Fix PEP 563 compat and remove args_converter
#18843 - [TIR][FEAT] Require DeclBuffer before use in verify_well_formed
#18852 - [REFACTOR] Remove unused mscclpp contrib module
#18849 - [CMAKE] Remove unused Libbacktrace.cmake
#18844 - [REFACTOR] Further cleanup node redirections
#18830 - [LINT][PYTHON] Modernize annotations with ruff UP rules
#18832 - [IR][TIR] Remove body from AssertStmt
#18829 - [REFACTOR][NODE] Remove node redirect headers
#18825 - [REFACTOR] Update CHECK and ICHECK_GE to TVM_FFI_ICHECK and TVM_FFI_ICHECK_GE in thrust.cu
#18828 - [REFACTOR] Phase out root Makefile
#18821 - fix: replace 6 bare except clauses with except Exception
#18822 - [TARGET] Specify correct mcpu for Metal target tags
#18816 - [REFACTOR][S-TIR] Lift STIR-only attributes out of tir::attr namespace
#18810 - [REFACTOR][LINT] Modernize ruff config
#18801 - Bump tvm-ffi to v0.1.9rc
#18807 - [LINT] Modernize lint to use pre-commit hooks
#18803 - [REFACTOR] Migrate CHECK macros to tvm-ffi ones
#18768 - support integer types in fast_tanh and fast_exp
#18802 - [FFI] Bring up latest tvm-ffi
#18793 - [REFACTOR][TARGET] Further cleanup target python api
#18785 - [REFACTOR][TARGET] Phase out legacy target string in favor of json
#18786 - [CONTRIB] Cache the shape and dtype array in json access
#18781 - [chore] Cleanup unused legacy backtrace code in logging
#18779 - [REFACTOR] Phase out dmlc dep
#18776 - [REFACTOR][S-TIR] More migrations to s-tir
#18771 - [REFACTOR][S-TIR] Migrate more transform to s_tir
#18763 - [REFACTOR][TIR] Phaseout BufferRealize
#18759 - [REFACTOR] Remove picojson dependency, replace with tvm::ffi::json API
#18760 - [chore] Cleanup stale dependencies
#18761 - [REFACTOR][TIR] Phase out AllocConst
#18758 - [Cleanup] Remove redundant python/pyproject.toml and gen_requirements
#18697 - Fix Customize Optimization tutorial import error (#18584)
#18754 - [REFACTOR][S-TIR] Cleanup items on block scope
#18743 - [REFACTOR][S-TIR] Move remaining data structures to s_tir
#18739 - fix: correct typos 'recieve' and 'occurence'
#18744 - fix: skip dsymutil for static tvm_runtime on Apple platforms
#18740 - fix: correct typo 'occuring' to 'occurring'
#18738 - [chore][TIR] reorganize src/tir/transforms to src/tir/transform
#18734 - [REFACTOR][S-TIR] Lift dlight into s_tir namespace
#18735 - [REFACTOR][S-TIR] Migrate meta_schedule into s_tir namespace
#18705 - Add Windows-specific build notes to installation guide
#18727 - fix: correct typos in Python docstrings
#18728 - [REFACTOR][S-TIR] Migrate tir/schedule to s_tir
#18722 - [REFACTOR][S-TIR] Lift transform passes to s_tir namespace
#18724 - Remove cron schedule from nightly Docker update workflow
#18716 - [REFACTOR] Migrate old tir.ir_builder to tvmscript or builder
#18715 - [SPIRV] Fix forloop codegen in vulkan
#18712 - [REFACTOR][S-TIR] Initialize the s_tir module
#18636 - [TIR][Schedule]Generalize fuseReductionEpilogue to support arbitrary epilogue expressions
#18699 - [TIR] Further robustify floordiv/mod intrin lowering to prevent overflow
#18689 - [REFACTOR][TIR] Rename tir.Block to SBlock
#18671 - [TIR] Fix InjectPTXLDG32 segfaults and skip non-CUDA targets
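Entry #19393 above replaces `== None` with `is None` in test files (PEP 8 E711). A minimal sketch of why the two can differ is shown below. The class `AlwaysEqual` is hypothetical and stands in for any object that overrides `__eq__`; it is not taken from the TVM code base.

```python
class AlwaysEqual:
    """Hypothetical object whose equality check always succeeds."""

    def __eq__(self, other):
        return True


obj = AlwaysEqual()

# Equality dispatches to __eq__, so this comparison is True even though
# obj is clearly not None.
equals_none = obj == None  # noqa: E711

# Identity cannot be overridden, so this is the reliable check.
is_none = obj is None

print(equals_none, is_none)
```

Identity comparison avoids surprises from objects with custom or elementwise equality, which is the usual motivation for the E711 rule.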