[REFACTOR][SCRIPT] TVMScript dialect-friendly refactor: per-dialect restructure + dialect registry#19479
Open
tqchen wants to merge 10 commits into apache:main
Code Review
This pull request restructures TVMScript by moving dialect-specific parser and builder components for Relax and TIRX into their respective dialect directories, decoupling them from the core IR-layer infrastructure. Key changes include the implementation of a Python dialect registry with a custom import redirector for backward compatibility, and a registry-based dispatch in C++ for function metadata derivation. Feedback suggests that the duplication of the TVM_SCRIPT_REPR macro across dialect headers is redundant, as these headers already maintain a dependency on the shared utility header where the macro was originally defined.
tlopex
approved these changes
Apr 29, 2026
…ipt/
Continues the per-dialect TVMScript restructure (commit 2 of 4): the TIRX
printer and ir_builder pieces move from src/script/{printer,ir_builder}/tirx/
into src/tirx/script/{printer,builder}/, joining the rest of the TIRX layer.
Public ir_builder headers move correspondingly to include/tvm/tirx/script/builder/.
The Python parser/builder modules move to python/tvm/tirx/script/, with
backward-compat shim packages left at the old paths so existing imports
(`from tvm.script import tirx as T`, `from tvm.script.ir_builder import tirx`,
`from tvm.script.ir_builder.tirx.utils import buffer_proxy`) keep working.
Notable details:
- src/script/printer/relax/tir.cc (the cross-dialect bridge that prints
TIRX vars under the relax token) updates its `../tirx/utils.h` include
to the new `../../../tirx/script/printer/utils.h` location. This file
stays under src/script/printer/relax/ for now; it will move to
src/relax/script/printer/ in commit 3.
- src/tirx/script/builder/frame.cc rewrites its
`../../../tirx/ir/script/script_complete.h` include as
`../../ir/script/script_complete.h` — same physical file, shorter path
from the new location.
- A separate shim file `python/tvm/script/ir_builder/tirx/utils.py`
re-exports `tvm.tirx.script.builder.utils` so that
`from tvm.script.ir_builder.tirx.utils import buffer_proxy` (used by
Relax legalization) continues to resolve.
…cript/
Continues the per-dialect TVMScript restructure (commit 3 of 4): the Relax
printer and ir_builder pieces move from src/script/{printer,ir_builder}/relax/
into src/relax/script/{printer,builder}/, finishing the per-dialect homing.
Public ir_builder headers move correspondingly to include/tvm/relax/script/builder/.
The Python parser/builder modules move to python/tvm/relax/script/, with
backward-compat shim packages left at the old paths so existing imports
(`from tvm.script import relax as R`, `from tvm.script.ir_builder.relax import ...`,
deep references like `tvm.script.parser.relax.entry.ObjectProxy` and
`tvm.script.ir_builder.relax.ir.py_print`) keep working.
Notable details:
- CMakeLists.txt: the relax glob (line ~292) was per-subdirectory, not
recursive at `src/relax/`, so a new `src/relax/script/*.cc` glob entry
was added. Without it, the CMake configure would silently drop the
Relax script `.cc` files and dispatch would fail at runtime.
- src/relax/script/printer/{distributed.cc, tir.cc} use longer relative
paths (`../../../ir/script/printer/utils.h`,
`../../../tirx/script/printer/utils.h`) for cross-dialect helper
references; the cleanup commit will replace them with public utils
headers.
- The Python shims at python/tvm/script/{parser,ir_builder}/relax/ also
re-export submodules (entry.py, ir.py, distributed/) needed for deep
attribute access patterns in tests.
Final commit (4 of 4) of the per-dialect TVMScript restructure: now that
every dialect owns its own script subtree, replace the recursive glob
`src/script/*.cc` with an explicit list of the five shared core files.
The explicit listing is defense-in-depth: if a future dialect file is
accidentally added under `src/script/{printer,ir_builder}/<dialect>/`,
the build won't silently include it. New per-dialect files belong under
`src/<dialect>/script/`.
Verified post-refactor:
- `grep -rn '#include.*tvm/relax\|tvm/tirx' src/script/` returns 0 hits.
- `grep -rn '#include.*tvm/relax\|tvm/tirx' include/tvm/script/` returns 0 hits.
- Each dialect has `src/<dialect>/script/{printer,builder}/` subtrees and
`python/tvm/<dialect>/script/{parser,builder}/` packages.
- Backward-compat shim packages remain at `python/tvm/script/{parser,
ir_builder}/<dialect>/` re-exporting from the new locations.
The IR-layer ir_builder previously hard-coded `if relax::Function ... else if tirx::PrimFunc ...` in `DeclFunction` to derive the GlobalVar's struct_info, forcing `src/script/ir_builder/ir/ir.cc` to include `<tvm/relax/...>` and `<tvm/tirx/...>` directly. That coupling is exactly the source-level dependency this restructure aims to eliminate. Replace the hard-coded dispatch with a per-dialect registry. The IR layer short-circuits when `func->struct_info_` is already defined and otherwise looks up `script.ir_builder.decl_function.<type-key>`; each dialect that participates in `I.DeclFunction` registers a handler under that key. TIRX registers `script.ir_builder.decl_function.tirx.PrimFunc`, deriving an opaque `relax::FuncStructInfo` from the PrimFunc return type. Adding a new function kind to `I.DeclFunction` is now a single registration in that dialect's builder. `src/script/ir_builder/ir/` no longer includes any `<tvm/relax/...>` or `<tvm/tirx/...>` header.
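The dispatch described above can be sketched in plain Python. This is only a toy model: the real registry is the C++ FFI function table, and `register_decl_function`, `get_global_var_struct_info`, the `PrimFunc` stand-in, and the tuple used as "opaque struct info" are all hypothetical names chosen for illustration.

```python
from typing import Any, Callable, Dict

# Hypothetical stand-in for the FFI global-function table.
_HANDLERS: Dict[str, Callable[[Any], Any]] = {}
_KEY_PREFIX = "script.ir_builder.decl_function."


def register_decl_function(type_key: str, handler: Callable[[Any], Any]) -> None:
    """Dialect side: register a handler under the per-type key."""
    _HANDLERS[_KEY_PREFIX + type_key] = handler


class PrimFunc:
    """Toy stand-in for a TIRX function object."""

    type_key = "tirx.PrimFunc"

    def __init__(self, ret_type: str, struct_info: Any = None) -> None:
        self.ret_type = ret_type
        self.struct_info = struct_info


def get_global_var_struct_info(func: Any) -> Any:
    """Core side: names no dialect, only looks up the function's type key."""
    # Short-circuit when the function already carries struct info.
    if func.struct_info is not None:
        return func.struct_info
    handler = _HANDLERS.get(_KEY_PREFIX + func.type_key)
    if handler is None:
        raise LookupError(f"no decl_function handler for {func.type_key!r}")
    return handler(func)


# The dialect registers its own handler; the core above is never edited.
register_decl_function("tirx.PrimFunc", lambda f: ("opaque_func", f.ret_type))
```

The point of the shape is that supporting a new function kind touches only the registering dialect: the lookup key is derived from the object's type key, so the shared function never needs a new branch.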
The `TVM_SCRIPT_REPR` macro lived in the shared `src/script/printer/utils.h`, requiring each per-dialect printer header to reach across to the shared utils.h purely to expand the macro. For a macro this small, duplicating the body in each dialect-local `printer/utils.h` is cheaper than the cross-directory indirection: each dialect's printer translation units are now self-contained for the repr-registration glue. Move the macro out of the shared header and into each dialect-local `printer/utils.h` (IR core, Relax, TIRX). The shared header retains `RedirectedReprPrinterMethod` and the other helper functions, which dialect headers still consume via the existing include. Behavior is unchanged.
Each TVMScript dialect's Python surface lived behind hand-written
re-export shims under ``python/tvm/script/{parser,ir_builder}/<dialect>/``
that explicitly forwarded names to the per-dialect packages. Adding a
new dialect required adding a new shim file in three central locations,
and downstream forks could not slot in their own dialect without
monkey-patching ``tvm.script``.
Introduce an explicit registration API so dialects wire themselves up:
* ``tvm.script.register_dialect(name, module_path)`` records the
mapping.
* ``__getattr__`` on ``tvm.script``, ``tvm.script.parser``, and
``tvm.script.ir_builder`` consults the registry on miss and returns
the corresponding submodule under the registered path
(``tvm.<dialect>.script[.parser|.builder]``).
* ``tvm/__init__.py`` registers the in-tree extension dialects (``tirx``,
``relax``) at the bottom, after their packages have been imported;
out-of-tree dialects can call ``register_dialect`` from their own
packages.
* Each ``tvm.<dialect>.script/__init__.py`` re-exports its parser
surface so ``tvm.script.<name>`` (resolved via the registry) carries
the public API directly.
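A minimal sketch of the registry plus module-level ``__getattr__`` pattern, using in-memory toy modules rather than the real ``tvm`` packages. The names ``toy_script``, ``toy_dialect`` and ``make_module_getattr`` are hypothetical; only ``register_dialect``, ``_DIALECT_REGISTRY`` and the suffix convention come from the description above.

```python
import importlib
import sys
import types

_DIALECT_REGISTRY = {}  # dialect name -> dotted path of its script package


def register_dialect(name, module_path):
    """Record where a dialect's script package lives."""
    _DIALECT_REGISTRY[name] = module_path


def make_module_getattr(owner, suffix=""):
    """Build a PEP 562 module-level __getattr__ that consults the registry."""

    def __getattr__(name):
        module_path = _DIALECT_REGISTRY.get(name)
        if module_path is None:
            raise AttributeError(f"module {owner!r} has no attribute {name!r}")
        # Imported lazily on first access, so bootstrap stays free of cycles.
        return importlib.import_module(module_path + suffix)

    return __getattr__


# Toy stand-ins for tvm.script and tvm.script.parser.
script = types.ModuleType("toy_script")
script.__getattr__ = make_module_getattr("toy_script")
parser = types.ModuleType("toy_script.parser")
parser.__getattr__ = make_module_getattr("toy_script.parser", suffix=".parser")

# Toy stand-ins for an out-of-tree dialect package that registers itself.
dialect_script = types.ModuleType("toy_dialect.script")
dialect_parser = types.ModuleType("toy_dialect.script.parser")
sys.modules["toy_dialect.script"] = dialect_script
sys.modules["toy_dialect.script.parser"] = dialect_parser
register_dialect("toy", "toy_dialect.script")

assert script.toy is dialect_script   # resolved through the registry
assert parser.toy is dialect_parser   # same lookup, with the ".parser" suffix
```

Because ``__getattr__`` only fires when normal attribute lookup misses, real submodules (such as the IR layer's) are unaffected by the redirect.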
The IR layer is foundational (``tvm.script`` depends on ``tvm.ir``)
and is not registered as a dialect — its parser / ir_builder remain
as real submodules under ``tvm.script.{parser,ir_builder}.ir``.
To avoid circular imports during dialect bootstrap, ``tvm.script``
imports nothing dialect-specific at module load time; ``from_source``,
``parse``, and ``ir_module`` resolve through ``__getattr__`` as well.
The legacy static shims at ``python/tvm/script/parser/<dialect>/`` and
``python/tvm/script/ir_builder/<dialect>/`` remain in place so both
mechanisms cover the same surface; the next commit removes them.
The previous commit added an explicit dialect-registration mechanism plus
a ``__getattr__`` redirect on ``tvm.script`` and friends. The static
re-export shims under ``python/tvm/script/{parser,ir_builder}/<dialect>/``
for ``tirx`` / ``relax`` and the legacy modules
``python/tvm/script/{relax,tirx}.py`` are now redundant: the registry
resolves the same names lazily.
Delete the static shims for the extension dialects. Statement-form imports
(``import tvm.script.ir_builder.relax``,
``from tvm.script.parser.relax.entry import ObjectProxy``) now go through
a ``sys.meta_path`` finder that resolves
``tvm.script.[parser|ir_builder.]<dialect>[.<sub>]`` to the matching
module under ``tvm.<dialect>.script``. Adding a new extension dialect
requires no edits to the in-tree ``tvm/script/`` package.
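One way such a finder could work is sketched below with in-memory toy modules. This is an assumption-laden illustration, not the PR's implementation: ``legacy_script`` stands in for ``tvm.script``, ``toy_dialect.script`` for a dialect package, and the class names are hypothetical. The ``parser`` / ``ir_builder`` segment handling is omitted for brevity.

```python
import importlib
import importlib.abc
import importlib.util
import sys
import types

LEGACY_ROOT = "legacy_script"             # hypothetical stand-in for tvm.script
DIALECTS = {"toy": "toy_dialect.script"}  # hypothetical registry contents


class _AliasLoader(importlib.abc.Loader):
    def __init__(self, target):
        self._target = target

    def create_module(self, spec):
        return None  # let the import machinery create a throwaway placeholder

    def exec_module(self, module):
        # Swap the placeholder for the real module. The import system re-reads
        # sys.modules after exec_module, so the importer gets the real object.
        sys.modules[module.__name__] = importlib.import_module(self._target)


class DialectAliasFinder(importlib.abc.MetaPathFinder):
    def find_spec(self, fullname, path=None, target=None):
        parts = fullname.split(".")
        if len(parts) < 2 or parts[0] != LEGACY_ROOT or parts[1] not in DIALECTS:
            return None  # not ours; defer to the remaining finders
        real_name = ".".join([DIALECTS[parts[1]], *parts[2:]])
        return importlib.util.spec_from_loader(fullname, _AliasLoader(real_name))


# Toy packages built in memory; real code would have them on disk.
legacy_root = types.ModuleType(LEGACY_ROOT)
legacy_root.__path__ = []
real_pkg = types.ModuleType("toy_dialect.script")
real_pkg.__path__ = []
real_entry = types.ModuleType("toy_dialect.script.entry")
real_entry.ObjectProxy = type("ObjectProxy", (), {})
sys.modules.update({
    LEGACY_ROOT: legacy_root,
    "toy_dialect.script": real_pkg,
    "toy_dialect.script.entry": real_entry,
})

# Installed first so the path-based finder cannot load a second copy of a
# dialect file under the legacy name.
sys.meta_path.insert(0, DialectAliasFinder())

from legacy_script.toy.entry import ObjectProxy  # deep statement-form import

assert ObjectProxy is real_entry.ObjectProxy
```

The finder is needed in addition to ``__getattr__`` because statement-form imports of dotted names go through the import system, not through attribute access on the parent module.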
The IR layer is foundational and stays under ``tvm.script.{parser,
ir_builder}.ir`` as a real submodule — its files are not shims and are
not deleted.
Update the one in-tree caller of the deleted utility shim
(``relax.transform.legalize_ops.grad``) to import ``buffer_proxy``
directly from ``tvm.tirx.script.builder.utils``.
…ename TVM_SCRIPT_REPR; replace dialect fields with extra_config
include/tvm/ir/script_printer.h misplaced PrinterConfig and TVMScriptPrinter
under tvm/ir/; these are printer-layer concepts, not IR-specific. Move them to
include/tvm/script/printer/config.h next to the rest of the printer's public
surface (doc.h, ir_docsifier.h, ir_docsifier_functor.h).
Rename TVM_SCRIPT_REPR → TVM_REGISTER_SCRIPT_AS_REPR for clarity: the macro
registers Script as the kRepr callback for an object type plus a per-type
TVMScriptPrinter::vtable() dispatch entry. The new name is explicit about the
registration semantic and aligns with the TVM_REGISTER_* family.
Drop hard-coded dialect-specific fields from PrinterConfig (tir_prefix,
relax_prefix, show_all_struct_info, buffer_dtype) in favor of a generic
ffi::Map<String, Any> extra_config keyed by "<dialect>.<knob>" (e.g.
"tirx.prefix", "relax.show_all_struct_info"). Each dialect reads its own keys
via GetExtraConfig<T>(key, fallback) with defaults at the call site; the shared
core never names a specific dialect through hardcoded fields.
ir_prefix and module_alias also flip from std::string to ffi::String for
consistency with the rest of the config.
RedirectedReprPrinterMethod is de-inlined into a new
src/script/printer/config.cc to keep <tvm/runtime/logging.h> out of the public
header.
~6 header includers + ~80 macro use sites updated; Python config wrappers
updated to write into extra_config with dotted keys.
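The dotted-key scheme can be modeled in a few lines of Python. This is a toy sketch only: the real accessor is the C++ GetExtraConfig<T>, and the `get_extra_config` method, the dataclass layout, and the default values shown at the call sites are assumptions for illustration.

```python
from dataclasses import dataclass, field
from typing import Any, Dict, TypeVar

T = TypeVar("T")


@dataclass
class PrinterConfig:
    """Toy model of the shared config: no dialect-named fields."""

    ir_prefix: str = "I"
    extra_config: Dict[str, Any] = field(default_factory=dict)

    def get_extra_config(self, key: str, fallback: T) -> T:
        """Return extra_config[key], or the caller-supplied fallback."""
        value = self.extra_config.get(key, fallback)
        # Mirror the typed accessor: reject values of an unexpected type.
        if not isinstance(value, type(fallback)):
            raise TypeError(f"{key!r}: expected {type(fallback).__name__}")
        return value


cfg = PrinterConfig(extra_config={"tirx.prefix": "Tx"})

# Each dialect reads its own keys and supplies its own default (assumed here).
tirx_prefix = cfg.get_extra_config("tirx.prefix", "T")
show_all = cfg.get_extra_config("relax.show_all_struct_info", True)
```

Keeping the default at the call site is what lets the shared config stay ignorant of which knobs exist: an unknown dialect's keys are just more map entries.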
Move register_dialect calls from python/tvm/__init__.py to each
dialect's own package init, using fully-qualified
tvm.script.register_dialect(...) form:
    # python/tvm/tirx/__init__.py
    import tvm.script
    tvm.script.register_dialect("tirx", "tvm.tirx.script")
The dialect owns its registration; out-of-tree dialects do the same
in their own packages without editing in-tree files.
Add a substantial docstring to tvm/script/__init__.py explaining the
mechanism: _DIALECT_REGISTRY + register_dialect + __getattr__ on
tvm.script and on each subpackage (parser, ir_builder, printer)
appending its suffix + sys.meta_path finder for deep statement-form
imports. Smaller doc-comments on the subpackage __getattr__
implementations and the meta_path finder class.
IR is intentionally not registered as a dialect — its script
handlers live in the shared core (script depends on ir).
…ctRef qualifications
The per-dialect restructure moved tirx's external_kernel.py into
python/tvm/tirx/script/builder/, which broke its
`from ..ir import module_get_attr, module_set_attr` import: the relative path
now resolves to tvm.tirx.script.ir (the per-dialect TIRX IR builder), which
doesn't have those helpers. They live in the shared IR-layer helper at
tvm.script.ir_builder.ir. Switch the import to the canonical path.
Also update 4 C++ files for post-rebase compilation: bare ObjectRef references
no longer resolve after the runtime/object.h phase-out (apache#19476) dropped
the runtime:: alias re-exports. Add explicit ffi:: qualification at four call
sites:
- include/tvm/script/printer/config.h (RedirectedReprPrinterMethod signature)
- src/script/printer/config.cc (matching definition)
- src/script/ir_builder/ir/ir.cc (GetGlobalVarStructInfo signature and the
registry-lookup cast site)
- src/tirx/script/builder/ir.cc (lambda return type for the registered
DeclFunction handler)
Locally verified: test_tir_call_source_kernel passes, full
tests/python/tvmscript/ suite passes (771 passed, 1 xfailed),
tests/python/runtime/ green.
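The broken relative import can be reproduced without TVM installed, since `importlib.util.resolve_name` only manipulates dotted names. The new package name comes from the commit message; the old package name `tvm.script.ir_builder.tirx` is an assumption about where the file lived before the move.

```python
import importlib.util

new_package = "tvm.tirx.script.builder"     # location after the restructure
old_package = "tvm.script.ir_builder.tirx"  # assumed location before the move

# `from ..ir import ...` resolves relative to the importing module's package.
after_move = importlib.util.resolve_name("..ir", new_package)
before_move = importlib.util.resolve_name("..ir", old_package)

print(after_move)   # tvm.tirx.script.ir
print(before_move)  # tvm.script.ir_builder.ir
```

Under that assumption, the same `..ir` text silently changed meaning when the file moved, which is why the fix switches to the absolute path.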
fc2b791 to 51477f5
Summary
Restructure TVMScript to be dialect-agnostic at the script-core layer
while letting each extension dialect (TIRX, Relax) own its own
per-dialect script subtree. IR is below script in the dependency
stack and is NOT a peer dialect — its script handlers stay in the
shared core.
This PR folds together two coupled refactors that were initially
opened as separate PRs (#19478 and the original #19479); they
share rename / relocation surface so they ship as one cohesive
change.
What this PR does
Per-dialect script subtree (originally #19479)
- Move `src/script/{printer,ir_builder}/{tirx,relax}/` to `src/{tirx,relax}/script/{printer,builder}/`.
- Narrow the `src/script/*.cc` CMake glob to the dialect-free core.
- Refactor `IRBuilder::DeclFunction` to dispatch via FFI registry (`script.ir_builder.decl_function.<type-key>`); removes cross-dialect includes from the shared core.
- Add a `tvm.script.register_dialect` API + `__getattr__` + a `sys.meta_path` finder for Python-side dialect discovery. In-tree dialects (tirx, relax) were initially registered centrally in `python/tvm/__init__.py`; a later commit in this PR moves those calls into each dialect's own package init.
- Delete the static shims under `python/tvm/script/{parser,ir_builder}/{tirx,relax}/`.
Dialect-agnostic printer config (originally #19478)
- Move `include/tvm/ir/script_printer.h` → `include/tvm/script/printer/config.h` next to the rest of the printer's public surface. The header is not IR-specific.
- Rename `TVM_SCRIPT_REPR` → `TVM_REGISTER_SCRIPT_AS_REPR` for clarity (the macro registers Script as the kRepr callback + per-type vtable dispatch). Aligns with the `TVM_REGISTER_*` family.
- Drop the hard-coded `PrinterConfig` fields (`tir_prefix`, `relax_prefix`, `show_all_struct_info`, `buffer_dtype`) in favor of a generic `ffi::Map<String, Any> extra_config` keyed by `"<dialect>.<knob>"`. Each call site reads via the templated accessor `config->GetExtraConfig<T>("...", default)`.
- Flip `std::string` config fields to `ffi::String`.
After this lands, the script-printer core knows nothing specific
about any dialect — new dialects plug in via the registry pattern
with zero core edits. Public Python API surface unchanged.