
refactor[next]: Split compilation into actual compilation and loading of the compilation artifact #2587

Open
havogt wants to merge 26 commits into GridTools:main from havogt:worktree-otf-build-finalize-split


havogt (Contributor) commented Apr 27, 2026

This prepares for running compilation in separate processes, with the goal of faster JIT compilation.

The new workflow: a CompilationStep produces a CompilationArtifact, which is picklable and has a load() method that returns the original ExecutableProgram.
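A minimal sketch of this workflow, for illustration only. The protocol name follows the description above, but `ToyArtifact`, `toy_compilation_step`, and `compile_program` are hypothetical stand-ins and not the classes introduced by this PR.

```python
from __future__ import annotations

import dataclasses
import pickle
from typing import Callable, Protocol


class CompilationArtifact(Protocol):
    """Hypothetical stand-in for the protocol described above."""

    def load(self) -> Callable[..., object]: ...


@dataclasses.dataclass(frozen=True)
class ToyArtifact:
    """Holds plain data only, so default pickling is sufficient."""

    factor: int

    def load(self) -> Callable[[int], int]:
        # The closure is process-local state and is rebuilt on every load().
        factor = self.factor
        return lambda x: factor * x


def toy_compilation_step(factor: int) -> CompilationArtifact:
    # Stands in for the expensive build; returns data, not a live callable.
    return ToyArtifact(factor=factor)


def compile_program(factor: int) -> Callable[..., object]:
    # Mirrors the described flow: build an artifact, then load it.
    artifact = toy_compilation_step(factor)
    return artifact.load()


artifact = toy_compilation_step(3)
restored = pickle.loads(pickle.dumps(artifact))  # survives a pickle round trip
program = restored.load()
```

In this sketch the artifact can cross a process boundary as bytes, while the callable returned by `load()` never has to.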

havogt requested a review from Copilot April 27, 2026 20:14

Copilot AI left a comment


Pull request overview

Prepares the “next” compilation pipeline for out-of-process / multi-process JIT by splitting “compile” into producing a (mostly) picklable compilation artifact and a separate load() step that materializes the in-process executable callable.

Changes:

  • Introduce stages.CompilationArtifact protocol and update OTF compilation contracts to return artifacts with load().
  • Refactor CPP (nanobind/GTFN) and DaCe compilation steps to return artifact dataclasses instead of live callables; move “decoration” into artifact.load().
  • Update Backend.compile() and tests to materialize executables via artifact.load() and add pickle round-trip contract tests.
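A pickle round-trip contract test of the kind mentioned in the last bullet might look roughly like the sketch below. `ToyCppArtifact` and its fields are assumptions made for illustration; the real CPPCompilationArtifact may carry different data.

```python
import dataclasses
import pathlib
import pickle


@dataclasses.dataclass(frozen=True)
class ToyCppArtifact:
    # Hypothetical fields; chosen only to show path-like and string data.
    src_dir: pathlib.Path
    module_name: str
    entry_point_name: str


def assert_pickle_roundtrip(artifact) -> bool:
    """Check that the artifact compares equal after a round trip, for every pickle protocol."""
    for protocol in range(pickle.HIGHEST_PROTOCOL + 1):
        restored = pickle.loads(pickle.dumps(artifact, protocol=protocol))
        assert restored == artifact
        assert restored is not artifact
    return True


ok = assert_pickle_roundtrip(ToyCppArtifact(pathlib.Path("build/prog"), "prog", "run"))
```

The check deliberately stops before `load()`: it only verifies that the artifact itself carries no unpicklable state.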

Reviewed changes

Copilot reviewed 15 out of 15 changed files in this pull request and generated 4 comments.

| File | Description |
| --- | --- |
| tests/next_tests/unit_tests/program_processor_tests/runners_tests/test_gtfn.py | Updates assertions to reflect compilation.device_type living on the compilation step. |
| tests/next_tests/unit_tests/program_processor_tests/runners_tests/dace_tests/test_dace_compilation.py | Adds pickle round-trip contract test for DaCe artifact (ensuring live handles don’t serialize). |
| tests/next_tests/unit_tests/otf_tests/test_compiled_program.py | Adapts a hijacking test to return a dummy artifact with load() instead of a raw callable. |
| tests/next_tests/unit_tests/otf_tests/compilation_tests/test_compiler.py | Adds pickle round-trip contract test for CPP artifact. |
| tests/next_tests/integration_tests/feature_tests/otf_tests/test_nanobind_build.py | Updates integration flow to call .load() on the returned compilation artifact. |
| src/gt4py/next/program_processors/runners/roundtrip.py | Wraps roundtrip’s in-memory callable in a RoundtripArtifact implementing load(). |
| src/gt4py/next/program_processors/runners/gtfn.py | Introduces GTFNCompilationArtifact/GTFNCompiler to stamp device type and apply argument conversion in load(). |
| src/gt4py/next/program_processors/runners/dace/workflow/factory.py | Removes separate decoration stage wiring (now handled by artifact load()). |
| src/gt4py/next/program_processors/runners/dace/workflow/decoration.py | Adjusts imports/types to avoid cycles after moving decoration into artifact load(). |
| src/gt4py/next/program_processors/runners/dace/workflow/compilation.py | Adds DaCeCompilationArtifact and makes DaCe compilation return an on-disk artifact with load(). |
| src/gt4py/next/otf/stages.py | Adds CompilationArtifact protocol. |
| src/gt4py/next/otf/recipes.py | Updates OTFCompileWorkflow typing to end at CompilationArtifact (removes decoration step). |
| src/gt4py/next/otf/definitions.py | Updates CompilationStep protocol to return CompilationArtifact. |
| src/gt4py/next/otf/compilation/compiler.py | Refactors CPP compilation into CPPCompiler + CPPCompilationArtifact(load). |
| src/gt4py/next/backend.py | Changes backend executor to return artifacts; compile() now calls artifact.load(). |


Comment thread src/gt4py/next/otf/stages.py
Comment on lines +163 to +164
with dace.config.set_temporary("compiler", "use_cache", value=True):
    sdfg_program = sdfg.compile(validate=False)
Contributor Author


No, this should be fine: we only reach load() after going through a build step, so load() is read-only.

Comment thread tests/next_tests/unit_tests/otf_tests/compilation_tests/test_compiler.py Outdated

Copilot AI left a comment


Pull request overview

This PR refactors the “next” OTF compilation workflow so that compilation steps produce a picklable CompilationArtifact with a load() method, enabling future compilation in separate processes while keeping load/materialization process-local.

Changes:

  • Introduces stages.CompilationArtifact and updates CompilationStep, OTFCompileWorkflow, and Backend.compile() to compile → return artifact → load() into an ExecutableProgram.
  • Adds/updates concrete artifacts and compilers (CPP/GTFN/DaCe/Roundtrip) so backend-specific loading/wrapping happens in artifact.load() rather than a separate decoration stage.
  • Updates tests and adds minimal “pickle round-trip” contract tests for the new artifact types.
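The motivation stated in the overview, compiling in separate processes while keeping load() process-local, could be sketched as follows. This is an assumption about how such a workflow might be wired with the standard library's `concurrent.futures`, not code from this PR; `ToyArtifact`, `compile_in_worker`, and `compile_all` are hypothetical names.

```python
import concurrent.futures
import dataclasses


@dataclasses.dataclass(frozen=True)
class ToyArtifact:
    exponent: int

    def load(self):
        # Materialize the process-local callable from plain data.
        exponent = self.exponent
        return lambda x: x**exponent


def compile_in_worker(exponent: int) -> ToyArtifact:
    # Runs in a worker process; the returned artifact is pickled back to the caller.
    return ToyArtifact(exponent=exponent)


def compile_all(exponents):
    with concurrent.futures.ProcessPoolExecutor(max_workers=2) as pool:
        artifacts = list(pool.map(compile_in_worker, exponents))
    # load() runs here, in the calling process, after the workers are done.
    return [artifact.load() for artifact in artifacts]


if __name__ == "__main__":
    square, cube = compile_all([2, 3])
    print(square(5), cube(2))
```

Only the artifacts cross the process boundary in this sketch; the lambdas returned by `load()` would not be picklable, which is one reason to split the two steps.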

Reviewed changes

Copilot reviewed 15 out of 15 changed files in this pull request and generated 3 comments.

| File | Description |
| --- | --- |
| tests/next_tests/unit_tests/program_processor_tests/runners_tests/test_gtfn.py | Updates assertions to reflect device type living on the compilation step/artifact. |
| tests/next_tests/unit_tests/program_processor_tests/runners_tests/dace_tests/test_dace_compilation.py | Adds pickle round-trip contract test for DaCeCompilationArtifact (ensures live handle is dropped). |
| tests/next_tests/unit_tests/otf_tests/test_compiled_program.py | Adjusts test to use a dummy artifact (load() returns a no-op callable). |
| tests/next_tests/unit_tests/otf_tests/compilation_tests/test_compiler.py | Adds pickle round-trip contract test for CPPCompilationArtifact. |
| tests/next_tests/integration_tests/feature_tests/otf_tests/test_nanobind_build.py | Updates integration tests to build → artifact → .load() and passes device_type. |
| src/gt4py/next/program_processors/runners/roundtrip.py | Splits roundtrip into source generation + module load, introduces RoundtripArtifact.load(). |
| src/gt4py/next/program_processors/runners/gtfn.py | Introduces GTFNCompilationArtifact/GTFNCompiler so wrapping happens on load(). |
| src/gt4py/next/program_processors/runners/dace/workflow/factory.py | Removes decoration stage wiring; compilation now returns an artifact. |
| src/gt4py/next/program_processors/runners/dace/workflow/decoration.py | Adjusts imports/types to avoid cycles after moving decoration to artifact load. |
| src/gt4py/next/program_processors/runners/dace/workflow/compilation.py | Introduces DaCeCompilationArtifact and updates compiler to return it (with optional process-local live-program cache). |
| src/gt4py/next/otf/stages.py | Adds CompilationArtifact protocol. |
| src/gt4py/next/otf/recipes.py | Updates OTFCompileWorkflow typing to end at CompilationArtifact (removes decoration stage). |
| src/gt4py/next/otf/definitions.py | Updates CompilationStep protocol to return CompilationArtifact. |
| src/gt4py/next/otf/compilation/compiler.py | Refactors CPP compiler to return CPPCompilationArtifact and adds load(). |
| src/gt4py/next/backend.py | Updates Backend.compile() to call artifact.load() before returning an executable. |


Comment on lines +136 to +139
Protocol. Implementations are frozen dataclasses, picklable, and have no
live process-bound state — that is reconstructed by :meth:`load`,
which returns a directly-callable :class:`ExecutableProgram` taking
gt4py-shaped arguments.
Comment on lines +158 to +164
def _load_compiled_program(self) -> CompiledDaceProgram:
    sdfg = dace.SDFG.from_file(str(self.sdfg_dump))
    sdfg.build_folder = str(self.build_folder)

    with gtx_wfdcommon.dace_context(device_type=self.device_type):
        with dace.config.set_temporary("compiler", "use_cache", value=True):
            sdfg_program = sdfg.compile(validate=False)
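The review summary mentions an optional process-local live-program cache that must not survive pickling. One way this could be done, shown here purely as an assumption and without any DaCe calls, is to drop the handle in `__getstate__`. `LiveHandle` and `ToyDaceArtifact` are hypothetical names, and the actual DaCeCompilationArtifact may work differently.

```python
import dataclasses
import pickle
from typing import Any, Optional


class LiveHandle:
    """Stands in for a loaded program; deliberately refuses to be pickled."""

    def __reduce__(self):
        raise TypeError("live handles must not be pickled")

    def __call__(self, x: int) -> int:
        return x + 1


@dataclasses.dataclass(frozen=True)
class ToyDaceArtifact:
    build_folder: str
    # Process-local cache; excluded from equality and dropped when pickling.
    _live: Optional[LiveHandle] = dataclasses.field(default=None, compare=False, repr=False)

    def __getstate__(self) -> dict[str, Any]:
        state = dict(self.__dict__)
        state["_live"] = None  # never serialize the live handle
        return state

    def load(self) -> LiveHandle:
        if self._live is None:
            # Frozen dataclass: bypass the generated __setattr__ to fill the cache.
            object.__setattr__(self, "_live", LiveHandle())
        return self._live


artifact = ToyDaceArtifact("/tmp/build")
handle = artifact.load()                         # populates the cache
restored = pickle.loads(pickle.dumps(artifact))  # cache is dropped here
```

After the round trip, `restored` compares equal to `artifact` but has to call `load()` again to obtain a handle of its own.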
Comment on lines +218 to +222
source_code: str
entry_point_name: str
column_axis: common.Dimension | None
dispatch_backend: next_backend.Backend | None
debug: bool
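The summary table describes roundtrip as split into source generation plus module load. A much simplified sketch of an artifact that stores generated source and materializes a callable in `load()` is given below. `ToyRoundtripArtifact` is hypothetical: it keeps only `source_code`, `entry_point_name`, and `debug` from the fields quoted above and omits `column_axis` and `dispatch_backend`, and using `exec` is an illustrative choice rather than the PR's mechanism.

```python
import dataclasses
import pickle
from typing import Callable


@dataclasses.dataclass(frozen=True)
class ToyRoundtripArtifact:
    source_code: str
    entry_point_name: str
    debug: bool = False

    def load(self) -> Callable[..., object]:
        namespace: dict[str, object] = {}
        # Compile and execute the generated source in a fresh namespace.
        code = compile(self.source_code, f"<{self.entry_point_name}>", "exec")
        exec(code, namespace)
        return namespace[self.entry_point_name]


SOURCE = "def add_one(values):\n    return [v + 1 for v in values]\n"
artifact = ToyRoundtripArtifact(source_code=SOURCE, entry_point_name="add_one")

# The artifact is plain strings and a bool, so it pickles; the function is created on load().
program = pickle.loads(pickle.dumps(artifact)).load()
```

Because the artifact holds only source text, the function object exists solely in the process that calls `load()`.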