Document JIT compiling and PyTorch handling behavior

stotko · stotko · commit d13a095db72e · 2025-04-17T14:55:27.000+02:00
diff --git a/docs/conf.py b/docs/conf.py
@@ -79,3 +79,5 @@
 # html_css_files = []
 
 html_copy_source = False
+
+# suppress_warnings = ["myst.domains"]
diff --git a/docs/index.md b/docs/index.md
@@ -11,6 +11,8 @@
 :hidden:
 
 Overview <self>
+src/jit_compiling
+src/pytorch_handling
 ```
 
 ```{toctree}
diff --git a/docs/src/jit_compiling.md b/docs/src/jit_compiling.md
@@ -0,0 +1,42 @@
+# JIT Compiling
+
+CharonLoad executes the following steps to JIT compile the C++/CUDA extension:
+
+## 1. (Optional) Clean
+
+Removes any existing old files in the [build directory](#ResolvedConfig.full_build_directory). This optional step can be controlled by setting the [``clean_build``](#ResolvedConfig.clean_build) flag of <project:#ResolvedConfig>.
+
+In addition, cleaning is performed **automatically** if at least one of these conditions is fulfilled:
+- CMake Configure step *failed*.
+- CharonLoad version used in previous run is *incompatible* with current version.  
+  (Versions are only compatible if they share the same minor version, e.g. 0.3.X is considered incompatible with 0.4.Y.)
+- PyTorch version has *changed* since the previous run.
+
+
+## 2. Initialize
+
+Prepares internal state and places a ``.gitignore`` file into the [build directory](#ResolvedConfig.full_build_directory) so that build artifacts do not clutter version control.
+
+
+## 3. CMake Configure
+
+Runs CMake on the specified [project directory](#ResolvedConfig.full_project_directory). Additional arguments to the command can be passed via [``cmake_options``](#ResolvedConfig.cmake_options) of <project:#ResolvedConfig>.
+
+Skipped automatically if the CMake configuration has not changed since the previous run.
+
+
+## 4. Build
+
+Performs parallel compilation of the configured project. If project source files have not changed between runs, the underlying native build tool usually skips unnecessary compilations on its own.
+
+
+## 5. (Optional) Stub Generation
+
+Analyzes the compiled C++/CUDA extension and generates Python stub files. This is useful to enable syntax highlighting and auto-completion of the bindings in VS Code. This optional step can be enabled by specifying the [stubs directory](#ResolvedConfig.full_stubs_directory) of <project:#ResolvedConfig>.
+
+Skipped automatically if the compiled extension file has not changed from a previous run, i.e. the build step did not alter the file.
+
+
+## 6. Import Path
+
+Extends Python's module search paths with the location of the compiled extension to enable ``import`` calls to it. On Windows, the DLL search paths are also extended by the list of shared/dynamic libraries to which the extension links.
diff --git a/docs/src/pytorch_handling.md b/docs/src/pytorch_handling.md
@@ -0,0 +1,52 @@
+# PyTorch Handling
+
+In addition to JIT Compiling, CharonLoad also provides native support for finding and linking against the PyTorch C++ library which is installed as part of the Python package ``torch``. To this end, the CMake function <project:#charonload_add_torch_library> is provided as a thin wrapper around [add_library()](<inv:cmake.org#command/add_library>).
+
+However, the PyTorch C++ API comes with several non-trivial usage requirements that linking libraries must fulfill. Not meeting these requirements will lead to potentially obscure and hard-to-debug compiler and linker errors. Thus, CharonLoad automatically detects these usage requirements and adds them to the CMake target created by <project:#charonload_add_torch_library>.
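+
+As a rough illustration, a project's `CMakeLists.txt` might use the wrapper as sketched below. Only <project:#charonload_add_torch_library> and its `add_library()`-like form come from this documentation; the `find_package(charonload)` call, the target name `my_extension`, and the source file `bindings.cpp` are assumptions made for the example.
+
+```cmake
+cmake_minimum_required(VERSION 3.18)
+project(my_extension LANGUAGES CXX)
+
+# Assumption: CharonLoad's CMake package is discoverable under this name.
+find_package(charonload REQUIRED)
+
+# Same form as add_library(<name> MODULE), but the created target also
+# carries the PyTorch usage requirements detected by CharonLoad.
+charonload_add_torch_library(my_extension MODULE)
+
+# Hypothetical source file containing the Python bindings.
+target_sources(my_extension PRIVATE bindings.cpp)
+```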
+
+In particular, the following PyTorch properties are handled:
+
+
+:::::{grid} 2 2 3 3
+:gutter: 3 3 4 4
+
+::::{grid-item-card}
+:link: pytorch_handling/cpp_standard
+:link-type: doc
+
+**C++ Standard**
+^^^
+Set minimum C++ standard for using PyTorch.
+
+::::
+
+::::{grid-item-card}
+:link: pytorch_handling/cpp11_abi
+:link-type: doc
+
+**C++11 ABI**
+^^^
+Set correct C++11 ABI for linking.
+
+::::
+
+::::{grid-item-card}
+:link: pytorch_handling/position_independent_code
+:link-type: doc
+
+**Position-Independent Code (PIC)**
+^^^
+Set required PIC flag for linking.
+
+::::
+
+:::::
+
+
+```{toctree}
+:hidden:
+
+pytorch_handling/cpp_standard
+pytorch_handling/cpp11_abi
+pytorch_handling/position_independent_code
+```
diff --git a/docs/src/pytorch_handling/cpp11_abi.md b/docs/src/pytorch_handling/cpp11_abi.md
@@ -0,0 +1,20 @@
+# C++11 ABI
+
+:::{admonition} Hint
+:class: hint
+
+This behavior is exclusive to Linux and the GCC compiler.
+:::
+
+On Linux, the default C++ compiler is typically GCC. With the release of GCC 5 in 2015, its standard library introduced several [breaking ABI (Application Binary Interface) changes](https://gcc.gnu.org/wiki/Cxx11AbiCompatibility) between C++11 (and later standards) and C++98. While this may look irrelevant these days, PyTorch is built with broad compatibility in mind, such that even recent versions support Python's `manylinux2014` platform ([PEP 599](https://peps.python.org/pep-0599/)), which requires compatibility with GCC 4.8.
+
+This, in turn, means that the PyTorch C++ library must adhere to and compile against the **old CXX11 ABI**. Consequently, projects linking against PyTorch also need to specify the respective compile definition `_GLIBCXX_USE_CXX11_ABI=0`. This even extends to **all** other 3rd-party dependencies and often requires manual adjustments to their CMake scripts. Not meeting these requirements will lead to strange linker errors containing the mangled names of the incompatible C++ standard library classes.
+
+CharonLoad automatically detects the CXX11 ABI flag used in PyTorch, transitively scans for all dependencies, and patches each of the respective CMake targets to make use of the ABI flag.
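+
+Done by hand, this patching amounts to roughly the following. This is only a sketch of the idea, not CharonLoad's actual implementation: the targets `some_dependency` and `my_extension` and their source files are hypothetical, and the value `0` assumes a PyTorch build that uses the old ABI.
+
+```cmake
+cmake_minimum_required(VERSION 3.18)
+project(abi_example LANGUAGES CXX)
+
+# Hypothetical 3rd-party static library and the extension module using it.
+add_library(some_dependency STATIC dependency.cpp)
+add_library(my_extension MODULE bindings.cpp)
+target_link_libraries(my_extension PRIVATE some_dependency)
+
+# Assumption: PyTorch was built with the old ABI. Every target in the link
+# graph must then be compiled with the same macro value.
+foreach(tgt IN ITEMS some_dependency my_extension)
+    target_compile_definitions(${tgt} PRIVATE _GLIBCXX_USE_CXX11_ABI=0)
+endforeach()
+```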
+
+
+:::{admonition} Note
+:class: note
+
+Future versions of PyTorch, starting with [PyTorch 2.7](https://github.com/pytorch/pytorch/issues/149044), will switch to the **new CXX11 ABI**. Thus, compilation will work without CharonLoad's target patching.
+:::
diff --git a/docs/src/pytorch_handling/cpp_standard.md b/docs/src/pytorch_handling/cpp_standard.md
@@ -0,0 +1,15 @@
+# C++ Standard
+
+Like many projects, the PyTorch C++ library regularly raises its minimum required C++ standard in order to benefit from the respective language and standard library improvements. In the past, the C++ standard has been bumped for the following releases:
+
+- C++11 &rarr; C++14: PyTorch 1.5 (see [release announcement](https://github.com/pytorch/pytorch/releases/tag/v1.4.0))
+- C++14 &rarr; C++17: PyTorch 2.1 (see [release announcement](https://github.com/pytorch/pytorch/releases/tag/v2.1.0))
+
+Future versions may further raise the required standard to, e.g., C++20.
+
+However, the PyTorch C++ library does not publicly expose this requirement to projects linking against it, which may lead to compiler errors in the following cases:
+
+- The project explicitly requests a lower standard, e.g. `set(CMAKE_CXX_STANDARD 11)` for C++11
+- The project relies on the default standard selected by the compiler, e.g. C++14 for MSVC
+
+CharonLoad automatically detects the minimum required C++ standard in order to use the PyTorch C++ library and sets the corresponding [CMake Compile Feature](https://cmake.org/cmake/help/latest/manual/cmake-compile-features.7.html#requiring-language-standards).
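+
+Set manually, such a compile feature looks roughly like the sketch below. The target `my_extension` and its source file are hypothetical, and C++17 is assumed to be the minimum required by the installed PyTorch version.
+
+```cmake
+cmake_minimum_required(VERSION 3.18)
+project(std_example LANGUAGES CXX)
+
+# Hypothetical extension module.
+add_library(my_extension MODULE bindings.cpp)
+
+# Assumption: the installed PyTorch requires at least C++17.
+# The compile feature states a lower bound, so CMake invokes the compiler
+# in at least C++17 mode for this target.
+target_compile_features(my_extension PUBLIC cxx_std_17)
+```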
diff --git a/docs/src/pytorch_handling/position_independent_code.md b/docs/src/pytorch_handling/position_independent_code.md
@@ -0,0 +1,7 @@
+# Position-Independent Code (PIC)
+
+Since extension modules are dynamically loaded by the Python interpreter, their code is mapped into memory at runtime and needs to work from any address, i.e. it cannot use fixed addresses. While this is automatically ensured during compilation for extension modules (and shared libraries), it is not required for static libraries, which are usually linked directly into an executable at build time and can, in turn, use fixed addresses.
+
+Similar to many C++ projects, a C++/CUDA extension may rely on external 3rd-party dependencies that are linked as **static libraries**. However, if such a dependency was compiled without PIC, this may lead to linker errors because its code cannot be relocated to an arbitrary address.
+
+CharonLoad automatically scans for all transitive *static* dependencies and patches each of the respective CMake targets to enable [``POSITION_INDEPENDENT_CODE``](<inv:cmake.org#prop_tgt/POSITION_INDEPENDENT_CODE>).
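+
+For a single dependency, the manual equivalent is roughly the following. This is a sketch rather than CharonLoad's implementation; the targets `some_dependency` and `my_extension` and their source files are hypothetical.
+
+```cmake
+cmake_minimum_required(VERSION 3.18)
+project(pic_example LANGUAGES CXX)
+
+# Hypothetical static 3rd-party library and the extension module using it.
+add_library(some_dependency STATIC dependency.cpp)
+add_library(my_extension MODULE bindings.cpp)
+target_link_libraries(my_extension PRIVATE some_dependency)
+
+# Static libraries are not compiled as PIC by default, so enable it
+# explicitly to allow linking into the dynamically loaded module.
+set_target_properties(some_dependency PROPERTIES POSITION_INDEPENDENT_CODE ON)
+```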