papers/matthew_feickert/acknowledgements.md

Matthew Feickert is supported by the U.S. National Science Foundation (NSF) under Cooperative Agreement PHY-2323298 (IRIS-HEP) and by the US Research Software Sustainability Institute (URSSI) via grant G-2022-19347 from the Sloan Foundation.
Ruben Arts is supported by prefix.dev GmbH.
John Kirkham is supported by NVIDIA.
The described work [@reproducible_machine_learning_scipy_2025_tutorial] was created in association with [@Feickert_Reproducible_Machine_Learning].
Often researchers are running scientific and machine learning workflows on remote computational resources that use batch computing systems (e.g. HTCondor, SLURM).
Some of these systems may not have shared filesystems, requiring that each worker node receive its own copy of the software environment.
While locked Pixi environments significantly help with this, it is often advantageous to distribute the environment to the compute resources in the form of a Linux container image.
These systems are able to mount Linux container images to worker nodes in ways that reduce the disk and memory cost to the user's session, compared to installing Pixi and then downloading all dependencies of the software environment from the package indexes used.
This also reduces bandwidth use, as the Linux container image can be cached at the compute resource host and efficiently replicated to the worker nodes, paying the download cost only once.
While Linux container technology has historically presented additional engineering and design overhead to researchers, Linux container construction of Pixi environments is simple and can be reduced to a templated format.
An example in the form of a templated Dockerfile is seen in @example-pixi-dockerfile.[^docker_footnote]
The template requires user input to define the target CUDA version (`CUDA_VERSION`) and the name of the Pixi environment to install (`ENVIRONMENT`).
As the Pixi environment is already fully defined and locked, it can be installed as normal in the `build` stage of the container image build, along with an entrypoint shell script that activates the environment.
The installed environment is then copied from the `build` stage into the `final` stage, reducing the total image size by removing the cache and reducing the total number of layers in the final image.

```{literalinclude} code/ml-example/Dockerfile
:label: example-pixi-dockerfile
:caption: The template structure of a Dockerfile for a locked Pixi environment with CUDA dependencies. The only values that need user input are the CUDA version and the name of the target environment.
```
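
Since the included Dockerfile lives in an external file, a minimal sketch of such a multi-stage build may help illustrate the pattern. The base images, paths, and the `pixi shell-hook` entrypoint approach below are assumptions based on common Pixi container patterns, not the paper's exact file:

```{code} docker
:filename: Dockerfile

ARG CUDA_VERSION=12.6.0
ARG ENVIRONMENT=gpu

# Build stage: install the locked Pixi environment
FROM ghcr.io/prefix-dev/pixi:latest AS build
ARG ENVIRONMENT
COPY pixi.toml pixi.lock /app/
WORKDIR /app
RUN pixi install --locked --environment "${ENVIRONMENT}"
# Generate an entrypoint script that activates the environment
RUN pixi shell-hook --environment "${ENVIRONMENT}" > /shell-hook.sh \
    && echo 'exec "$@"' >> /shell-hook.sh

# Final stage: copy only the installed environment to reduce image size
FROM nvidia/cuda:${CUDA_VERSION}-base-ubuntu22.04 AS final
ARG ENVIRONMENT
COPY --from=build /app/.pixi/envs/${ENVIRONMENT} /app/.pixi/envs/${ENVIRONMENT}
COPY --from=build /shell-hook.sh /shell-hook.sh
WORKDIR /app
ENTRYPOINT ["/bin/bash", "/shell-hook.sh"]
```

The two-stage split is the key design choice: the Pixi binary and package caches stay in the `build` stage, so the `final` image carries only the installed environment and its activation script.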

The Dockerfile can then be built into a Linux container image binary file, which can be distributed to a container image registry.
Batch computing system workflow definition files can use these container images to provide the software environment for the computing jobs, pulling the images from the container image registry when requested by the job.

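The build and distribution steps can be sketched with standard Docker commands; the registry and image names here are illustrative assumptions:

```{code} console
% docker build \
    --build-arg CUDA_VERSION=12.6.0 \
    --build-arg ENVIRONMENT=gpu \
    --tag ghcr.io/example-org/ml-example:latest \
    .
% docker push ghcr.io/example-org/ml-example:latest
```
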
[^docker_footnote]: As many compute facilities do not allow direct use of Docker due to security concerns, Apptainer container image formats are more common.
    Apptainer definition files are similarly easy to write as Dockerfiles, and Docker container images can be converted into a format that Apptainer can use.
    As Docker is a more common format in the broader computing world, including commercial settings, it has been used for this example.
    These workflows are not limited to a single container image format.
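
For compute facilities that use Apptainer, the conversion can be done directly from the registry; the image reference and script name below are illustrative assumptions:

```{code} console
% apptainer build ml-example.sif docker://ghcr.io/example-org/ml-example:latest
% apptainer exec --nv ml-example.sif python train.py
```
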
papers/matthew_feickert/main.md

title: Reproducible Machine Learning Workflows for Scientists with Pixi
abstract: |
  Scientific researchers need reproducible software environments for complex applications that can run across heterogeneous computing platforms.
  Modern open source tools, like Pixi, provide automatic reproducibility solutions for all dependencies while providing a high-level interface well suited for researchers.
  Combined with the recent emergence of the entire CUDA software stack — from compilers to development libraries — being supported on conda-forge, researchers are now able to easily specify their exact hardware acceleration requirements and software dependencies and get portable computational environments locked down to the digest level.
papers/matthew_feickert/pixi.md

### CUDA hardware accelerated environment creation
Combining the features of modern CUDA `12` conda packages with Pixi's environment management, it is now possible to efficiently manage multiple software environments that can include both hardware accelerated and CPU environments.
An example Pixi workspace is presented in @pixi-ml-example-workspace

```{literalinclude} code/ml-example/pixi.toml
:linenos:
:label: pixi-ml-example-workspace
:caption: Example of a multi-platform and multi-environment Pixi manifest with all required information and constraints to resolve and install CUDA accelerated conda packages.
```
where the definition of multiple platforms allows for solving the declared environments for all platforms while on other platforms
the `cpu` feature defines `dependencies` and `tasks` that are accessible from the `cpu` environment
The `gpu` feature does the same for the `gpu` environment, but it also importantly defines a [`system-requirements` table](https://pixi.sh/v0.50.2/workspace/system_requirements/) that declares the system specifications needed to install and run a Pixi workspace's environments.
`system-requirements` builds upon the concept of conda "[virtual packages](https://docs.conda.io/projects/conda/en/latest/user-guide/tasks/manage-virtual.html)", allowing the dependency resolver to enforce constraints declared by defining compatibility of the system with virtual packages, like `__cuda`.
In the particular case of CUDA, the `system-requirements` table specifies the CUDA version the workspace expects the host system to support, as detected through the host system's NVIDIA driver API.
While the `system-requirements` field values do not correspond to lower or upper bounds, specifying that the workspace is expected to work on systems that support CUDA 12

```{code} toml
:filename: pixi.toml

...

[feature.gpu.system-requirements]
cuda = "12"

...
```

ensures that packages depending on `__cuda >= 12` are resolved correctly.
This effectively means that declaring the system requirement causes the Pixi dependency resolver to find CUDA enabled packages that are compatible with CUDA 12, preventing incompatible package builds from being resolved.
Once these package dependencies have been resolved and locked, any system that meets the system requirement will get working CUDA accelerated conda packages installed.
Not all machines have an NVIDIA GPU, which can prevent the system requirements from being resolved correctly.
To allow machines without CUDA support to still resolve Pixi workspace requirements, shell environment overrides exist through the `CONDA_OVERRIDE_CUDA` environment variable.
Setting `CONDA_OVERRIDE_CUDA=12` on a machine that doesn't meet the CUDA version requirements will override the supported virtual packages and set a value of `__cuda=12` for the system.
This can be clearly understood from setting the override and then querying the workspace summary with `pixi info`, as seen in @conda-override-cuda-example.
This is a powerful functionality as it allows for environment specification, resolution, and locking for target platforms that users might not have access to, but can be assured are valid.
```{code} console
:label: conda-override-cuda-example
:caption: Demonstration of using the `CONDA_OVERRIDE_CUDA` environment variable on a system with no CUDA support (an Apple silicon machine) to allow dependency resolution as if it supported CUDA 12.

% pixi info
System
------------
Pixi version: 0.50.2
Platform: osx-arm64
Virtual packages: __unix=0=0
                : __osx=15.3.2=0
                : __archspec=1=m2
...

% CONDA_OVERRIDE_CUDA=12 pixi info
System
------------
Pixi version: 0.50.2
Platform: osx-arm64
Virtual packages: __unix=0=0
                : __osx=15.3.2=0
                : __cuda=12=0
                : __archspec=1=m2
...
```
Pixi also allows for feature composition to efficiently create new environments.
@pixi-ml-example-workspace's `gpu` and `inference` features are combined and resolved collectively to provide a new CUDA accelerated `inference` environment that does not affect the `gpu` environment.
```{code} toml
:filename: pixi.toml

...

[feature.inference.dependencies]
matplotlib = ">=3.10.3,<4"

[environments]
...
gpu = ["gpu"]
inference = ["gpu", "inference"]
```
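
Once composed, the `inference` environment can be installed and used like any other Pixi environment; the script name here is an illustrative assumption:

```{code} console
% pixi install --environment inference
% pixi run --environment inference python make_plots.py
```
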
### Locked environments
Once the workspace has been defined, any Pixi operation on the workspace will result in all environments in the workspace having their dependencies resolved and then fully specified ("locked") at the digest ("hash") level in a single `pixi.lock` Pixi lock file, as seen in @example-pixi-lockfile.
The lock file is a YAML file that contains two definition groups: `environments` and `packages`.
The `environments` group lists every environment in the workspace for every platform with a complete listing of all packages in the environment.
The `packages` group lists a full definition of every package that appears in the `environments` lists, including the package's URL and digests (e.g. sha256, md5).
These groups provide a full description of every package described in the Pixi workspace and its dependencies and constraints on other packages.
Versioning the lock file along with the manifest file in a version control system allows for workspaces to be fully reproducible to the byte level indefinitely into the future, conditioned on the continued existence of the package indexes the workspace pulls from (e.g. conda-forge, PyPI, the nvidia conda channel).
In the event that long term preservation and reproducibility are of importance, there are community projects [@pixi-pack] that allow for downloading all dependencies of a Pixi environment and generating a tar archive containing all of the packages, which can later be unpacked and installed.
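
Digest-level locking means a downloaded artifact can always be checked against the lock file before use. A minimal sketch of that check follows; the package entry shown is made up for illustration, not taken from a real `pixi.lock`:

```python
import hashlib


def verify_artifact(data: bytes, expected_sha256: str) -> bool:
    """Return True if the bytes match the digest recorded in the lock file."""
    return hashlib.sha256(data).hexdigest() == expected_sha256


# Illustrative lock file entry (URL and digest are invented for the example)
artifact = b"pretend these are the bytes of a downloaded .conda package"
package = {
    "url": "https://conda.anaconda.org/conda-forge/noarch/example-package.conda",
    "sha256": hashlib.sha256(artifact).hexdigest(),
}

# An unmodified download matches; any tampering or corruption does not
assert verify_artifact(artifact, package["sha256"])
assert not verify_artifact(b"tampered bytes", package["sha256"])
```
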
```{literalinclude} example.lock
:filename: pixi.lock
:label: example-pixi-lockfile
:caption: Example structure of a `pixi.lock` Pixi lock file showing the definition of the environments as well as a full description of each package used in each environment.
```

As hardware accelerated code becomes more common across scientific computing, especially CUDA accelerated software for machine learning, the need for simple but powerful solutions for software environment management has grown too.
The simple and flexible structure of conda packages allows for complex projects to be packaged as directory trees of built binaries on a platform specific level.
This has allowed for the complexity of the CUDA software stack to be efficiently built as conda packages using the conda-forge cyberinfrastructure and then distributed on the conda-forge conda channel for public use.
Distribution of CUDA conda packages on conda-forge additionally allows for other conda-forge projects to use the CUDA conda packages in their builds, resulting in a wide selection of CUDA enabled projects, including many machine learning packages.
Through use of Pixi's declarative specification of dependencies in the project manifest and non-optional digest level lock file generation, software environments can now be declaratively and rapidly constructed, resolved, and locked using semantic operations well designed for scientific researchers.
With these powerful technologies and abstractions, researchers can now construct machine learning and data science environments for multiple platforms at once and use trusted patterns to develop locally and deploy to remote computational resources.

In addition to the long term reproducibility provided by the combination of these technologies, the maintenance burden and complexity reduction should not be overlooked.
With the CUDA v12 distributions on conda-forge, researchers no longer need to have experience in CUDA internals and distribution installation to accelerate their software projects.
They need only know the CUDA versions supported by the NVIDIA drivers on their target machines.
Researchers also no longer need to use multiple tools to build bespoke workflows for constructing and maintaining lock files for multiple environments and platforms, while keeping environment definition files and lock files synced.
Pixi provides a single tool and unified interface to achieve the same results faster while using high level abstractions — removing most of the work of software environment reproducibility from the user workflow.
Having the full specification of the software environment, including the CUDA dependencies, also removes runtime failures due to missing, unspecified, or incompatible system-level requirements on remote compute resources.
Most importantly, reducing cognitive overhead and the latency to reach a usable software environment reduces the time to insight for researchers, transferring the problems of scientific computing back into their domains of expertise.