Add prgenv-gnu-openmpi with OpenMPI #263
Draft: msimberg wants to merge 48 commits into eth-cscs:main from msimberg:prgenv-gnu-ompi
Changes from 36 commits
48 commits
a8ccde1 Add test recipe for next version of prgenv-gnu with upsteam libfabric… (msimberg)
466a169 Add prgenv-gnu/next to config (msimberg)
0ea7a0e No zen for prgenv-gnu/next (msimberg)
ab2e9bb Use special branch of alps-cluster-config (msimberg)
1ae291c Disable xpmem kernel-module (msimberg)
f2b9f01 Use GCC 14.2 again (msimberg)
e80770d Use libfabric 2.3 (msimberg)
7cfa922 Add +gdrcopy variant to libfabric in prgenv-gnu/next (msimberg)
7ae62d1 Don't exclude gcc module (msimberg)
04027f8 Don't use custom alps-cluster-config (msimberg)
e42a796 Move prgenv-gnu/next to prgenv-gnu/25.11 (msimberg)
fb5a928 Remove custom xpmem from prgenv-gnu (msimberg)
25124f1 Add prgenv-gnu/25.11 mc recipe (msimberg)
c649e60 Try gcc 14.3 again for prgenv-gnu/25.11 (msimberg)
46675fb Add openmpi view to prgenv-gnu/25.11 (msimberg)
c769617 Use gcc 14.2 again in prgenv-gnu/25.11 (msimberg)
ee2d15e Merge branch 'prgenv-gnu-next' into prgenv-gnu-ompi (msimberg)
d1b83c6 Move openmpi environment to separate uenv (msimberg)
f84670a Fix config.yaml (msimberg)
d9ea85b No openmpi on eiger, for now... (msimberg)
e3647d7 Rename openmpi view (msimberg)
d055ea7 Add gmp to prgenv-gnu-openmpi/25.11 (msimberg)
5d76a78 Add openmpi feature to prgenv-gnu-openmpi reframe metadata (msimberg)
4610066 Add nccl reframe feature to prgenv-gnu-openmpi/25.11 (msimberg)
8fb2658 Add netcdf-cxx4 to to prgenv-gnu-openmpi/25.11 (msimberg)
780be85 Update libfabric spec (msimberg)
cea6437 Add patch for GPU-GPU communication with lnx in libfabric (msimberg)
d8c2054 Merge remote-tracking branch 'origin/main' into prgenv-gnu-ompi (msimberg)
015c11e Try libfabric with system cray-xpmem (msimberg)
fa4d7ed Upgrade spack-packages for openmpi 5.0.9 (msimberg)
aaba38a Disable libfabric patch temporarily (msimberg)
1fc7a38 Pin cuda to 12 (msimberg)
f530498 Use [email protected] (msimberg)
d08b5ee Update recipes/prgenv-gnu-openmpi/25.11/gh200/config.yaml (msimberg)
d151291 Update recipes/prgenv-gnu-openmpi/25.11/gh200/config.yaml (msimberg)
b3bba95 Merge remote-tracking branch 'origin/main' into prgenv-gnu-ompi (msimberg)
0f39f0a Remove extra prgenv-gnu config entry (msimberg)
5f720c2 Merge remote-tracking branch 'origin/main' into prgenv-gnu-ompi (msimberg)
637aaa9 Remove custom repo for openmpi recipe (msimberg)
ca98862 Fix name of openmpi uenv (msimberg)
9639f29 Use simpler network spec for openmpi recipe (msimberg)
5a8fcb2 Add mc recipe for openmpi (msimberg)
c946017 Bump spack-packages for openmpi recipe (msimberg)
f9b9466 Try lifting compiler restriction in gh200 recipe (msimberg)
6535c10 Move openmpi recipe to 25.12 (msimberg)
3f5cfcd Enable openmpi recipe for eiger (msimberg)
e3ad543 Remove repo link in openmpi recipe (msimberg)
f847ed6 Fix config paths (msimberg)
| Original file line number | Diff line number | Diff line change |
|---|---|---|
| @@ -0,0 +1,2 @@ | ||
| gcc: | ||
| version: "14.2" |
| Original file line number | Diff line number | Diff line change |
|---|---|---|
| @@ -0,0 +1,10 @@ | ||
| name: prgenv-gnu | ||
| spack: | ||
| repo: https://github.com/spack/spack.git | ||
| commit: releases/v1.1 | ||
| packages: | ||
| repo: https://github.com/spack/spack-packages.git | ||
| commit: a896fdbe5d01981cbc6f9b5139a5d551ac2fe248 # develop on 2025-11-21 | ||
| store: /user-environment | ||
| description: GNU Compiler toolchain with OpenMPI, Python, CMake and other development tools. | ||
| version: 2 |
| Original file line number | Diff line number | Diff line change |
|---|---|---|
| @@ -0,0 +1,51 @@ | ||
| gcc-env: | ||
| compiler: [gcc] | ||
| network: | ||
| mpi: [email protected] | ||
| specs: | ||
| - [email protected] +gdrcopy fabrics=lnx,shm,cxi,xpmem # TODO: Add to/update alps-cluster-config | ||
| unify: true | ||
| specs: | ||
| - boost +chrono +filesystem +iostreams +mpi +python +regex +serialization +shared +system +timer | ||
| - cmake | ||
| - fftw | ||
| - fmt | ||
| - gmp | ||
| - gsl | ||
| - hdf5+cxx+hl+fortran | ||
| - kokkos +aggressive_vectorization ~alloc_async cuda_arch=90 +cuda_constexpr +cuda_lambda ~cuda_relocatable_device_code ~cuda_uvm cxxstd=17 +openmp +pic +serial +shared +tuning +wrapper | ||
| - kokkos-kernels +blas +cublas +cusparse +cusolver +execspace_cuda +execspace_openmp +execspace_serial +lapack +memspace_cudaspace +openmp scalars=float,double,complex_float,complex_double +serial +shared +superlu | ||
| - kokkos-tools +mpi +papi | ||
| - netlib-scalapack | ||
| - lua | ||
| - libtree | ||
| - lz4 | ||
| - meson | ||
| - netcdf-c | ||
| - netcdf-cxx | ||
| - netcdf-cxx4 | ||
| - netcdf-fortran | ||
| - ninja | ||
| - openblas threads=openmp | ||
| - osu-micro-benchmarks | ||
| - papi | ||
| - python | ||
| - zlib-ng | ||
| # add GPU-specific packages here, for easier comparison with mc version | ||
| - nccl | ||
| - nccl-tests | ||
| - cuda@12 | ||
| - xcb-util-cursor | ||
| - aws-ofi-nccl | ||
| - superlu | ||
| variants: | ||
| - +mpi | ||
| - +cuda | ||
| - cuda_arch=90a | ||
| views: | ||
| default: | ||
| link: roots | ||
| uenv: | ||
| add_compilers: true | ||
| prefix_paths: | ||
| LD_LIBRARY_PATH: [lib, lib64] | ||
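For readers unfamiliar with the `prefix_paths` entry in the `default` view, here is a minimal sketch of how a mapping like `LD_LIBRARY_PATH: [lib, lib64]` could expand into an environment variable value. The view root `/user-environment/env/default` and the helper `expand_prefix_paths` are assumptions made for illustration; this is not the recipe tooling's actual implementation.

```python
import posixpath


def expand_prefix_paths(view_root, prefix_paths):
    """Map each variable to a colon-separated list of view subdirectories."""
    return {
        var: ":".join(posixpath.join(view_root, sub) for sub in subdirs)
        for var, subdirs in prefix_paths.items()
    }


# Mirrors the `prefix_paths` entry of the `default` view in the recipe.
prefix_paths = {"LD_LIBRARY_PATH": ["lib", "lib64"]}

# Assumed mount point of the view; hypothetical, not stated in this PR.
VIEW_ROOT = "/user-environment/env/default"

env = expand_prefix_paths(VIEW_ROOT, prefix_paths)
print(env["LD_LIBRARY_PATH"])
```

Under these assumptions both `lib` and `lib64` of the view end up on `LD_LIBRARY_PATH`, in the order listed in the recipe.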
| Original file line number | Diff line number | Diff line change |
|---|---|---|
| @@ -0,0 +1,16 @@ | ||
| default: | ||
| features: | ||
| - cuda | ||
| - mpi | ||
| - openmpi | ||
| - nccl | ||
| - nccl-tests | ||
| - openmp | ||
| - osu-micro-benchmarks | ||
| - prgenv | ||
| - serial | ||
| cc: mpicc | ||
| cxx: mpic++ | ||
| ftn: mpifort | ||
| views: | ||
| - default |
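The feature list in this metadata is what a test suite can match against when deciding whether the environment is suitable. A rough sketch of that matching, assuming a test simply declares a set of required features: the `supports` helper and the `cray-mpich` feature name are hypothetical, and nothing here reflects ReFrame's actual API.

```python
# Mirrors the `default` entry of the ReFrame metadata in this PR.
metadata = {
    "default": {
        "features": [
            "cuda", "mpi", "openmpi", "nccl", "nccl-tests",
            "openmp", "osu-micro-benchmarks", "prgenv", "serial",
        ],
        "cc": "mpicc",
        "cxx": "mpic++",
        "ftn": "mpifort",
        "views": ["default"],
    }
}


def supports(environ, required):
    """Return True if the environment provides every required feature."""
    return set(required) <= set(metadata[environ]["features"])


print(supports("default", ["mpi", "cuda"]))   # both features are listed
print(supports("default", ["cray-mpich"]))    # hypothetical feature, not listed
```

A test that needs OpenMPI specifically could require `openmpi` in addition to `mpi`, which is presumably why the separate feature was added.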
| Original file line number | Diff line number | Diff line change |
|---|---|---|
| @@ -0,0 +1,23 @@ | ||
| modules: | ||
| # Paths to check when creating modules for all module sets | ||
| prefix_inspections: | ||
| bin: | ||
| - PATH | ||
| lib: | ||
| - LD_LIBRARY_PATH | ||
| lib64: | ||
| - LD_LIBRARY_PATH | ||
| | ||
| default: | ||
| arch_folder: false | ||
| # Where to install modules | ||
| roots: | ||
| tcl: /user-environment/modules | ||
| tcl: | ||
| all: | ||
| autoload: none | ||
| hash_length: 0 | ||
| exclude_implicits: true | ||
| exclude: [] | ||
| projections: | ||
| all: '{name}/{version}' |
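To make the effect of the `projections` entry concrete, here is a small sketch of how the `'{name}/{version}'` pattern could map a package to a module file path under the configured `tcl` root. It uses Python's `str.format` as a stand-in for Spack's own spec format syntax, and it assumes that `arch_folder: false` and `hash_length: 0` mean no architecture directory and no hash suffix in the path. The helper `module_file` is hypothetical.

```python
import posixpath

MODULE_ROOT = "/user-environment/modules"  # `roots: tcl:` in the config
PROJECTION = "{name}/{version}"            # `projections: all:` in the config


def module_file(name, version):
    """Render the projection and place it directly under the module root."""
    return posixpath.join(MODULE_ROOT, PROJECTION.format(name=name, version=version))


# gcc 14.2 is the compiler version pinned elsewhere in this PR.
print(module_file("gcc", "14.2"))
```

With these settings a user would load the module by the short `name/version` form rather than a hashed name.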
| Original file line number | Diff line number | Diff line change |
|---|---|---|
| @@ -0,0 +1 @@ | ||
| ../repo/ |
20 changes: 20 additions & 0 deletions
recipes/prgenv-gnu-openmpi/25.11/repo/packages/libfabric/issue-11231-cuda-sync.patch
| Original file line number | Diff line number | Diff line change |
|---|---|---|
| @@ -0,0 +1,20 @@ | ||
| diff --git a/prov/lnx/src/lnx_ops.c b/prov/lnx/src/lnx_ops.c | ||
| index ba3f097..2d6c187 100644 | ||
| --- a/prov/lnx/src/lnx_ops.c | ||
| +++ b/prov/lnx/src/lnx_ops.c | ||
| @@ -455,6 +455,7 @@ ssize_t lnx_trecv(struct fid_ep *ep, void *buf, size_t len, void *desc, | ||
| struct lnx_ep *lep; | ||
| const struct iovec iov = {.iov_base = buf, .iov_len = len}; | ||
| | ||
| + cuda_set_sync_memops(buf); | ||
| lep = container_of(ep, struct lnx_ep, le_ep.ep_fid.fid); | ||
| if (!lep) | ||
| return -FI_ENOSYS; | ||
| @@ -666,6 +667,7 @@ ssize_t lnx_tsenddata(struct fid_ep *ep, const void *buf, size_t len, void *desc | ||
| fi_addr_t core_addr; | ||
| void *core_desc = desc; | ||
| | ||
| + cuda_set_sync_memops(buf); | ||
| lep = container_of(ep, struct lnx_ep, le_ep.ep_fid.fid); | ||
| if (!lep) | ||
| return -FI_ENOSYS; |
19 changes: 19 additions & 0 deletions
recipes/prgenv-gnu-openmpi/25.11/repo/packages/libfabric/package.py
| Original file line number | Diff line number | Diff line change |
|---|---|---|
| @@ -0,0 +1,19 @@ | ||
| from spack_repo.builtin.packages.libfabric.package import Libfabric as BuiltinLibfabric | ||
| | ||
| from spack.package import * | ||
| | ||
| class Libfabric(BuiltinLibfabric): | ||
| # This patches missing synchronization for GPU-GPU transfers in the lnx | ||
| # provider of libfabric. The patch is from a comment on the | ||
| # corresponding issue: | ||
| # https://github.com/ofiwg/libfabric/issues/11231#issue-3252163450. | ||
| # | ||
| # It's unclear if this is a good patch, but it's sufficient for testing of | ||
| # the lnx provider. If and when the correct fix is published the patch can | ||
| # be backported on the upstream libfabric package. | ||
| # | ||
| # The patch may not apply for all versions (tested with 2.3.1), but there | ||
| # is no version constraint as the patch is essential. Builds should fail if | ||
| # the patch doesn't apply. | ||
| # patch("issue-11231-cuda-sync.patch", when="fabrics=lnx") | ||
| pass | ||
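The commented-out `patch(...)` directive is conditioned on `when="fabrics=lnx"`. As a rough illustration of why that condition would match the `fabrics=lnx,shm,cxi,xpmem` spec used in this recipe, here is a simplified sketch that treats a multi-valued variant constraint as a subset check. The parser below is a toy written for this note, not Spack's spec matcher, and it ignores versions, compilers, and dependencies.

```python
def variant_values(spec, name):
    """Return the set of values given for a `name=v1,v2` variant in a spec string."""
    for token in spec.split():
        key, sep, value = token.partition("=")
        if sep and key == name:
            return set(value.split(","))
    return set()


def satisfies(spec, constraint):
    """True if every value requested by `constraint` is present in `spec`."""
    name, _, wanted = constraint.partition("=")
    return set(wanted.split(",")) <= variant_values(spec, name)


# Spec from the recipe, with the version omitted.
recipe_spec = "libfabric +gdrcopy fabrics=lnx,shm,cxi,xpmem"

print(satisfies(recipe_spec, "fabrics=lnx"))
print(satisfies("libfabric fabrics=shm,cxi", "fabrics=lnx"))
```

Under this reading, re-enabling the directive would apply the patch to the recipe's libfabric build but not to a build without the `lnx` provider.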
To do: test an eiger build.