Add nvbandwidth sample #20
| build-%: DOCKERFILE = $(CURDIR)/deployments/container/Dockerfile.$(DOCKERFILE_SUFFIX) | ||
| else | ||
| build-%: DOCKERFILE = $(CURDIR)/deployments/container/$(SAMPLE)/Dockerfile.$(DOCKERFILE_SUFFIX) | ||
| endif |
Could we modify the IMAGE_TAG here? I don't think `nvbandwidth-8169f9fa-ubuntu22.04` is a good tag for the nvbandwidth image; maybe we want `nvbandwidth-8169f9fa`.
Actually, we need both the nvbandwidth and cuda_version in the tag. These images are version sensitive.
Yeah I discussed this with @klueska
We can update the tags to be whatever we want them to be. Please remember that:
- The `VERSION` for the images released from this repo is `cuda12.6.2`, for example.
- The tag should be different for each build (e.g. SHA) so that we can test early access bits.
- The image will be released when tagging the (internal) repo. Currently we tag with `cuda<VERSION>` since the base images are the main driver for updates.
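The tag shapes discussed in this thread can be sketched as follows. This is only an illustration under stated assumptions: the `image_tag` helper and its argument order are hypothetical and are not the repo's Makefile logic. It simply joins a sample name, a version (a `cuda<VERSION>` string for releases or a commit SHA for per-build images), and an optional distribution suffix.

```shell
#!/bin/sh
# Hypothetical helper (not from the repo): compose an image tag as
# <sample>-<version>[-<dist>], matching the tag shapes in this thread.
image_tag() {
    sample="$1"   # e.g. nvbandwidth
    version="$2"  # e.g. cuda12.6.2 for a release, or a commit SHA per build
    dist="$3"     # optional, e.g. ubuntu22.04
    if [ -n "$dist" ]; then
        printf '%s-%s-%s\n' "$sample" "$version" "$dist"
    else
        printf '%s-%s\n' "$sample" "$version"
    fi
}

# Release-style tags keyed on the CUDA base image version.
image_tag nvbandwidth cuda12.6.2 ubuntu22.04
image_tag nvbandwidth cuda12.6.2

# Per-build tag keyed on a commit SHA, so early access bits can be tested.
image_tag nvbandwidth 8169f9fa ubuntu22.04
```

Keeping the version component as a separate argument makes it easy to switch between a release tag and a unique per-build tag without touching the rest of the naming scheme.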
This is not a CUDA sample. These are standalone memory benchmarking tests. https://github.com/NVIDIA/nvbandwidth
| - vectorAdd | ||
| - nbody | ||
| - deviceQuery | ||
| - nvbandwidth |
I don't think it belongs here. It is a separate build/Dockerfile. I added another cuda-sample that should go here. See #18
it does, see the structure of the GitHub action and how Evan is creating a new Make target
It was a typo from Evan's point of view, the GH action will produce |
4db2e70 to 86cce93
/ok-to-test
/ok to test
/ok to test
b6b0c77 to 8103ba9
/ok to test
This change adds an nvbandwidth sample that can be used to test both single and multi-node GPU interconnectivity.

The multi-arch images are generated with the following image root: nvcr.io/ghcr.io/nvidia/k8s-samples:nvbandwidth

Signed-off-by: Swati Gupta <[email protected]>
Signed-off-by: Evan Lezar <[email protected]>
8103ba9 to 6c3323c
Closing in favour of #19
These changes add an `nvbandwidth` CUDA sample to allow for testing GPU bandwidth between multiple GPUs.

These changes would produce the following images:
- `docker.io/nvidia/cuda-sample:nvbandwidth-cuda12.6.2-ubuntu22.04`
- `docker.io/nvidia/cuda-sample:nvbandwidth-cuda12.6.2`