Sync changes from rhds:main to rhds:rhoai2.24
#1254
[RHOAIENG-26264] commit*.env references updates automatically at 11:00 UTC using cron job
When installing Python wheels that contain native code, the Pipfile.lock contains artifact hashes only for the architecture that `pipenv lock` was run on, along with the source archive hash. When installing on a different architecture, pip therefore attempts to compile from the source archive and needs the development files for the native dependencies of the code it is compiling. In this case, the h5py Python package is needed in the tensorflow images, and compiling its native shared object files from source requires the libhdf5.so file from hdf5-devel. The compiled object files are dynamically linked to the .so files from hdf5. Technically, hdf5-devel is needed only at compile time and hdf5 at runtime, but since compilation happens only on _some_ architectures, there isn't a dedicated build stage for these Python packages; to keep the changes minimal, the -devel package is left in place. On the architecture that the Pipfile.lock was generated on (x86_64), the native bits are downloaded precompiled as before. This makes things a little weird: on x86_64 we'll have .so files that are precompiled and link to other .so files downloaded from PyPI, whereas on aarch64 we'll have .so files that were compiled as part of the build and link to other .so files from hdf5 and other RPMs from the system.
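The architecture check described above can be sketched roughly as follows. This is a minimal illustration, not pip's actual resolution logic: the helper names `wheel_platform_tags` and `needs_source_build` and the example file names are hypothetical, and a real installer performs full compatibility-tag matching rather than the simple suffix comparison used here.

```python
def wheel_platform_tags(filename: str) -> list[str]:
    """Return the platform tags encoded in a wheel file name.

    Wheel names follow name-version[-build]-python-abi-platform.whl,
    and the platform field may hold several tags joined by dots.
    Anything that is not a wheel (e.g. an sdist) yields no tags.
    """
    if not filename.endswith(".whl"):
        return []
    stem = filename[: -len(".whl")]
    return stem.split("-")[-1].split(".")


def needs_source_build(locked_files: list[str], arch: str) -> bool:
    """True if none of the locked artifacts is a wheel usable on `arch`.

    Simplification (assumption): a wheel counts as usable when one of
    its platform tags is "any" or ends with "_<arch>".
    """
    for name in locked_files:
        for tag in wheel_platform_tags(name):
            if tag == "any" or tag.endswith("_" + arch):
                return False
    return True


# Illustrative lock-file contents: one x86_64 wheel plus the source archive.
locked = [
    "h5py-0.0.0-cp311-cp311-manylinux_2_17_x86_64.manylinux2014_x86_64.whl",
    "h5py-0.0.0.tar.gz",
]

print(needs_source_build(locked, "x86_64"))
print(needs_source_build(locked, "aarch64"))
```

With a lock file like this, only the aarch64 case falls back to the source archive, which is why hdf5-devel has to be present in the image at install time.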
For the tensorflow CUDA images, when trying to build them on aarch64, the `pip install` stage fails because of the lack of support for the version of nvidia-nccl-cu12 (2.21.5) requested by tensorflow 2.18. Since it's a proprietary package, there also isn't a source distribution, so it can't just be compiled at installation time. Updating to tensorflow 2.19.0 pulls in a newer nvidia-nccl-cu12 (2.23.4), which does have wheels available for both x86_64 and aarch64 on PyPI.
…ahub-io#1414) This commit introduces tests for ROCm-enabled workbench images on OpenShift. These tests verify that the images can be deployed successfully on a cluster with AMD GPUs and that both PyTorch and TensorFlow can correctly detect the available accelerator.
To support the testing of large accelerator images, the following changes were made:
- The pod readiness timeout in the test framework has been increased to 10 minutes to allow sufficient time for image pulling.
- The `ImageDeployment` utility was updated to allow for configurable timeouts.
- Existing CUDA tests were updated to use this new configurable timeout.
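A configurable readiness timeout of this kind might look like the sketch below. The function `wait_until` and its parameters are hypothetical and are not the repository's `ImageDeployment` API; only the 10-minute default comes from the commit message.

```python
import time
from typing import Callable


def wait_until(
    condition: Callable[[], bool],
    timeout: float = 600.0,  # 10 minutes, leaving time for large image pulls
    interval: float = 1.0,
) -> None:
    """Poll `condition` until it returns True or `timeout` seconds elapse.

    Raises TimeoutError if the condition never becomes true in time.
    """
    deadline = time.monotonic() + timeout
    while True:
        if condition():
            return
        if time.monotonic() >= deadline:
            raise TimeoutError(f"condition not met within {timeout} seconds")
        time.sleep(interval)
```

A caller testing a small image could pass a shorter `timeout`, while accelerator image tests keep the longer default.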
…tahub-io#1414) The best fix is to make the SocketProxy more robust so that it doesn't crash when a connection attempt fails. By catching the expected BrokenPipeError, the proxy can simply discard the failed connection and continue listening for the next attempt from the Wait.until loop. This turns your test from a "hope it works" scenario into a reliable polling check.
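The idea can be sketched as an accept loop that drops a failed connection instead of crashing. This is an illustration only: `serve_connections`, `handle`, and `max_connections` are hypothetical names, not the actual SocketProxy implementation.

```python
import socket
from typing import Callable


def serve_connections(
    listener: socket.socket,
    handle: Callable[[socket.socket], None],
    max_connections: int,
) -> int:
    """Accept up to `max_connections` connections and pass each to `handle`.

    A BrokenPipeError from `handle` is treated as an expected failure
    (for example, the peer went away before the backend was ready):
    the connection is discarded and the loop keeps listening.
    Returns the number of connections handled without error.
    """
    handled = 0
    for _ in range(max_connections):
        conn, _addr = listener.accept()
        with conn:  # always close the connection, even on failure
            try:
                handle(conn)
            except BrokenPipeError:
                continue  # discard this attempt and wait for the next one
            handled += 1
    return handled
```

Because a failed attempt no longer terminates the loop, a polling caller can simply retry until one connection succeeds.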
…atahub-io#1412) Previously, the linux/s390x build would fail to install Podman if Podman was not yet in the GitHub Actions cache. Generalize the non-native architecture build process by using `tonistiigi/binfmt` to install QEMU handlers. This enables building container images for `linux/s390x` and `linux/ppc64le` on amd64 runners. The podman installation step is now also performed for these new platforms. This replaces the previous approach that used `docker/setup-qemu-action` and only supported `s390x`.
…mysql-connector-python This is a follow-up to the previous PR, which bumped this version in the Pipfiles only. This change updates the manifests of the relevant images so that the new version is properly picked up by the UI etc.
…to sync-downstream
This is a follow-up to the recent image update that also propagates the upgrade into the image manifest metadata.
Manual PR to fix the sync from upstream
…o b728be3 Image created from 'https://github.com/opendatahub-io/notebooks?rev=93039d467b1015fad749387ec637e2b2a8f81dec' Signed-off-by: red-hat-konflux <126015336+red-hat-konflux[bot]@users.noreply.github.com>
…mponent-updates/component-update-odh-pipeline-runtime-datascience-cpu-py311-ubi9 chore(deps): update odh-pipeline-runtime-datascience-cpu-py311-ubi9 to b728be3
Enable arm64 for Tensorflow CUDA images
…nflux nudging (opendatahub-io#1424) * Update the params-latest.env with Python 3.12 correct images * Update the commit-latest.env with Python 3.12 correct hashes
Use revision tag on output image
…c3935 Image created from 'https://github.com/opendatahub-io/notebooks?rev=3d2444838c5031d9a8b8c5fadcfe6dbfa3815d1e' Signed-off-by: red-hat-konflux <126015336+red-hat-konflux[bot]@users.noreply.github.com>
…mponent-updates/component-update-odh-pipeline-runtime-minimal-cpu-py312-ubi9 chore(deps): update odh-pipeline-runtime-minimal-cpu-py312-ubi9 to efc3935
…o 5d53c5f Image created from 'https://github.com/opendatahub-io/notebooks?rev=3d2444838c5031d9a8b8c5fadcfe6dbfa3815d1e' Signed-off-by: red-hat-konflux <126015336+red-hat-konflux[bot]@users.noreply.github.com>
…208eb2 Image created from 'https://github.com/opendatahub-io/notebooks?rev=3d2444838c5031d9a8b8c5fadcfe6dbfa3815d1e' Signed-off-by: red-hat-konflux <126015336+red-hat-konflux[bot]@users.noreply.github.com>
…130158 Image created from 'https://github.com/opendatahub-io/notebooks?rev=4cdec0a985b3b7c8033561d8e657dd6a25f550e6' Signed-off-by: red-hat-konflux <126015336+red-hat-konflux[bot]@users.noreply.github.com>
…mponent-updates/component-update-odh-workbench-jupyter-minimal-cuda-py311-ubi9 chore(deps): update odh-workbench-jupyter-minimal-cuda-py311-ubi9 to b9a2972
…mponent-updates/component-update-odh-pipeline-runtime-pytorch-cuda-py311-ubi9 chore(deps): update odh-pipeline-runtime-pytorch-cuda-py311-ubi9 to 9130158
…6ca1971 Image created from 'https://github.com/opendatahub-io/notebooks?rev=4cdec0a985b3b7c8033561d8e657dd6a25f550e6' Signed-off-by: red-hat-konflux <126015336+red-hat-konflux[bot]@users.noreply.github.com>
…ebe0f2 Image created from 'https://github.com/opendatahub-io/notebooks?rev=2663f3b4a2784044a5df7eb5794e242be83d4d7a' Signed-off-by: red-hat-konflux <126015336+red-hat-konflux[bot]@users.noreply.github.com>
…485d52c Image created from 'https://github.com/opendatahub-io/notebooks?rev=4cdec0a985b3b7c8033561d8e657dd6a25f550e6' Signed-off-by: red-hat-konflux <126015336+red-hat-konflux[bot]@users.noreply.github.com>
…mponent-updates/component-update-odh-workbench-jupyter-trustyai-cpu-py311-ubi9 chore(deps): update odh-workbench-jupyter-trustyai-cpu-py311-ubi9 to 485d52c
…mponent-updates/component-update-odh-workbench-jupyter-pytorch-cuda-py311-ubi9 chore(deps): update odh-workbench-jupyter-pytorch-cuda-py311-ubi9 to 6ca1971
…mponent-updates/component-update-odh-pipeline-runtime-pytorch-cuda-py312-ubi9 chore(deps): update odh-pipeline-runtime-pytorch-cuda-py312-ubi9 to 2ebe0f2
…to e73f814 Image created from 'https://github.com/opendatahub-io/notebooks?rev=4cdec0a985b3b7c8033561d8e657dd6a25f550e6' Signed-off-by: red-hat-konflux <126015336+red-hat-konflux[bot]@users.noreply.github.com>
…to 7b6f6f3 Image created from 'https://github.com/opendatahub-io/notebooks?rev=2663f3b4a2784044a5df7eb5794e242be83d4d7a' Signed-off-by: red-hat-konflux <126015336+red-hat-konflux[bot]@users.noreply.github.com>
…o 5bfdec2 Image created from 'https://github.com/opendatahub-io/notebooks?rev=4cdec0a985b3b7c8033561d8e657dd6a25f550e6' Signed-off-by: red-hat-konflux <126015336+red-hat-konflux[bot]@users.noreply.github.com>
…mponent-updates/component-update-odh-workbench-jupyter-tensorflow-cuda-py311-ubi9 chore(deps): update odh-workbench-jupyter-tensorflow-cuda-py311-ubi9 to e73f814
…6f43e71 Image created from 'https://github.com/opendatahub-io/notebooks?rev=2663f3b4a2784044a5df7eb5794e242be83d4d7a' Signed-off-by: red-hat-konflux <126015336+red-hat-konflux[bot]@users.noreply.github.com>
…mponent-updates/component-update-odh-workbench-jupyter-tensorflow-cuda-py312-ubi9 chore(deps): update odh-workbench-jupyter-tensorflow-cuda-py312-ubi9 to 7b6f6f3
…mponent-updates/component-update-odh-pipeline-runtime-tensorflow-cuda-py311-ubi9 chore(deps): update odh-pipeline-runtime-tensorflow-cuda-py311-ubi9 to 5bfdec2
…mponent-updates/component-update-odh-workbench-jupyter-pytorch-cuda-py312-ubi9 chore(deps): update odh-workbench-jupyter-pytorch-cuda-py312-ubi9 to 6f43e71
…6d8afa6 Image created from 'https://github.com/opendatahub-io/notebooks?rev=4cdec0a985b3b7c8033561d8e657dd6a25f550e6' Signed-off-by: red-hat-konflux <126015336+red-hat-konflux[bot]@users.noreply.github.com>
…mponent-updates/component-update-odh-workbench-jupyter-minimal-rocm-py311-ubi9 chore(deps): update odh-workbench-jupyter-minimal-rocm-py311-ubi9 to 6d8afa6
…4c99152 Image created from 'https://github.com/opendatahub-io/notebooks?rev=2663f3b4a2784044a5df7eb5794e242be83d4d7a' Signed-off-by: red-hat-konflux <126015336+red-hat-konflux[bot]@users.noreply.github.com>
…6a637f Image created from 'https://github.com/opendatahub-io/notebooks?rev=2663f3b4a2784044a5df7eb5794e242be83d4d7a' Signed-off-by: red-hat-konflux <126015336+red-hat-konflux[bot]@users.noreply.github.com>
…mponent-updates/component-update-odh-pipeline-runtime-pytorch-rocm-py312-ubi9 chore(deps): update odh-pipeline-runtime-pytorch-rocm-py312-ubi9 to 06a637f
…mponent-updates/component-update-odh-workbench-jupyter-minimal-rocm-py312-ubi9 chore(deps): update odh-workbench-jupyter-minimal-rocm-py312-ubi9 to 4c99152
Sync `rhds:main` from `odh:main`
…ooks into sync-release-2.24
[APPROVALNOTIFIER] This PR is NOT APPROVED
This pull-request has been approved by:
The full list of commands accepted by this bot can be found here.
Needs approval from an approver in each of these files:
Approvers can indicate their approval by writing |
4099309
into
red-hat-data-services:rhoai-2.24
Description
Sync changes from `rhds:main` to `rhds:rhoai2.24`
How Has This Been Tested?
Merge criteria: