
CI

CI #5028

Triggered via schedule November 11, 2025 09:34
Status Failure
Total duration 4h 0m 8s
Artifacts 55

ci.yaml

on: schedule
metadata: 3s
bump-manifest: 18s
Matrix: amd64 / test-distribution
Matrix: arm64 / test-distribution
amd64 / build-base / build-base: 3m 1s
arm64 / build-base / build-base: 3m 5s
amd64 / test-nccl / build-mpi-operator-compatible-base: 1m 32s
amd64 / test-nccl / nccl-test-gke / build-nccl-gke: 1m 54s
arm64 / test-nccl / build-mpi-operator-compatible-base
arm64 / test-nccl / nccl-test-gke / build-nccl-gke
Matrix: amd64 / test-jax-cutlass-h100 / jax-cutlass-test-h100
Matrix: amd64 / test-jax / run-unit-test
Matrix: amd64 / test-te-a100 / run-unit-test
Matrix: amd64 / test-te-h100 / te-test-h100
amd64 / build-torchax: 8m 32s
amd64 / test-jax / runner / launch-slurm-runner: 28m 24s
amd64 / test-nsys-jax-eks: 4m 0s
amd64 / test-te-a100 / runner / launch-slurm-runner: 2h 27m
amd64 / build-upstream-t5x: 8m 55s
Matrix: amd64 / test-nsys-jax / run-unit-test
amd64 / test-nsys-jax / runner / launch-slurm-runner: 2h 40m
Matrix: amd64 / test-nccl / nccl-test
Matrix: amd64 / test-nccl / nccl-test-gke / nccl-gke
Matrix: arm64 / test-jax-cutlass-h100 / jax-cutlass-test-h100 (waiting for pending jobs)
Matrix: arm64 / test-jax / run-unit-test (waiting for pending jobs)
Matrix: arm64 / test-te-a100 / run-unit-test (waiting for pending jobs)
Matrix: arm64 / test-te-h100 / te-test-h100 (waiting for pending jobs)
arm64 / build-torchax: 7m 27s
arm64 / test-nsys-jax-eks
arm64 / test-jax / runner / launch-slurm-runner
arm64 / test-te-a100 / runner / launch-slurm-runner
arm64 / build-upstream-t5x: 9m 51s
Matrix: arm64 / test-nsys-jax / run-unit-test (waiting for pending jobs)
arm64 / test-nsys-jax / runner / launch-slurm-runner
Matrix: arm64 / test-nccl / nccl-test (waiting for pending jobs)
Matrix: arm64 / test-nccl / nccl-test-gke / nccl-gke (waiting for pending jobs)
amd64 / test-maxtext-gke / maxtext-gke-xpk: 9m 19s
Matrix: amd64 / test-maxtext / maxtext-multinode
Matrix: amd64 / test-maxtext / single-process-multi-device
amd64 / build-rosetta-t5x / build-rosetta: 15m 32s
amd64 / test-axlearn-eks: 17m 17s
amd64 / test-axlearn-fuji-models-eks: 5m 26s
Matrix: amd64 / test-nsys-jax-archive
arm64 / test-maxtext-gke / maxtext-gke-xpk
Matrix: arm64 / test-maxtext / maxtext-multinode (waiting for pending jobs)
Matrix: arm64 / test-maxtext / single-process-multi-device (waiting for pending jobs)
arm64 / build-rosetta-t5x / build-rosetta: 17m 0s
arm64 / test-axlearn-eks: 0s
arm64 / test-axlearn-fuji-models-eks: 0s
Matrix: arm64 / test-nsys-jax-archive
amd64 / test-maxtext / test-maxtext-metrics: 20s
amd64 / collect-docker-tags: 2s
Matrix: amd64 / test-rosetta-t5x / vit-multi-gpu-multi-node
arm64 / test-maxtext / test-maxtext-metrics
arm64 / collect-docker-tags: 4s
Matrix: arm64 / test-rosetta-t5x / vit-multi-gpu-multi-node (waiting for pending jobs)
amd64 / test-maxtext / test-maxtext-sitrep / sitrep: 11s
amd64 / test-rosetta-t5x / test-t5x-rosetta-summary: 3s
amd64 / test-rosetta-t5x / test-t5x-rosetta-metrics: 33s
arm64 / test-maxtext / test-maxtext-sitrep / sitrep
arm64 / test-rosetta-t5x / test-t5x-rosetta-summary
arm64 / test-rosetta-t5x / test-t5x-rosetta-metrics
amd64 / test-maxtext / test-maxtext-outcome: 2s
amd64 / test-rosetta-t5x / test-t5x-rosetta-sitrep / sitrep: 16s
arm64 / test-maxtext / test-maxtext-outcome
arm64 / test-rosetta-t5x / test-t5x-rosetta-sitrep / sitrep
amd64 / test-rosetta-t5x / test-t5x-rosetta-outcome: 3s
arm64 / test-rosetta-t5x / test-t5x-rosetta-outcome
make-publish-configs: 4s
merge-new-manifest: 9s
Matrix: publish-containers
finalize / workflow-badge: 6s
finalize / report: 22s
finalize / upload-badge: 10s
finalize / publish-badge: 5s

Annotations

4 errors and 2 warnings
amd64 / test-te-a100 / te-A100-unit-test: The self-hosted runner lost communication with the server. Verify the machine is running and has a healthy network connection. Anything in your workflow that terminates the runner process, starves it for CPU/Memory, or blocks its network access can cause this error.
amd64 / test-maxtext / test-maxtext-outcome: Process completed with exit code 1.
amd64 / test-rosetta-t5x / test-t5x-rosetta-metrics: Process completed with exit code 1.
amd64 / test-rosetta-t5x / test-t5x-rosetta-outcome: Process completed with exit code 1.
merge-new-manifest: Unexpected input(s) 'owner_and_repo', valid inputs are ['route', 'mediaType']
merge-new-manifest: Unexpected input(s) 'owner_and_repo', 'head', 'base', 'body', 'title', 'draft', valid inputs are ['route', 'mediaType']

Artifacts

Produced during runtime
Name | Size | Digest
artifact-axlearn-build-amd64 | 566 Bytes | sha256:b46f6a7b07138ce36c68ddbf953ece15590c15a146f07f60bff7a0447e195b2b
artifact-axlearn-build-arm64 | 566 Bytes | sha256:80d0b5bcc9880554bd5b618f09a365c11b80eac796a5e8f8009439faabe50043
artifact-axlearn-test | 179 KB | sha256:87fdc0d115220447e3a30a9ee3fd2c6b45adf65dbe2d19bdc8cb07e04607127d
artifact-base-build-amd64 | 568 Bytes | sha256:b9a166bb1b45d8225ecda5f56debc88ccc28172d36d596fca51a3839d4217a75
artifact-base-build-arm64 | 566 Bytes | sha256:e6fd8c976d019ac8416273ab919b84593dc1a9a764ae19c52d4e9903dae97add
artifact-equinox-build-amd64 | 570 Bytes | sha256:a962508d693c481bc12efa45f42f46e87184e0ca74c76964dcbedb1abf093f69
artifact-equinox-build-arm64 | 569 Bytes | sha256:b761a7321e711bc22867ad7f2807766f322f313b2249dba6de08a83537e13629
artifact-final-report | 4.04 KB | sha256:b7fb10a50a31e8f767fdd2767dc008129b38afd8c804fe4394434b9b68582398
artifact-jax-build-amd64 | 554 Bytes | sha256:a7b24deb890192e1484f83f22aea55c06dea91a703a43be20db77b97008a6218
artifact-jax-build-arm64 | 554 Bytes | sha256:f0d8b39ee8773000af1ee78a2c8547351166a89da0349d7ccebcdf8a1dfa7169
artifact-maxtext-build-amd64 | 568 Bytes | sha256:56a92a8740148afd4bc87bb40b08f6c65748d449e2a17c00064bc4940106d80a
artifact-maxtext-build-arm64 | 568 Bytes | sha256:97685ee8a3d8a7dbe494efdb70a5a3ebd0f8a4f31e5686ed0c0fdf67eff5e18e
artifact-maxtext-test | 1.46 KB | sha256:432c21902ad77c0e41947958e04edb82306a1288365d9f8432d6146cc03f2e74
artifact-mpi-operator-compatible-base-build-amd64 | 638 Bytes | sha256:323cd8edeab4cc1d154c895c40139ec04b1f205bd211feb232c3a6416d07ab42
artifact-nccl-gke-build-amd64 | 571 Bytes | sha256:c3079065d416980b681a600aff2b8b7322b053fb6ceff46293acedabbda88c50
artifact-rosetta-build-t5x-amd64 | 583 Bytes | sha256:e7f67a7809517b299e3058482ec5cc6dc132518542db881d23438ac3491b140e
artifact-rosetta-build-t5x-arm64 | 585 Bytes | sha256:4e55714490c0cc0398770a17abb99722570aff256a3a0c220696d10a3a92b6d1
artifact-rosetta-t5x-mgmn-test | 624 Bytes | sha256:10129c3d222f4f6fb0717f3211907276a6729ed89a3f0e65841f213858c6770b
artifact-t5x-build-amd64 | 567 Bytes | sha256:36a4b919ef8a4728500dcdc015f306923fd14545945c198a44937da8d9b91dca
artifact-t5x-build-arm64 | 568 Bytes | sha256:87f730d4ed9e3120e1107212637f7bf9412a7286caa1fc6a5029c3af99b4dd19
artifact-torchax-build-amd64 | 568 Bytes | sha256:5ee666c9b6751c4cb69f941ad911bf838ee1daa294060be6284ff647b778b99b
artifact-torchax-build-arm64 | 568 Bytes | sha256:4c70c0aafebf4c52fb1d5ecb08d6ac2fe2d16944278e23f65883f4ac2fa43acc
artifact-workflow-metadata | 277 Bytes | sha256:3d42c8e87fe274a6469a9436db1da8d683523031c787776a515dd4cc35d36889
bumped-manifest | 51.6 KB | sha256:0038f6b5e5faf1b4e66120ae7d3d0b4b76ee88698b11a7e6dd96364fe06d2074
final-axlearn | 258 Bytes | sha256:0e9db2b80d7e2408e4aacfde23d1b3dcc323f5c2d0cb3671368a9125eeae98dc
final-base | 249 Bytes | sha256:4315368fb6f1e8f41e0fa7061b6ed98c795c8d3b8b3f0ce695980c222cfca719
final-equinox | 258 Bytes | sha256:3a191cf4cd50397649d10e07df02aff76e68bf12f42b770e71e25e2185d4a988
final-jax | 246 Bytes | sha256:87cdd8ad63a2c3428a31a19b8c7eebd94261afcfa6b0dc92884780833999ff3f
final-maxtext | 258 Bytes | sha256:e13c4e792c9c738ae954652fd61caee7cd75710c989b8792eef259dfea9b7b3d
final-t5x | 246 Bytes | sha256:c31fd8175bd717a1d6938ae3f67919ec0cf50dc28e678afca46925560db355db
final-upstream-t5x | 273 Bytes | sha256:24f24921dba918ae73249e8ec17bf20cb098d6f7bc4df649c426d02165e41731
gke-maxtext-train | 362 MB | sha256:1cc551200629c0d8e92e9ec44c374b079c31ae48ce9df519d1dfbf120385aa1f
gke-maxtext-train-sitrep | 228 Bytes | sha256:25212afffe5f01f6fdf31fc184c39e6abe3d71c05063ad3df4f533994e19bead
jax-cutlass-test-H100 | 1.24 KB | sha256:6243d4cdd2a19338e08552815690a3f3a0c04d553bd2e5468ff14681010df5f0
jax-unit-test-A100 | 22.6 KB | sha256:0bbf578daf2d3a6fdbf72bf69f195bf51ed7efeb0b477a02eb39b90ffd184ca3
mealkit-axlearn | 268 Bytes | sha256:029598650bc1d6894d32bbd14afbb95f7d17f45ddeab6f50e85e2b68c700952b
mealkit-equinox | 269 Bytes | sha256:2134086b600b41f6fcde7d2b6187262063655e23af137c6c30ed88a18e74e319
mealkit-jax | 256 Bytes | sha256:ec7577b59f6235c4b8daa7d76e30fedffcfe20f67ff6b20f15360bfba10a6534
mealkit-maxtext | 269 Bytes | sha256:847de204fe83aae8062c2db526a9ee12e7aff88f183c78446930118c30cfe230
mealkit-t5x | 257 Bytes | sha256:3928f34ada9f29cdec871878e27ab2070071c8ff39d0d486c3cd041e38b19af5
mealkit-upstream-t5x | 283 Bytes | sha256:7299f364f6a9c3a6a31ea6a1f56d500bc6bfd556e8339e628d9aee478f32954c
nccl-gke-all-gather | 15.4 KB | sha256:f725926189d1d403f2eec4cdee1a9bbf30c1dd46a7a0ec54557a473ae03fd4a0
nccl-gke-all-gather-sitrep | 231 Bytes | sha256:3f0747b92382a30f82e81ffdb00befce59569dd9e1cb5a246a51d600f7e6d608
nccl-gke-all-reduce | 15.6 KB | sha256:39adb9d50d47c4f5018327afe8791e203800da7f1aa7a74c1bbf26be986583ea
nccl-gke-all-reduce-sitrep | 231 Bytes | sha256:e39a9476abe588a6dfe879f20ff64fdc377d08807b673ecf72e9a3cb405e5550
nccl-gke-broadcast | 15.3 KB | sha256:15882ab170c2558474bf7cbf9b5c17bef5e637bdae8179703565c9839fcacdda
nccl-gke-broadcast-sitrep | 229 Bytes | sha256:4531d3fdb5b1a2083577155ab1841d202342850f86ebdd894704974e7ea3a614
nccl-gke-reduce-scatter | 15.6 KB | sha256:a640bb13d91bced4407b4a7e9a9df304365ab23dbc9c6b05a4613fa340a21eec
nccl-gke-reduce-scatter-sitrep | 234 Bytes | sha256:ad8633cbd2ec8780a623561c662b7957c65f17c31a3f1c3f74751d0df7fb841f
nsys-jax-unit-test-A100 | 128 MB | sha256:f6d1f17814aef9a9fcbe2f684c36abf802bcc4a1b6336be7a6c58b13e087adc1
rosetta-t5x-vit-19261304344-VIT8G1N | 15.6 KB | sha256:927c5b22d50f32a2dc043bc984529861f68ea40a572e3e6cba9301235abd4cc3
te-unit-test-H100 | 2.08 MB | sha256:171cdfbf93d32b8078cf76ba627ddefbfa0b63e77d02a9bbe76e668bf0f01de4
upstream-maxtext-19261304344-1DP2FSDP4TP1PP_single_process | 23.7 KB | sha256:10f9d2fce7df3b6e9076508d33f9f974be051c03905c0c5e2084dbfa65345856
upstream-maxtext-19261304344-2DP2FSDP2TP1PP | 33.6 KB | sha256:0f8a9742686f0246a41d43c28b2526df16b88216971a02dda42f8b4aeeef00c1
upstream-maxtext-metrics-test-log | 2.52 KB | sha256:4028dabed1eecf696b564d9e4e4b3dbf3b7877a6dcb6857a4481a593db6753b0
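
The digests listed above can be used to check a downloaded artifact. The following is a minimal sketch, assuming each listed digest is the SHA-256 of the artifact archive file as downloaded; the function names and the local file name in the usage note are hypothetical, not part of this workflow.

```python
import hashlib


def sha256_digest(path, chunk_size=1 << 20):
    """Return the file's digest in the 'sha256:<hex>' form used in the list."""
    hasher = hashlib.sha256()
    with open(path, "rb") as f:
        # Read in chunks so large artifacts do not need to fit in memory.
        for chunk in iter(lambda: f.read(chunk_size), b""):
            hasher.update(chunk)
    return "sha256:" + hasher.hexdigest()


def matches_listed_digest(path, listed_digest):
    """Compare a local file against a digest copied from the artifact list."""
    return sha256_digest(path) == listed_digest.strip().lower()
```

For example, `matches_listed_digest("bumped-manifest.zip", "sha256:0038f6b5e5faf1b4e66120ae7d3d0b4b76ee88698b11a7e6dd96364fe06d2074")` would return `True` only if the local file (a hypothetical download path) hashes to the value listed for `bumped-manifest`.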