@achauphan
Contributor

@achauphan achauphan commented Nov 18, 2025

@trilinos/framework

Motivation

We want to upgrade our CUDA jobs to target the new CUDA runners, which use GCC 12 instead of GCC 10. CUDA with GCC 10 was causing issues with its C++20 support. The simple solution is to upgrade the version of GCC in the CUDA containers.

Related Issues

Stakeholder Feedback

Testing

I did a manual build and test of Trilinos with the all-no-epetra packages enabled. This will also be tested through CI in this PR.

@achauphan achauphan requested a review from a team November 18, 2025 20:15
@achauphan achauphan self-assigned this Nov 18, 2025
@achauphan achauphan added the PA: Framework (issues that fall under the Trilinos Framework Product Area) and AT2-SpecialApprove ((Beta) special approval label for AT2) labels Nov 18, 2025
@github-actions

CDash for AT1 results [Only accessible from Sandia networks]
CDash for AT2 results [Currently only accessible from Sandia networks]

@achauphan achauphan added the AT: AUTOMERGE label (causes the PR autotester to automatically merge the PR branch once approvals are completed) Nov 18, 2025
@achauphan
Contributor Author

As of opening this PR, only one cuda-gcc12-openmpi runner is stood up, and two cuda-gcc10-openmpi runners are being left up.

Once this PR is merged, one of the cuda-gcc10-openmpi runners will be replaced with a cuda-gcc12-openmpi runner. Then, after a week or so, we will take down the last cuda-gcc10-openmpi runner and replace it with the new runner as well.

@trilinos-autotester
Contributor

Status Flag 'Pre-Test Inspection' - Auto Inspected - Inspection is Not Necessary for this Pull Request.

@trilinos-autotester
Contributor

Status Flag 'Pull Request AutoTester' - Testing Jenkins Projects:

Pull Request Auto Testing STARTING

Build Information

Test Name: PR_gcc-openmpi-openmp

  • Build Num: 2720
  • Status: STARTED

Jenkins Parameters

  • FORCE_CLEAN: true
  • GENCONFIG_BUILD_NAME: rhel8_sems-gnu-8.5.0-openmpi-4.1.6-openmp_release-debug_static_no-kokkos-arch_no-asan_no-complex_no-fpic_mpi_no-pt_no-rdc_no-uvm_deprecated-on_no-package-enables
  • PR_LABELS: AT: AUTOMERGE;PA: Framework;AT2-SpecialApprove
  • PULLREQUESTNUM: 14710
  • PULLREQUEST_CDASH_TRACK: Pull Request
  • TEST_REPO_ALIAS: TRILINOS
  • TRILINOS_NODE_LABEL: rhel8
  • TRILINOS_SOURCE_REPO: https://github.com/achauphan/Trilinos
  • TRILINOS_SOURCE_SHA: 1348ce8
  • TRILINOS_SRN_CONFIG: true
  • TRILINOS_TARGET_BRANCH: develop
  • TRILINOS_TARGET_REPO: https://github.com/trilinos/Trilinos
  • TRILINOS_TARGET_SHA: 81f1de3

Using Repos:

Repo: TRILINOS (achauphan/Trilinos)
  • Branch: upgrade-cuda-gcc12
  • SHA: 1348ce8
  • Mode: TEST_REPO

Pull Request Author: achauphan

@trilinos-autotester
Contributor

Status Flag 'Pull Request AutoTester' - Jenkins Testing: 1 or more Jobs FAILED

Note: Testing will normally be attempted again in approx. 2 Hrs 30 Mins. If a change to the PR source branch occurs, the testing will be attempted again on next available autotester run.

Pull Request Auto Testing has FAILED

Build Information

Test Name: PR_gcc-openmpi-openmp

  • Build Num: 2720
  • Status: FAILED

Jenkins Parameters

  • FORCE_CLEAN: true
  • GENCONFIG_BUILD_NAME: rhel8_sems-gnu-8.5.0-openmpi-4.1.6-openmp_release-debug_static_no-kokkos-arch_no-asan_no-complex_no-fpic_mpi_no-pt_no-rdc_no-uvm_deprecated-on_no-package-enables
  • PR_LABELS: AT: AUTOMERGE;PA: Framework;AT2-SpecialApprove
  • PULLREQUESTNUM: 14710
  • PULLREQUEST_CDASH_TRACK: Pull Request
  • TEST_REPO_ALIAS: TRILINOS
  • TRILINOS_NODE_LABEL: rhel8
  • TRILINOS_SOURCE_REPO: https://github.com/achauphan/Trilinos
  • TRILINOS_SOURCE_SHA: 1348ce8
  • TRILINOS_SRN_CONFIG: true
  • TRILINOS_TARGET_BRANCH: develop
  • TRILINOS_TARGET_REPO: https://github.com/trilinos/Trilinos
  • TRILINOS_TARGET_SHA: 81f1de3


CDash Test Results for PR# 14710.


Wiki: How to Reproduce PR Testing Builds and Errors.

@achauphan
Contributor Author

It seems there are issues with git operations over HTTP and SSH, which is why git clone is returning a 500 error.

@sebrowne sebrowne added the AT: RETEST label (causes the PR autotester to run a new round of PR tests on the next iteration) Nov 19, 2025
@trilinos-autotester
Contributor

Status Flag 'Pull Request AutoTester' - User Requested Retest - Label AT: RETEST will be reset after testing.

@trilinos-autotester
Contributor

Status Flag 'Pull Request AutoTester' - Testing Jenkins Projects:

Pull Request Auto Testing STARTING

Using Repos:

Repo: TRILINOS (achauphan/Trilinos)
  • Branch: upgrade-cuda-gcc12
  • SHA: 1348ce8
  • Mode: TEST_REPO

Pull Request Author: achauphan

@trilinos-autotester
Contributor

Status Flag 'Pull Request AutoTester' - Jenkins Testing: all Jobs PASSED

Pull Request Auto Testing has PASSED


CDash Test Results for PR# 14710.

@trilinos-autotester trilinos-autotester removed the AT: RETEST label (causes the PR autotester to run a new round of PR tests on the next iteration) Nov 19, 2025
@trilinos-autotester
Contributor

Status Flag 'Pre-Merge Inspection' - SUCCESS: The last commit to this Pull Request has been INSPECTED AND APPROVED by [ sebrowne ]!

@trilinos-autotester
Contributor

Status Flag 'Pull Request AutoTester' - Pull Request will be Automerged

@trilinos-autotester
Contributor

Cannot Merge Pull Request# 14710: Github Mergeability = blocked; Git Mergeable = True

@achauphan achauphan removed the AT: AUTOMERGE label (causes the PR autotester to automatically merge the PR branch once approvals are completed) Nov 19, 2025
@achauphan
Contributor Author

@trilinos/zoltan2 @cgcgcg could someone please take a look at the failing Zoltan2 test, Zoltan2_TPLTraits_MPI_4, for this upgraded cuda-gcc-12 AT2 job? The output is:

GOOD: big long long to int throws exception ../../runner/_work/Trilinos/Trilinos/packages/zoltan2/test/core/unit/util/TPLTraits.cpp:103
GOOD: big size_t to int throws exception ../../runner/_work/Trilinos/Trilinos/packages/zoltan2/test/core/unit/util/TPLTraits.cpp:111
GOOD: huge unsigned to int throws exception ../../runner/_work/Trilinos/Trilinos/packages/zoltan2/test/core/unit/util/TPLTraits.cpp:119
GOOD: huge size_t to int64_t threw exception ../../runner/_work/Trilinos/Trilinos/packages/zoltan2/test/core/unit/util/TPLTraits.cpp:176
back 17179869201 != zgno 68719476761
FAIL: ZOLTAN_ID_PTR to unsigned long long ../../runner/_work/Trilinos/Trilinos/packages/zoltan2/test/core/unit/util/TPLTraits.cpp:539
back 17179869201 != zgno 18446744073709551615
FAIL: ZOLTAN_ID_PTR to size_t ../../runner/_work/Trilinos/Trilinos/packages/zoltan2/test/core/unit/util/TPLTraits.cpp:572
FAIL
GOOD: big long long to int throws exception ../../runner/_work/Trilinos/Trilinos/packages/zoltan2/test/core/unit/util/TPLTraits.cpp:103
GOOD: big size_t to int throws exception ../../runner/_work/Trilinos/Trilinos/packages/zoltan2/test/core/unit/util/TPLTraits.cpp:111
GOOD: huge unsigned to int throws exception ../../runner/_work/Trilinos/Trilinos/packages/zoltan2/test/core/unit/util/TPLTraits.cpp:119
GOOD: huge size_t to int64_t threw exception ../../runner/_work/Trilinos/Trilinos/packages/zoltan2/test/core/unit/util/TPLTraits.cpp:176
back 17179869201 != zgno 68719476761
FAIL: ZOLTAN_ID_PTR to unsigned long long ../../runner/_work/Trilinos/Trilinos/packages/zoltan2/test/core/unit/util/TPLTraits.cpp:539
back 17179869201 != zgno 18446744073709551615
FAIL: ZOLTAN_ID_PTR to size_t ../../runner/_work/Trilinos/Trilinos/packages/zoltan2/test/core/unit/util/TPLTraits.cpp:572
FAIL
GOOD: big long long to int throws exception ../../runner/_work/Trilinos/Trilinos/packages/zoltan2/test/core/unit/util/TPLTraits.cpp:103
GOOD: big size_t to int throws exception ../../runner/_work/Trilinos/Trilinos/packages/zoltan2/test/core/unit/util/TPLTraits.cpp:111
GOOD: huge unsigned to int throws exception ../../runner/_work/Trilinos/Trilinos/packages/zoltan2/test/core/unit/util/TPLTraits.cpp:119
GOOD: huge size_t to int64_t threw exception ../../runner/_work/Trilinos/Trilinos/packages/zoltan2/test/core/unit/util/TPLTraits.cpp:176
back 17179869201 != zgno 68719476761
FAIL: ZOLTAN_ID_PTR to unsigned long long ../../runner/_work/Trilinos/Trilinos/packages/zoltan2/test/core/unit/util/TPLTraits.cpp:539
back 17179869201 != zgno 18446744073709551615
FAIL: ZOLTAN_ID_PTR to size_t ../../runner/_work/Trilinos/Trilinos/packages/zoltan2/test/core/unit/util/TPLTraits.cpp:572
FAIL
GOOD: big long long to int throws exception ../../runner/_work/Trilinos/Trilinos/packages/zoltan2/test/core/unit/util/TPLTraits.cpp:103
GOOD: big size_t to int throws exception ../../runner/_work/Trilinos/Trilinos/packages/zoltan2/test/core/unit/util/TPLTraits.cpp:111
GOOD: huge unsigned to int throws exception ../../runner/_work/Trilinos/Trilinos/packages/zoltan2/test/core/unit/util/TPLTraits.cpp:119
GOOD: huge size_t to int64_t threw exception ../../runner/_work/Trilinos/Trilinos/packages/zoltan2/test/core/unit/util/TPLTraits.cpp:176
back 17179869201 != zgno 68719476761
FAIL: ZOLTAN_ID_PTR to unsigned long long ../../runner/_work/Trilinos/Trilinos/packages/zoltan2/test/core/unit/util/TPLTraits.cpp:539
back 17179869201 != zgno 18446744073709551615
FAIL: ZOLTAN_ID_PTR to size_t ../../runner/_work/Trilinos/Trilinos/packages/zoltan2/test/core/unit/util/TPLTraits.cpp:572
FAIL

I am not sure about the failing Tempus tests, as those failures were not present in a previous run.
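
For readers unfamiliar with the "GOOD: ... throws exception" lines in the log, they describe range-checked integer conversions. The following is a minimal sketch of that idea only, not the Zoltan2 implementation; `checked_int_cast`, `conversion_throws`, and `self_check` are hypothetical names introduced here for illustration.

```cpp
#include <cassert>
#include <cstdint>
#include <limits>
#include <stdexcept>
#include <type_traits>

// Hypothetical helper (not the Zoltan2 API): convert between integer types
// and throw if the value is not representable in the destination type.
template <typename To, typename From>
To checked_int_cast(From value) {
  static_assert(std::is_integral<To>::value && std::is_integral<From>::value,
                "integer types only");
  const bool negative = std::is_signed<From>::value && value < From(0);
  if (negative) {
    // A negative value never fits an unsigned type; otherwise compare with min().
    if (!std::is_signed<To>::value ||
        static_cast<std::intmax_t>(value) <
            static_cast<std::intmax_t>(std::numeric_limits<To>::min())) {
      throw std::range_error("value too small for destination type");
    }
  } else if (static_cast<std::uintmax_t>(value) >
             static_cast<std::uintmax_t>(std::numeric_limits<To>::max())) {
    throw std::range_error("value too large for destination type");
  }
  return static_cast<To>(value);
}

// Returns true if converting `value` to To throws std::range_error.
template <typename To, typename From>
bool conversion_throws(From value) {
  try {
    (void)checked_int_cast<To>(value);
  } catch (const std::range_error&) {
    return true;
  }
  return false;
}

// Illustrative cases in the spirit of the log lines above.
inline void self_check() {
  assert(conversion_throws<int>(1LL << 40));  // big long long to int
  assert(conversion_throws<std::int64_t>(
      std::numeric_limits<std::uint64_t>::max()));  // huge unsigned to int64_t
  assert(!conversion_throws<int>(42LL));            // small values pass through
}
```

The sketch compares through `std::intmax_t` and `std::uintmax_t` so that signed and unsigned operands are never mixed in one comparison.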

@trilinos-autotester
Contributor

Status Flag 'Pull Request AutoTester' - AutoMerge IS ENABLED, but the Label AT: AUTOMERGE is not set. Either set Label AT: AUTOMERGE or manually merge the PR...

@cgcgcg
Contributor

cgcgcg commented Nov 19, 2025

@achauphan @egboman I'm not quite sure what's going on. It looks like the failing test checks that different integer types can be converted back and forth. I don't think this is caused by the compiler change: I am using GCC 12.3.0 locally and the test passes. It clearly has nothing to do with CUDA. In what other way is this image different from the previous one?
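
As a rough illustration of the kind of back-and-forth conversion described above, here is a minimal sketch that stores a 64-bit value in an array of 32-bit words and reads it back. It is not the code in TPLTraits.cpp. `id_word`, `to_words`, `from_words`, and `round_trips` are hypothetical names, and the 32-bit word width is an assumption, since the ZOLTAN_ID_TYPE used by this build is not stated in the thread.

```cpp
#include <array>
#include <cassert>
#include <cstdint>

// Assumed 32-bit id word; a stand-in for illustration only.
using id_word = std::uint32_t;

// Split a 64-bit value into two 32-bit words, low word first.
inline std::array<id_word, 2> to_words(std::uint64_t value) {
  return {static_cast<id_word>(value & 0xFFFFFFFFu),
          static_cast<id_word>(value >> 32)};
}

// Reassemble the 64-bit value from its two words.
inline std::uint64_t from_words(const std::array<id_word, 2>& words) {
  return (static_cast<std::uint64_t>(words[1]) << 32) | words[0];
}

// A correct round trip returns exactly the value that went in.
inline bool round_trips(std::uint64_t value) {
  return from_words(to_words(value)) == value;
}
```

With this sketch, `round_trips(68719476761ULL)` and `round_trips(18446744073709551615ULL)` both hold, which is the property the failing checks report as violated. For reference, the expected value 68719476761 is 2^36 + 25, while the value read back in the log, 17179869201, is 2^34 + 17.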

@achauphan
Contributor Author

@cgcgcg from looking at our container history, the only change from the previous container is the GCC version, which we pass in as a build-arg. Both containers were built from the same dated version of the Dockerfile :/

I will try to manually reproduce this again.

@sebrowne
Contributor

There MAY be other TPL changes depending on how the concretization worked with Spack (e.g., GCC 10 was incompatible with a newer ParMETIS). You can check the Configure output on CDash to see the differences, if any, in the TPLs that were found between the new build and an old one.
