@brian-kelley
Contributor

@brian-kelley brian-kelley commented Oct 26, 2025

@trilinos/ifpack2
@trilinos/tpetra

Motivation

Running the BTD symbolic phase on device improves performance. This also resolves the failures SPARC saw from #14337.

Related Issues

@brian-kelley brian-kelley self-assigned this Oct 26, 2025
@brian-kelley brian-kelley requested review from a team as code owners October 26, 2025 21:32
@brian-kelley brian-kelley added the pkg: Ifpack2, client: SPARC, and impacting: performance labels Oct 26, 2025
@github-actions

CDash for AT1 results [Only accessible from Sandia networks]
CDash for AT2 results [Currently only accessible from Sandia networks]

@github-actions

Your PR updated files that did not respect package formatting settings.
Please apply the patch given below. Alternatively you can download a patch file here.

Patch
diff --git a/packages/tpetra/core/src/Tpetra_Map_decl.hpp b/packages/tpetra/core/src/Tpetra_Map_decl.hpp
index 2658c20c..ce9dc24e 100644
--- a/packages/tpetra/core/src/Tpetra_Map_decl.hpp
+++ b/packages/tpetra/core/src/Tpetra_Map_decl.hpp
@@ -1092,123 +1092,125 @@ class Map : public Teuchos::Describable {
       const global_ordinal_type indexBase,
       const Teuchos::RCP<const Teuchos::Comm<int>>& comm);
 
-    //! Copy the local map from device to host, if it's not on host already
-    void lazyPushToHost() const;
-
-    //! The communicator over which this Map is distributed.
-    Teuchos::RCP<const Teuchos::Comm<int> > comm_;
-
-    //! The index base for global indices in this Map.
-    global_ordinal_type indexBase_;
-
-    /// \brief The total number of global indices in this Map over all
-    ///   processes in its communicator \c comm (see above).
-    global_size_t numGlobalElements_;
-
-    //! The number of global indices owned by this process.
-    size_t numLocalElements_;
-
-    //! The min global index owned by this process.
-    global_ordinal_type minMyGID_;
-
-    //! The max global index owned by this process.
-    global_ordinal_type maxMyGID_;
-
-    /// \brief The min global index in this Map over all processes in
-    ///   its communicator \c comm (see above).
-    global_ordinal_type minAllGID_;
-
-    /// \brief The max global index in this Map over all processes in
-    ///   its communicator \c comm (see above).
-    global_ordinal_type maxAllGID_;
-
-    /// \brief First contiguous GID.
-    ///
-    /// This is only set if the Map was created using the
-    /// noncontiguous constructor.  In that case, if the calling
-    /// process owns at least one GID, this will always equal that
-    /// first GID in the list of GIDs given to the constructor.
-    global_ordinal_type firstContiguousGID_;
-
-    /// \brief Last contiguous GID.
-    ///
-    /// This is only set if the Map was created using the
-    /// noncontiguous constructor.  In that case, if the calling
-    /// process owns at least one GID, this will always equal the last
-    /// GID (inclusive) that forms an initial sequence of contiguous
-    /// GIDs, in the list of GIDs given to the constructor.
-    ///
-    /// For example, if the list is [42, 43, 44, 45, 100, 1001],
-    /// firstContiguousGID_ will be 42 and lastContiguousGID_ will be
-    /// 45.  If the list is [42, 100, 1001, 1002, 1003],
-    /// firstContiguousGID_ will be 42 and lastContiguousGID_ will
-    /// also be 42.
-    global_ordinal_type lastContiguousGID_;
-
-    /// \brief Whether the range of global indices is uniform.
-    ///
-    /// This is only true if the Map was constructed using the first
-    /// (uniform contiguous) constructor or a nonmember constructor
-    /// that calls it.
-    bool uniform_;
-
-    //! Whether the range of global indices are contiguous and ordered.
-    bool contiguous_;
-
-    /// \brief Whether this map's global indices are distributed
-    ///   (true), or locally replicated (false), over its communicator
-    ///   \c comm (see above).
-    ///
-    /// This is true if the Map is globally distributed, and false
-    /// otherwise (if the Map is locally replicated).  See the
-    /// documentation of isDistributed() for a definition of these two
-    /// mutually exclusive terms.
-    bool distributed_;
-
-    /// \brief A mapping from local IDs to global IDs.
-    ///
-    /// By definition, this mapping is local; it only contains global
-    /// IDs owned by this process.  This mapping is created in two
-    /// cases:
-    ///
-    /// <ol>
-    /// <li> It is always created for a noncontiguous Map, in the
-    ///    noncontiguous version of the Map constructor.</li>
-    /// <li> In getLocalElementList(), on demand (if it wasn't created
-    ///    before).</li>
-    /// </ol>
-    ///
-    /// The potential for on-demand creation is why this member datum
-    /// is declared "mutable".  Note that other methods, such as
-    /// describe(), may invoke getLocalElementList().
-    ///
-    /// To clarify: If this is empty, then it could be either that the
-    /// Map is contiguous (meaning that we don't need to store all the
-    /// global indices explicitly), or that the Map really does
-    /// contain zero indices on the calling process.
-    ///
-    /// This has LayoutLeft so that we can call Kokkos::deep_copy to
-    /// copy this between any two Kokkos Devices.  Otherwise, the
-    /// Devices might have different default layouts, thus forbidding
-    /// a deep_copy.  We use LayoutLeft instead of LayoutRight because
-    /// LayoutRight is the default on non-CUDA Devices, and we want to
-    /// make sure we catch assignment or copying from the default to
-    /// the nondefault layout.
-    mutable Kokkos::View<const global_ordinal_type*,
-                         Kokkos::LayoutLeft,
-                         device_type> lgMap_;
-
-    /// \brief Host View of lgMap_.
-    ///
-    /// This is allocated along with lgMap_, on demand (lazily), by
-    /// getLocalElementList() (which see).  It is also used by
-    /// getGlobalElement() (which is a host method, and therefore
-    /// requires a host View) if necessary (only noncontiguous Maps
-    /// need this).
+  //! Copy the local map from device to host, if it's not on host already
+  void lazyPushToHost() const;
+
+  //! The communicator over which this Map is distributed.
+  Teuchos::RCP<const Teuchos::Comm<int>> comm_;
+
+  //! The index base for global indices in this Map.
+  global_ordinal_type indexBase_;
+
+  /// \brief The total number of global indices in this Map over all
+  ///   processes in its communicator \c comm (see above).
+  global_size_t numGlobalElements_;
+
+  //! The number of global indices owned by this process.
+  size_t numLocalElements_;
+
+  //! The min global index owned by this process.
+  global_ordinal_type minMyGID_;
+
+  //! The max global index owned by this process.
+  global_ordinal_type maxMyGID_;
+
+  /// \brief The min global index in this Map over all processes in
+  ///   its communicator \c comm (see above).
+  global_ordinal_type minAllGID_;
+
+  /// \brief The max global index in this Map over all processes in
+  ///   its communicator \c comm (see above).
+  global_ordinal_type maxAllGID_;
+
+  /// \brief First contiguous GID.
+  ///
+  /// This is only set if the Map was created using the
+  /// noncontiguous constructor.  In that case, if the calling
+  /// process owns at least one GID, this will always equal that
+  /// first GID in the list of GIDs given to the constructor.
+  global_ordinal_type firstContiguousGID_;
+
+  /// \brief Last contiguous GID.
+  ///
+  /// This is only set if the Map was created using the
+  /// noncontiguous constructor.  In that case, if the calling
+  /// process owns at least one GID, this will always equal the last
+  /// GID (inclusive) that forms an initial sequence of contiguous
+  /// GIDs, in the list of GIDs given to the constructor.
+  ///
+  /// For example, if the list is [42, 43, 44, 45, 100, 1001],
+  /// firstContiguousGID_ will be 42 and lastContiguousGID_ will be
+  /// 45.  If the list is [42, 100, 1001, 1002, 1003],
+  /// firstContiguousGID_ will be 42 and lastContiguousGID_ will
+  /// also be 42.
+  global_ordinal_type lastContiguousGID_;
+
+  /// \brief Whether the range of global indices is uniform.
+  ///
+  /// This is only true if the Map was constructed using the first
+  /// (uniform contiguous) constructor or a nonmember constructor
+  /// that calls it.
+  bool uniform_;
+
+  //! Whether the range of global indices are contiguous and ordered.
+  bool contiguous_;
+
+  /// \brief Whether this map's global indices are distributed
+  ///   (true), or locally replicated (false), over its communicator
+  ///   \c comm (see above).
+  ///
+  /// This is true if the Map is globally distributed, and false
+  /// otherwise (if the Map is locally replicated).  See the
+  /// documentation of isDistributed() for a definition of these two
+  /// mutually exclusive terms.
+  bool distributed_;
+
+  /// \brief A mapping from local IDs to global IDs.
+  ///
+  /// By definition, this mapping is local; it only contains global
+  /// IDs owned by this process.  This mapping is created in two
+  /// cases:
+  ///
+  /// <ol>
+  /// <li> It is always created for a noncontiguous Map, in the
+  ///    noncontiguous version of the Map constructor.</li>
+  /// <li> In getLocalElementList(), on demand (if it wasn't created
+  ///    before).</li>
+  /// </ol>
+  ///
+  /// The potential for on-demand creation is why this member datum
+  /// is declared "mutable".  Note that other methods, such as
+  /// describe(), may invoke getLocalElementList().
+  ///
+  /// To clarify: If this is empty, then it could be either that the
+  /// Map is contiguous (meaning that we don't need to store all the
+  /// global indices explicitly), or that the Map really does
+  /// contain zero indices on the calling process.
+  ///
+  /// This has LayoutLeft so that we can call Kokkos::deep_copy to
+  /// copy this between any two Kokkos Devices.  Otherwise, the
+  /// Devices might have different default layouts, thus forbidding
+  /// a deep_copy.  We use LayoutLeft instead of LayoutRight because
+  /// LayoutRight is the default on non-CUDA Devices, and we want to
+  /// make sure we catch assignment or copying from the default to
+  /// the nondefault layout.
+  mutable Kokkos::View<const global_ordinal_type*,
+                       Kokkos::LayoutLeft,
+                       device_type>
+      lgMap_;
+
+  /// \brief Host View of lgMap_.
+  ///
+  /// This is allocated along with lgMap_, on demand (lazily), by
+  /// getLocalElementList() (which see).  It is also used by
+  /// getGlobalElement() (which is a host method, and therefore
+  /// requires a host View) if necessary (only noncontiguous Maps
+  /// need this).
 #ifndef SWIG
-    mutable Kokkos::View<const global_ordinal_type*,
-                         Kokkos::LayoutLeft,
-                         Kokkos::HostSpace> lgMapHost_;
+  mutable Kokkos::View<const global_ordinal_type*,
+                       Kokkos::LayoutLeft,
+                       Kokkos::HostSpace>
+      lgMapHost_;
 #endif
 
   //! Type of a mapping from global IDs to local IDs.

More details about our use of clang-format and other tools can be found in the wiki.
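
The doc comments for firstContiguousGID_ and lastContiguousGID_ in the patch above define the "initial sequence of contiguous GIDs" by example. As a minimal sketch of that definition (not Tpetra's actual implementation), the helper below computes the pair from a GID list. The function name initialContiguousRange and the use of long long for global ordinals are assumptions made for illustration only.

```cpp
#include <cassert>
#include <cstddef>
#include <utility>
#include <vector>

// Hypothetical helper, not part of Tpetra: returns {first, last} for the
// initial run of consecutive GIDs at the front of `gids`.
// Precondition: the calling process owns at least one GID.
std::pair<long long, long long>
initialContiguousRange(const std::vector<long long>& gids) {
  assert(!gids.empty());
  const long long first = gids.front();
  long long last = first;
  // Extend the run while each next GID is exactly one more than the last.
  for (std::size_t i = 1; i < gids.size() && gids[i] == last + 1; ++i) {
    last = gids[i];
  }
  return {first, last};
}
```

For the two lists quoted in the doc comment, [42, 43, 44, 45, 100, 1001] gives (42, 45) and [42, 100, 1001, 1002, 1003] gives (42, 42).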

@trilinos-autotester
Contributor

Status Flag 'Pre-Test Inspection' - Auto Inspected - Inspection is Not Necessary for this Pull Request.

@trilinos-autotester
Contributor

Status Flag 'Pull Request AutoTester' - Testing Jenkins Projects:

Pull Request Auto Testing STARTING (click to expand)

Build Information

Test Name: PR_gcc-openmpi-openmp

  • Build Num: 2544
  • Status: STARTED

Jenkins Parameters

Parameter Name Value
FORCE_CLEAN true
GENCONFIG_BUILD_NAME rhel8_sems-gnu-8.5.0-openmpi-4.1.6-openmp_release-debug_static_no-kokkos-arch_no-asan_no-complex_no-fpic_mpi_no-pt_no-rdc_no-uvm_deprecated-on_no-package-enables
PR_LABELS pkg: Tpetra;pkg: Ifpack2;impacting: performance;client: SPARC
PULLREQUESTNUM 14616
PULLREQUEST_CDASH_TRACK Pull Request
TEST_REPO_ALIAS TRILINOS
TRILINOS_NODE_LABEL rhel8
TRILINOS_SOURCE_REPO https://github.com/brian-kelley/Trilinos
TRILINOS_SOURCE_SHA bc656bf
TRILINOS_SRN_CONFIG true
TRILINOS_TARGET_BRANCH develop
TRILINOS_TARGET_REPO https://github.com/trilinos/Trilinos
TRILINOS_TARGET_SHA d6e582d

Build Information

Test Name: PR_CompSim

  • Build Num: 860
  • Status: STARTED

Jenkins Parameters

Parameter Name Value
BLOCKING_BUILD false
FORCE_CLEAN true
GENCONFIG_BUILD_NAME rhel8_aue-gcc-openmpi_release_static_no-kokkos-arch_no-asan_complex_no-fpic_mpi_no-pt_no-rdc_no-uvm_deprecated-on_compsim
PR_LABELS pkg: Tpetra;pkg: Ifpack2;impacting: performance;client: SPARC
PULLREQUESTNUM 14616
PULLREQUEST_CDASH_TRACK Pull Request (Non-blocking)
TEST_REPO_ALIAS TRILINOS
TRILINOS_NODE_LABEL rhel8
TRILINOS_SOURCE_REPO https://github.com/brian-kelley/Trilinos
TRILINOS_SOURCE_SHA bc656bf
TRILINOS_SRN_CONFIG true
TRILINOS_TARGET_BRANCH develop
TRILINOS_TARGET_REPO https://github.com/trilinos/Trilinos
TRILINOS_TARGET_SHA d6e582d

Using Repos:

Repo: TRILINOS (brian-kelley/Trilinos)
  • Branch: SymbolicRedux
  • SHA: bc656bf
  • Mode: TEST_REPO

Pull Request Author: brian-kelley

@trilinos-autotester
Contributor

Status Flag 'Pull Request AutoTester' - Jenkins Testing: all Jobs PASSED

Pull Request Auto Testing has PASSED (click to expand)

Build Information

Test Name: PR_gcc-openmpi-openmp

  • Build Num: 2544
  • Status: PASSED

Jenkins Parameters

Parameter Name Value
FORCE_CLEAN true
GENCONFIG_BUILD_NAME rhel8_sems-gnu-8.5.0-openmpi-4.1.6-openmp_release-debug_static_no-kokkos-arch_no-asan_no-complex_no-fpic_mpi_no-pt_no-rdc_no-uvm_deprecated-on_no-package-enables
PR_LABELS pkg: Tpetra;pkg: Ifpack2;impacting: performance;client: SPARC
PULLREQUESTNUM 14616
PULLREQUEST_CDASH_TRACK Pull Request
TEST_REPO_ALIAS TRILINOS
TRILINOS_NODE_LABEL rhel8
TRILINOS_SOURCE_REPO https://github.com/brian-kelley/Trilinos
TRILINOS_SOURCE_SHA bc656bf
TRILINOS_SRN_CONFIG true
TRILINOS_TARGET_BRANCH develop
TRILINOS_TARGET_REPO https://github.com/trilinos/Trilinos
TRILINOS_TARGET_SHA d6e582d

Build Information

Test Name: PR_CompSim

  • Build Num: 860
  • Status: PASSED

Jenkins Parameters

Parameter Name Value
BLOCKING_BUILD false
FORCE_CLEAN true
GENCONFIG_BUILD_NAME rhel8_aue-gcc-openmpi_release_static_no-kokkos-arch_no-asan_complex_no-fpic_mpi_no-pt_no-rdc_no-uvm_deprecated-on_compsim
PR_LABELS pkg: Tpetra;pkg: Ifpack2;impacting: performance;client: SPARC
PULLREQUESTNUM 14616
PULLREQUEST_CDASH_TRACK Pull Request (Non-blocking)
TEST_REPO_ALIAS TRILINOS
TRILINOS_NODE_LABEL rhel8
TRILINOS_SOURCE_REPO https://github.com/brian-kelley/Trilinos
TRILINOS_SOURCE_SHA bc656bf
TRILINOS_SRN_CONFIG true
TRILINOS_TARGET_BRANCH develop
TRILINOS_TARGET_REPO https://github.com/trilinos/Trilinos
TRILINOS_TARGET_SHA d6e582d


CDash Test Results for PR# 14616.

@trilinos-autotester
Contributor

Status Flag 'Pre-Merge Inspection' - - This Pull Request Requires Inspection... The code must be inspected by a member of the Team before Testing/Merging
WARNING: NO REVIEWERS HAVE BEEN REQUESTED FOR THIS PULL REQUEST!

@trilinos-autotester
Contributor

All Jobs Finished; status = PASSED, However Inspection must be performed before merge can occur...

@brian-kelley brian-kelley added the AT: RETEST label Oct 27, 2025
@trilinos-autotester
Contributor

Status Flag 'Pull Request AutoTester' - User Requested Retest - Label AT: RETEST will be reset after testing.

@trilinos-autotester
Contributor

Status Flag 'Pull Request AutoTester' - Testing Jenkins Projects:

Pull Request Auto Testing STARTING (click to expand)

Build Information

Test Name: PR_gcc-openmpi-openmp

  • Build Num: 2546
  • Status: STARTED

Jenkins Parameters

Parameter Name Value
FORCE_CLEAN true
GENCONFIG_BUILD_NAME rhel8_sems-gnu-8.5.0-openmpi-4.1.6-openmp_release-debug_static_no-kokkos-arch_no-asan_no-complex_no-fpic_mpi_no-pt_no-rdc_no-uvm_deprecated-on_no-package-enables
PR_LABELS pkg: Tpetra;pkg: Ifpack2;impacting: performance;AT: RETEST;client: SPARC
PULLREQUESTNUM 14616
PULLREQUEST_CDASH_TRACK Pull Request
TEST_REPO_ALIAS TRILINOS
TRILINOS_NODE_LABEL rhel8
TRILINOS_SOURCE_REPO https://github.com/brian-kelley/Trilinos
TRILINOS_SOURCE_SHA bc656bf
TRILINOS_SRN_CONFIG true
TRILINOS_TARGET_BRANCH develop
TRILINOS_TARGET_REPO https://github.com/trilinos/Trilinos
TRILINOS_TARGET_SHA d6e582d

Build Information

Test Name: PR_CompSim

  • Build Num: 862
  • Status: STARTED

Jenkins Parameters

Parameter Name Value
BLOCKING_BUILD false
FORCE_CLEAN true
GENCONFIG_BUILD_NAME rhel8_aue-gcc-openmpi_release_static_no-kokkos-arch_no-asan_complex_no-fpic_mpi_no-pt_no-rdc_no-uvm_deprecated-on_compsim
PR_LABELS pkg: Tpetra;pkg: Ifpack2;impacting: performance;AT: RETEST;client: SPARC
PULLREQUESTNUM 14616
PULLREQUEST_CDASH_TRACK Pull Request (Non-blocking)
TEST_REPO_ALIAS TRILINOS
TRILINOS_NODE_LABEL rhel8
TRILINOS_SOURCE_REPO https://github.com/brian-kelley/Trilinos
TRILINOS_SOURCE_SHA bc656bf
TRILINOS_SRN_CONFIG true
TRILINOS_TARGET_BRANCH develop
TRILINOS_TARGET_REPO https://github.com/trilinos/Trilinos
TRILINOS_TARGET_SHA d6e582d

Using Repos:

Repo: TRILINOS (brian-kelley/Trilinos)
  • Branch: SymbolicRedux
  • SHA: bc656bf
  • Mode: TEST_REPO

Pull Request Author: brian-kelley

@trilinos-autotester
Contributor

Status Flag 'Pull Request AutoTester' - Jenkins Testing: all Jobs PASSED

Pull Request Auto Testing has PASSED (click to expand)

Build Information

Test Name: PR_gcc-openmpi-openmp

  • Build Num: 2546
  • Status: PASSED

Jenkins Parameters

Parameter Name Value
FORCE_CLEAN true
GENCONFIG_BUILD_NAME rhel8_sems-gnu-8.5.0-openmpi-4.1.6-openmp_release-debug_static_no-kokkos-arch_no-asan_no-complex_no-fpic_mpi_no-pt_no-rdc_no-uvm_deprecated-on_no-package-enables
PR_LABELS pkg: Tpetra;pkg: Ifpack2;impacting: performance;AT: RETEST;client: SPARC
PULLREQUESTNUM 14616
PULLREQUEST_CDASH_TRACK Pull Request
TEST_REPO_ALIAS TRILINOS
TRILINOS_NODE_LABEL rhel8
TRILINOS_SOURCE_REPO https://github.com/brian-kelley/Trilinos
TRILINOS_SOURCE_SHA bc656bf
TRILINOS_SRN_CONFIG true
TRILINOS_TARGET_BRANCH develop
TRILINOS_TARGET_REPO https://github.com/trilinos/Trilinos
TRILINOS_TARGET_SHA d6e582d

Build Information

Test Name: PR_CompSim

  • Build Num: 862
  • Status: PASSED

Jenkins Parameters

Parameter Name Value
BLOCKING_BUILD false
FORCE_CLEAN true
GENCONFIG_BUILD_NAME rhel8_aue-gcc-openmpi_release_static_no-kokkos-arch_no-asan_complex_no-fpic_mpi_no-pt_no-rdc_no-uvm_deprecated-on_compsim
PR_LABELS pkg: Tpetra;pkg: Ifpack2;impacting: performance;AT: RETEST;client: SPARC
PULLREQUESTNUM 14616
PULLREQUEST_CDASH_TRACK Pull Request (Non-blocking)
TEST_REPO_ALIAS TRILINOS
TRILINOS_NODE_LABEL rhel8
TRILINOS_SOURCE_REPO https://github.com/brian-kelley/Trilinos
TRILINOS_SOURCE_SHA bc656bf
TRILINOS_SRN_CONFIG true
TRILINOS_TARGET_BRANCH develop
TRILINOS_TARGET_REPO https://github.com/trilinos/Trilinos
TRILINOS_TARGET_SHA d6e582d


CDash Test Results for PR# 14616.

@trilinos-autotester trilinos-autotester removed the AT: RETEST label Oct 27, 2025
@trilinos-autotester
Contributor

All Jobs Finished; status = PASSED, However Inspection must be performed before merge can occur...

@trilinos-autotester
Contributor

Status Flag 'Pre-Merge Inspection' - SUCCESS: The last commit to this Pull Request has been INSPECTED AND APPROVED by [ csiefer2 ]!

@trilinos-autotester
Contributor

Status Flag 'Pull Request AutoTester' - AutoMerge IS ENABLED, but the Label AT: AUTOMERGE is not set. Either set Label AT: AUTOMERGE or manually merge the PR...

@brian-kelley
Contributor Author

brian-kelley commented Oct 29, 2025

@trilinos/framework The cuda12 and cuda12-uvm tests keep timing out (6 hour limit) during the build. This is from the last cuda12 run. It looks like the build gets close to the end but then hangs for 4 hours (21:19 to 01:20).

2025-10-28T21:18:18.8093019Z [25556/26301] Building CXX object packages/panzer/adapters-stk/test/interpolation/CMakeFiles/PanzerAdaptersSTK_interpolation.dir/interpolation.cpp.o
2025-10-28T21:18:36.7207371Z [25557/26301] Building CXX object packages/panzer/adapters-stk/test/projection/CMakeFiles/PanzerAdaptersSTK_projection.dir/projection.cpp.o
2025-10-28T21:19:37.5515336Z [25558/26301] Building CXX object packages/panzer/adapters-stk/test/projection/CMakeFiles/PanzerAdaptersSTK_ProjectField.dir/project_field.cpp.o
2025-10-29T01:20:02.7995428Z ##[error]The operation was canceled.
2025-10-29T01:20:02.8089199Z Post job cleanup.
2025-10-29T01:20:02.8829046Z [command]/usr/bin/git version
2025-10-29T01:20:02.8866241Z git version 2.43.7
2025-10-29T01:20:02.8931859Z Temporarily overriding HOME='/home/runner/_work/_temp/b118c47b-4351-4520-9d49-7b4b5fe768cf' before making global git config changes
2025-10-29T01:20:02.8936657Z Adding repository directory to the temporary git global config as a safe directory

@trilinos-autotester trilinos-autotester added the AT: STALE label Nov 2, 2025
@trilinos-autotester
Contributor

All Jobs Finished; status = PASSED, However PR is now STALE, and must be retested. Set the AT: RETEST Label to force retest....

@sebrowne
Contributor

sebrowne commented Nov 5, 2025

> @trilinos/framework The cuda12 and cuda12-uvm tests keep timing out (6 hour limit) during the build. This is from the last cuda12 run. It looks like the build gets close to the end but then hangs for 4 hours (21:19 to 01:20).
>
> 2025-10-28T21:18:18.8093019Z [25556/26301] Building CXX object packages/panzer/adapters-stk/test/interpolation/CMakeFiles/PanzerAdaptersSTK_interpolation.dir/interpolation.cpp.o
> 2025-10-28T21:18:36.7207371Z [25557/26301] Building CXX object packages/panzer/adapters-stk/test/projection/CMakeFiles/PanzerAdaptersSTK_projection.dir/projection.cpp.o
> 2025-10-28T21:19:37.5515336Z [25558/26301] Building CXX object packages/panzer/adapters-stk/test/projection/CMakeFiles/PanzerAdaptersSTK_ProjectField.dir/project_field.cpp.o
> 2025-10-29T01:20:02.7995428Z ##[error]The operation was canceled.
> 2025-10-29T01:20:02.8089199Z Post job cleanup.
> 2025-10-29T01:20:02.8829046Z [command]/usr/bin/git version
> 2025-10-29T01:20:02.8866241Z git version 2.43.7
> 2025-10-29T01:20:02.8931859Z Temporarily overriding HOME='/home/runner/_work/_temp/b118c47b-4351-4520-9d49-7b4b5fe768cf' before making global git config changes
> 2025-10-29T01:20:02.8936657Z Adding repository directory to the temporary git global config as a safe directory

That appears to be triggered by this change; I don't see any other timeouts in the recent results. Given the use of ccache, it's generally unlikely for the build to take that long without a compiler hang.

Work around parallel_scan issue; move BTD symbolic phase
to device again. And make Tpetra::Map::lazyPushToHost private again.

Signed-off-by: Brian Kelley <[email protected]>
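
The commit message above refers to Tpetra::Map::lazyPushToHost, which the patch earlier in this thread documents as "Copy the local map from device to host, if it's not on host already", alongside mutable lgMap_ and lgMapHost_ members. The following is a rough sketch of that lazy-copy pattern only, not the Tpetra code: the class LazyHostMirror, its members, and the copy counter are hypothetical, and std::vector stands in for both Kokkos Views so the example runs without Kokkos.

```cpp
#include <cassert>
#include <cstddef>
#include <utility>
#include <vector>

// Hypothetical stand-in for a Map-like class. std::vector plays the role of
// both the device View and the host View (an assumption for illustration).
class LazyHostMirror {
 public:
  explicit LazyHostMirror(std::vector<long long> deviceData)
      : device_(std::move(deviceData)) {}

  // Host-side accessor: logically const, but may fill the host cache first.
  long long globalElement(std::size_t localIndex) const {
    lazyPushToHost();
    assert(localIndex < host_.size());
    return host_[localIndex];
  }

  // Number of device-to-host copies performed so far (illustration only).
  int copyCount() const { return copies_; }

 private:
  // Copy to host only if no host copy exists yet. Kept private so callers
  // cannot depend on when the copy happens.
  void lazyPushToHost() const {
    if (!hostValid_) {
      host_ = device_;  // stands in for a device-to-host deep copy
      hostValid_ = true;
      ++copies_;
    }
  }

  std::vector<long long> device_;
  // "mutable" lets a const method populate the cache on demand.
  mutable std::vector<long long> host_;
  mutable bool hostValid_ = false;
  mutable int copies_ = 0;
};
```

Repeated calls to globalElement trigger at most one copy. The sketch is not thread-safe; concurrent first calls would need synchronization.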
@brian-kelley
Contributor Author

I replaced this with #14665 to see if the hang would go away, but it still happened.
