Description
We recently upgraded our internal development cluster (Xeon Skylake Gold 6140/38 with SLES12 SP2). While rebuilding the STAT debug tool and its dependencies, such as LaunchMON, we encountered hangs in the LaunchMON smoke tests test.attach_1 and test.launch_1.
Versions in use:
LaunchMON 1.0.2
gcc 5.4.0
openmpi 1.10.7
slurm 16.05.10-2
(These are the versions specified by our BKC build recipe. We plan to move to the newest versions in stages once the baseline has been re-established.)
To help our debugging, we are hoping to get logs or traces of successful runs of these two smoke tests on a similar configuration. We believe comparing them against our failing runs would point us toward the configuration and build settings we need to adjust. We are also collecting strace logs to narrow down the hang point, but we are having trouble interpreting them because we lack in-depth familiarity with how the tests and the library operate.
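As a rough illustration of how the strace logs might be narrowed down, the sketch below lists, per PID, the system call that was started but never completed. It assumes the logs were collected with `strace -f -o <file>`, so each line starts with a PID and interrupted calls are marked `<unfinished ...>` and later `<... name resumed>`. The function name `pending_syscalls` and the completion heuristic are illustrative assumptions, not part of LaunchMON or strace.

```python
import re

# Lines written by `strace -f -o FILE` begin with the PID of the traced process.
LINE_RE = re.compile(r"^(\d+)\s+(.*)$")
CALL_RE = re.compile(r"^(\w+)\(")
RESUMED_RE = re.compile(r"^<\.\.\. (\w+) resumed>")
# Heuristic (assumption): a finished call ends with ") = <result>".
COMPLETE_RE = re.compile(r"\)\s+= \S")


def pending_syscalls(lines):
    """Return {pid: syscall_name} for calls that started but never returned."""
    pending = {}
    for line in lines:
        m = LINE_RE.match(line.rstrip("\n"))
        if not m:
            continue
        pid, rest = int(m.group(1)), m.group(2)
        # A resumed call or a process exit clears whatever was pending.
        if RESUMED_RE.match(rest) or rest.startswith("+++"):
            pending.pop(pid, None)
            continue
        call = CALL_RE.match(rest)
        if not call:
            continue  # signal deliveries and other non-syscall lines
        if rest.endswith("<unfinished ...>") or not COMPLETE_RE.search(rest):
            pending[pid] = call.group(1)
        else:
            pending.pop(pid, None)
    return pending


if __name__ == "__main__":
    import sys

    # Usage (hypothetical file name): python pending.py strace.out
    with open(sys.argv[1]) as fh:
        for pid, name in sorted(pending_syscalls(fh).items()):
            print(f"pid {pid} is blocked in {name}()")
```

Running this on the log from a hung test.attach_1 or test.launch_1 run would give a short list of processes and the calls they are waiting in, which could then be compared against the same list from a successful run.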