@kcossett-amd kcossett-amd commented Nov 27, 2025

Motivation

Closes 569092

While working on this ticket, I also noticed that the program segfaults when ROCPROFSYS_ROCM_DOMAINS=roctx is set, even though the docs say this value is supported. Went ahead and fixed that.
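For illustration only, here is a minimal sketch of how a list of valid choices can guard a comma-separated domain setting. It is not the rocprofiler-systems implementation and does not show the actual cause of the crash: `valid_domains`, `parse_domains`, and the `example_domain_*` names are hypothetical, and only `roctx` comes from this PR.

```cpp
#include <set>
#include <sstream>
#include <stdexcept>
#include <string>
#include <vector>

// Hypothetical list of accepted domain names; only "roctx" comes from this PR.
inline const std::set<std::string>& valid_domains()
{
    static const std::set<std::string> choices = { "example_domain_a",
                                                   "example_domain_b", "roctx" };
    return choices;
}

// Split a comma-separated setting and reject any name that is not a listed
// choice, so an unsupported value fails early with a clear error.
inline std::vector<std::string> parse_domains(const std::string& setting)
{
    std::vector<std::string> domains;
    std::istringstream       stream{ setting };
    std::string              name;
    while(std::getline(stream, name, ','))
    {
        if(name.empty()) continue;  // tolerate stray commas
        if(valid_domains().count(name) == 0)
            throw std::invalid_argument("unknown domain: " + name);
        domains.push_back(name);
    }
    return domains;
}
```

In this sketch, a value missing from the choices list is rejected up front rather than reaching code that does not expect it.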

Also noticed that timemory push/pop in rocprofiler-sdk.cpp was always using category::rocm_marker_api instead of CategoryT. Fixed that as well.
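A rough sketch of the category issue, assuming a templated callback that forwards to a push helper. The `category` structs here are stand-ins for the real types, and `push_region`, `pushed_categories`, `callback_before`, and `callback_after` are hypothetical names, not the code in rocprofiler-sdk.cpp.

```cpp
#include <string>
#include <vector>

// Stand-in category tags; the real category types live in the project.
namespace category
{
struct rocm_marker_api { static const char* name() { return "rocm_marker_api"; } };
struct rocm_ompt_api   { static const char* name() { return "rocm_ompt_api"; } };
}  // namespace category

// Records which category each push was attributed to (stand-in for timemory).
inline std::vector<std::string>& pushed_categories()
{
    static std::vector<std::string> log;
    return log;
}

template <typename CategoryT>
void push_region(const std::string& /*label*/)
{
    pushed_categories().emplace_back(CategoryT::name());
}

// Before: the template parameter is ignored and one category is hardcoded.
template <typename CategoryT>
void callback_before(const std::string& label)
{
    push_region<category::rocm_marker_api>(label);
}

// After: the region is attributed to the category the callback was
// instantiated with.
template <typename CategoryT>
void callback_after(const std::string& label)
{
    push_region<CategoryT>(label);
}
```

With the hardcoded version, every instantiation reports the marker category regardless of `CategoryT`; the second version reports the category that was actually requested.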

Technical Details

  • Changed a break to a return in tool_tracing_callback_start (see the comment I left).
  • Timemory push/pop now uses CategoryT as opposed to category::rocm_marker_api
  • Added roctx as a valid domain choice. This avoids the SEGFAULT.

Note: This change means that roctxRangePop will NOT appear in the wall clock output anymore. However, it should not have appeared there in the first place, since the perfetto output does not show it.
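The difference between break and return can be sketched as follows. This is an assumption about the general shape of the problem, not the actual body of tool_tracing_callback_start: `marker_op`, `region_log`, and both function names are hypothetical, and the generic push after the switch is assumed.

```cpp
#include <string>
#include <vector>

// Hypothetical operation kinds; the real callback inspects SDK records.
enum class marker_op { range_pop, other };

// Stand-in for the timemory region stack: one entry per pushed region.
using region_log = std::vector<std::string>;

// With `break`, control leaves only the switch, so the generic push below
// still runs and the pop call itself is recorded as a region.
inline void callback_start_with_break(marker_op op, const std::string& api_name,
                                      region_log& log)
{
    switch(op)
    {
        case marker_op::range_pop:
            // marker-specific handling would go here
            break;
        default: break;
    }
    log.push_back(api_name);  // assumed generic push
}

// With `return`, control leaves the whole function, so the generic push is
// skipped for the pop call.
inline void callback_start_with_return(marker_op op, const std::string& api_name,
                                       region_log& log)
{
    switch(op)
    {
        case marker_op::range_pop:
            // marker-specific handling would go here
            return;
        default: break;
    }
    log.push_back(api_name);  // still runs for every other operation
}
```

Under these assumptions, calling the first version with `marker_op::range_pop` leaves one entry in the log, while the second leaves none, which matches roctxRangePop disappearing from the wall clock output.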

Test Plan

Checked against openmp-vv-host, roctx, and a custom marker program.

Test Result

With these changes, the wall_clock tree is now as it should be. HIP, HSA, and OpenMP calls are also present in the wall clock output.

Submission Checklist

@kcossett-amd kcossett-amd force-pushed the users/kcossett-amd/roctx-timemory-wallclock branch from c8b9e47 to 8c9545c Compare November 27, 2025 15:03
@kcossett-amd kcossett-amd marked this pull request as ready for review November 27, 2025 15:09
Copilot AI review requested due to automatic review settings November 27, 2025 15:09
@kcossett-amd kcossett-amd requested a review from a team as a code owner November 27, 2025 15:09
@kcossett-amd kcossett-amd removed the request for review from jrmadsen November 27, 2025 15:09

Copilot AI left a comment

Pull request overview

This PR fixes the wall clock tree generation for ROCTX markers by addressing several interconnected issues: incorrect callback flow control, improper category usage in timemory tracing, and a missing domain configuration that caused segfaults.

Key Changes:

  • Fixed wall clock tree accuracy by changing break to return in the ROCTX callback handler to prevent incorrect timemory state tracking
  • Corrected timemory push/pop calls to use proper category types (CategoryT and category::rocm_ompt_api) instead of hardcoded category::rocm_marker_api
  • Added roctx as a valid domain choice to prevent segfaults when ROCPROFSYS_ROCM_DOMAINS=roctx is set

Reviewed changes

Copilot reviewed 2 out of 2 changed files in this pull request and generated 1 comment.

| File | Description |
| --- | --- |
| projects/rocprofiler-systems/source/lib/rocprof-sys/library/rocprofiler-sdk.cpp | Fixed callback control flow and category usage in tracing callbacks |
| projects/rocprofiler-systems/source/lib/core/rocprofiler-sdk.cpp | Added roctx as valid domain configuration |

@habajpai-amd habajpai-amd left a comment

I also feel it was a bug, so changes look good to me.

@kcossett-amd kcossett-amd force-pushed the users/kcossett-amd/roctx-timemory-wallclock branch from ffc2bd5 to dc0b8ef Compare November 28, 2025 12:38