Commit 79d8af4

Enable RDC (ROCm Data Center Tool) build in TheRock (#1981)
This change integrates ROCm Data Center Tools (RDC) support into TheRock and includes comprehensive updates across the build system, third-party dependency bundling, and documentation.

Major changes:

1. RDC Feature Integration:
   - Adds a new build flag (THEROCK_ENABLE_DC_TOOLS) to enable data center tools as a group, including a specific flag (THEROCK_ENABLE_RDC) for ROCm Data Center Tool (RDC). These appear in CMake and are documented in the README.
   - Introduces a new CMake-controlled subdirectory (dctools/) for RDC, which is Linux-only and requires rocprofiler-sdk v3.
   - Integrates RDC validation and binary checks into the build and test systems.

2. gRPC Static Library Bundling:
   - Adds a new third-party/grpc/ directory to fetch, build, and bundle gRPC as a static library for RDC's use, with pre- and post-build scripts for correct static linking and symbol handling, primarily on Linux.
   - Ensures relocatability by patching install paths in pc and CMake files.

3. libcap Bundling on Linux:
   - Bundles libcap as a new third-party dependency under third-party/sysdeps/linux/libcap for privileged operations on Linux.
   - Provides full build, patching, and symbol versioning logic so the library can be bundled, relocated, and used safely within TheRock.
   - Integrates libcap as a dependency of RDC and documents its use in the developer docs.

4. Artifact and Build Infrastructure Updates:
   - Updates the artifact manifest and sysdeps dependency tracking to include libcap and gRPC in the correct build groups.
   - Ensures consistent interface, rpath, and packaging for all new libraries.

5. Documentation Updates:
   - Extends the README and developer docs to document new options, usage scenarios, and library inclusion (RDC, libcap, gRPC).
   - Updates RFC0007 to reflect the latest artifact layout for RDC and the static linking rationale.

Signed-off-by: Wang, Yanyao <[email protected]>
1 parent efd37ed commit 79d8af4

19 files changed, with 730 additions and 2 deletions

CMakeLists.txt

Lines changed: 15 additions & 0 deletions
@@ -184,6 +184,7 @@ option(THEROCK_ENABLE_COMM_LIBS "Enable building of comm libraries" "${THEROCK_E
 option(THEROCK_ENABLE_MATH_LIBS "Enable building of math libraries" "${THEROCK_ENABLE_ALL}")
 option(THEROCK_ENABLE_ML_LIBS "Enable building of ML libraries" "${THEROCK_ENABLE_ALL}")
 option(THEROCK_ENABLE_PROFILER "Enable building the profiler libraries" "${THEROCK_ENABLE_ALL}")
+option(THEROCK_ENABLE_DC_TOOLS "Enable building of data center tools" "${THEROCK_ENABLE_ALL}")
 option(THEROCK_ENABLE_HOST_MATH "Build all bundled host math libraries by default" OFF)
 option(THEROCK_RESET_FEATURES "One-shot flag which forces all feature flags to their default state for this configuration run" OFF)

@@ -261,6 +262,16 @@ else()
   set(_optional_profiler_requirements "")
 endif()

+# Systems Tools Features
+if(NOT WIN32)
+  # RDC is Linux-only and requires rocprofiler-sdk v3
+  therock_add_feature(RDC
+    GROUP DC_TOOLS
+    DESCRIPTION "Enables ROCm Data Center Tool (RDC)"
+    REQUIRES CORE_RUNTIME ROCPROFV3
+  )
+endif()
+
 # Comm-libs Features.
 if(NOT WIN32)
   # RCCL is Linux-only.
@@ -419,6 +430,7 @@ endif()

 set(THEROCK_BUNDLED_BZIP2)
 set(THEROCK_BUNDLED_ELFUTILS)
+set(THEROCK_BUNDLED_LIBCAP)
 set(THEROCK_BUNDLED_LIBDRM)
 set(THEROCK_BUNDLED_LIBLZMA)
 set(THEROCK_BUNDLED_NUMACTL)
@@ -439,6 +451,7 @@ if(THEROCK_BUNDLE_SYSDEPS)
   endif()
   set(THEROCK_BUNDLED_BZIP2 therock-bzip2)
   set(THEROCK_BUNDLED_ELFUTILS therock-elfutils)
+  set(THEROCK_BUNDLED_LIBCAP therock-libcap)
   set(THEROCK_BUNDLED_LIBDRM therock-libdrm)
   set(THEROCK_BUNDLED_LIBLZMA therock-liblzma)
   set(THEROCK_BUNDLED_NUMACTL therock-numactl)
@@ -480,6 +493,8 @@ add_subdirectory(core)
 # Note that rocprofiler-register is in base and is what higher level clients
 # depend on. The profiler itself is independent.
 add_subdirectory(profiler)
+# Data center tools (RDC, etc.)
+add_subdirectory(dctools)
 add_subdirectory(comm-libs)
 add_subdirectory(math-libs)
 add_subdirectory(ml-libs)

README.md

Lines changed: 2 additions & 0 deletions
@@ -159,6 +159,7 @@ enable/disable selected subsets:
 | `-DTHEROCK_ENABLE_MATH_LIBS=OFF` | Disables all math libraries |
 | `-DTHEROCK_ENABLE_ML_LIBS=OFF` | Disables all ML libraries |
 | `-DTHEROCK_ENABLE_PROFILER=OFF` | Disables profilers |
+| `-DTHEROCK_ENABLE_DC_TOOLS=OFF` | Disables data center tools |

 Individual features can be controlled separately (typically in combination with
 `-DTHEROCK_ENABLE_ALL=OFF` or `-DTHEROCK_RESET_FEATURES=ON` to force a
@@ -182,6 +183,7 @@ minimal build):
 | `-DTHEROCK_ENABLE_MIOPEN_PLUGIN=ON` | Enables MIOpen_plugin |
 | `-DTHEROCK_ENABLE_HIPDNN=ON` | Enables hipDNN |
 | `-DTHEROCK_ENABLE_ROCWMMA=ON` | Enables rocWMMA |
+| `-DTHEROCK_ENABLE_RDC=ON` | Enables ROCm Data Center Tool (Linux only) |

 > [!TIP]
 > Enabling any features will implicitly enable their *minimum* dependencies. Some
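
As a hedged illustration of how the README flags above might be combined, the sketch below prints (rather than runs) a configure command for a minimal RDC-only build. The `-S . -B <dir>` layout and the `rdc_configure_command` helper name are assumptions made for illustration; only the two `-DTHEROCK_...` flags come from the diff.

```shell
#!/bin/sh
# Sketch only: print a configure command that starts from a minimal build
# and enables just RDC. The build directory name is a hypothetical default.
rdc_configure_command() {
  build_dir="${1:-build}"
  printf 'cmake -S . -B %s -DTHEROCK_ENABLE_ALL=OFF -DTHEROCK_ENABLE_RDC=ON\n' "$build_dir"
}

rdc_configure_command build
```

Per the README tip, enabling a feature is expected to pull in its minimum dependencies, so the sketch does not list `CORE_RUNTIME` or `ROCPROFV3` explicitly.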

dctools/CMakeLists.txt

Lines changed: 106 additions & 0 deletions
@@ -0,0 +1,106 @@
+# Data Center Tools
+
+if(THEROCK_ENABLE_RDC)
+  if(WIN32)
+    message(FATAL_ERROR "RDC requires Linux (Windows not supported)")
+  endif()
+
+  # Get gRPC stage directory for RDC's GRPC_ROOT requirement
+  get_target_property(_grpc_stage therock-grpc THEROCK_STAGE_DIR)
+
+  therock_cmake_subproject_declare(rdc
+    EXTERNAL_SOURCE_DIR "${THEROCK_ROCM_SYSTEMS_SOURCE_DIR}/projects/rdc"
+    BINARY_DIR "${CMAKE_CURRENT_BINARY_DIR}/rdc"
+    BACKGROUND_BUILD
+
+    CMAKE_ARGS
+      -DBUILD_PROFILER=ON
+      -DBUILD_STANDALONE=ON
+      -DBUILD_RUNTIME=ON
+      -DBUILD_RVS=OFF
+      -DBUILD_TESTS=${THEROCK_BUILD_TESTING}
+      -DHIP_PLATFORM=amd
+      -DCMAKE_CXX_STANDARD=17
+      -DGRPC_ROOT=${_grpc_stage}
+
+    BUILD_DEPS
+      therock-grpc
+
+    RUNTIME_DEPS
+      ROCR-Runtime
+      amdsmi
+      rocprofiler-sdk
+      ${THEROCK_BUNDLED_LIBCAP}
+      ${THEROCK_BUNDLED_ZLIB}
+
+    INSTALL_DESTINATION
+      portable-rdc
+
+    INTERFACE_LINK_DIRS
+      lib
+
+    INTERFACE_INSTALL_RPATH_DIRS
+      lib
+  )
+  therock_cmake_subproject_glob_c_sources(rdc
+    SUBDIRS .
+  )
+  therock_cmake_subproject_provide_package(rdc rdc lib/cmake/rdc)
+  therock_cmake_subproject_activate(rdc)
+
+  # RDC validation tests (runtime tests excluded - require GPU hardware)
+
+  therock_test_validate_shared_lib(
+    PATH rdc/stage/portable-rdc/lib
+    LIB_NAMES
+      librdc_bootstrap.so
+      librdc_client.so
+  )
+
+  # Verify gRPC symbols are hidden to prevent ODR violations
+  add_test(
+    NAME rdc-validate-no-grpc-symbol-pollution
+    COMMAND bash -c "! nm -D ${CMAKE_CURRENT_BINARY_DIR}/rdc/stage/portable-rdc/lib/librdc_client.so* 2>/dev/null | grep -q ' T.*grpc::'"
+  )
+  set_tests_properties(rdc-validate-no-grpc-symbol-pollution PROPERTIES
+    LABELS "rdc;symbol-visibility"
+  )
+
+  # Verify binaries exist
+  add_test(
+    NAME rdc-verify-binaries-exist
+    COMMAND bash -c "test -x ${CMAKE_CURRENT_BINARY_DIR}/rdc/stage/portable-rdc/bin/rdcd && test -x ${CMAKE_CURRENT_BINARY_DIR}/rdc/stage/portable-rdc/bin/rdci"
+  )
+  set_tests_properties(rdc-verify-binaries-exist PROPERTIES
+    LABELS "rdc;build-verification"
+  )
+
+  # Verify library SONAME structure
+  add_test(
+    NAME rdc-verify-library-sonames
+    COMMAND bash -c "test -L ${CMAKE_CURRENT_BINARY_DIR}/rdc/stage/portable-rdc/lib/librdc.so && test -L ${CMAKE_CURRENT_BINARY_DIR}/rdc/stage/portable-rdc/lib/librdc_client.so && test -L ${CMAKE_CURRENT_BINARY_DIR}/rdc/stage/portable-rdc/lib/librdc_bootstrap.so"
+  )
+  set_tests_properties(rdc-verify-library-sonames PROPERTIES
+    LABELS "rdc;build-verification"
+  )
+
+  # Provide RDC artifact (portable-rdc/ structure per RFC0007)
+  therock_provide_artifact(rdc
+    TARGET_NEUTRAL
+    DESCRIPTOR artifact-rdc.toml
+    COMPONENTS
+      dbg
+      dev
+      doc
+      lib
+      run
+      test
+    SUBPROJECT_DEPS
+      amdsmi
+      rocprofiler-sdk
+      rdc
+      ${THEROCK_BUNDLED_LIBCAP}
+      ${THEROCK_BUNDLED_ZLIB}
+  )
+
+endif(THEROCK_ENABLE_RDC)
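
The `rdc-validate-no-grpc-symbol-pollution` test above greps `nm -D` output for `grpc::`. Assuming GNU `nm`, which prints mangled C++ names unless `-C`/`--demangle` is passed, that pattern may never match, and the `2>/dev/null` redirect would also let a failing `nm` pass silently. The sketch below is one possible stricter variant, not the check used in the commit: the helper names `has_exported_grpc_symbol` and `check_no_grpc_exports` are hypothetical, and the `_ZN4grpc` / `_ZNK4grpc` prefixes assume Itanium C++ name mangling.

```shell
#!/bin/sh
# Reads `nm -D` style lines on stdin. Exits 0 if any exported text symbol
# (type T) mentions the grpc namespace, either demangled ("grpc::") or
# Itanium-mangled ("_ZN4grpc", "_ZNK4grpc").
has_exported_grpc_symbol() {
  grep -Eq ' T .*(grpc::|_ZNK?4grpc)'
}

# Hypothetical wrapper: 0 = clean, 1 = gRPC symbols exported, 2 = nm failed.
check_no_grpc_exports() {
  lib="$1"
  # Capture nm output first so that an nm failure is reported, not hidden.
  syms=$(nm -D "$lib") || return 2
  if printf '%s\n' "$syms" | has_exported_grpc_symbol; then
    return 1
  fi
  return 0
}
```

A call such as `check_no_grpc_exports path/to/librdc_client.so` could then back the CTest command. Like the original pattern, this only inspects `T` symbols, so weak (`W`) template instantiations are not covered.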

dctools/artifact-rdc.toml

Lines changed: 30 additions & 0 deletions
@@ -0,0 +1,30 @@
+# RDC (ROCm Data Center Tool)
+# Portable distribution with statically-linked gRPC per RFC0007
+
+[components.dbg."dctools/rdc/stage/portable-rdc"]
+
+[components.dev."dctools/rdc/stage/portable-rdc"]
+
+[components.doc."dctools/rdc/stage/portable-rdc"]
+include = [
+  "share/doc/**",
+]
+
+[components.lib."dctools/rdc/stage/portable-rdc"]
+include = [
+  "lib/*.so*",
+  "lib/rdc/*.so*",
+  "lib/rdc/hsaco/**/*.hsaco",
+]
+
+[components.run."dctools/rdc/stage/portable-rdc"]
+include = [
+  "bin/**",
+  "libexec/**",
+  "share/rdc/conf/**",
+]
+
+[components.test."dctools/rdc/stage/portable-rdc"]
+include = [
+  "share/rdc/rdctst_tests/**",
+]

docs/development/dependencies.md

Lines changed: 9 additions & 0 deletions
@@ -26,6 +26,7 @@ project wide:
 bundling is not enabled or supported for the target OS):
 - `THEROCK_BUNDLED_BZIP2`
 - `THEROCK_BUNDLED_ELFUTILS`
+- `THEROCK_BUNDLED_LIBCAP`
 - `THEROCK_BUNDLED_LIBDRM`
 - `THEROCK_BUNDLED_LIBLZMA`
 - `THEROCK_BUNDLED_NUMACTL`
@@ -68,6 +69,14 @@ Supported sub-libraries: `libelf`, `libdw`.
 - Import library: `libdw::libdw`
 - Alternatives: `pkg_check_modules(DW libdw)`

+## libcap
+
+Provides Linux capabilities for privileged operations (used by RDC).
+
+- Canonical method: `find_package(Libcap)`
+- Import library: `Libcap::Libcap`
+- Alternatives: `pkg_check_modules(LIBCAP libcap)` or direct linking (used by RDC)
+
 ## libdrm

 Supported sub-libraries: `libdrm`, `libdrm_amdgpu`

docs/rfcs/RFC0007-rdc-therock-integration.md

Lines changed: 4 additions & 2 deletions
@@ -67,8 +67,10 @@ portable-rdc/
 │   ├── librdc_rocr.so         # 1MB (optional module)
 │   ├── librdc_rocp.so         # 500KB (optional module)
 │   └── librdc_rvs.so          # 200KB (optional module)
-└── etc/
-    └── rdc_options.conf
+└── share/
+    └── rdc/
+        └── conf/
+            └── rdc_options.conf   # User-level configuration
 ```

 **Static Linking Rationale:**

third-party/CMakeLists.txt

Lines changed: 4 additions & 0 deletions
@@ -37,3 +37,7 @@ if(THEROCK_BUNDLE_SYSDEPS)
     add_subdirectory(sysdeps/windows)
   endif()
 endif()
+
+# gRPC: Static library for RDC (built on-demand when depended upon)
+# See: docs/rfcs/RFC0007-rdc-therock-integration.md
+add_subdirectory(grpc)

third-party/grpc/CMakeLists.txt

Lines changed: 56 additions & 0 deletions
@@ -0,0 +1,56 @@
+# gRPC static library for RDC
+therock_subproject_fetch(therock-grpc-sources
+  CMAKE_PROJECT
+  # Originally mirrored from: https://github.com/grpc/grpc.git (git tag v1.67.1)
+  URL "https://rocm-third-party-deps.s3.us-east-2.amazonaws.com/grpc-v1.67.1.tar.gz"
+  URL_HASH "SHA256=ed0051771a1ed0ec6583be4ddb0fe9ee4473bc2f8a59d214ff49290ed5edb4ef"
+)
+
+set(_binary_dir "${CMAKE_CURRENT_BINARY_DIR}/build")
+
+therock_cmake_subproject_declare(therock-grpc
+  EXTERNAL_SOURCE_DIR "${CMAKE_CURRENT_BINARY_DIR}/source"
+  BINARY_DIR "${_binary_dir}"
+  NO_MERGE_COMPILE_COMMANDS
+  BACKGROUND_BUILD
+  OUTPUT_ON_FAILURE
+
+  CMAKE_ARGS
+    # Static linking configuration
+    -DBUILD_SHARED_LIBS=OFF
+    -DgRPC_BUILD_SHARED_LIBS=OFF
+    -DgRPC_PREFER_STATIC_LIBS=ON
+    -Dprotobuf_BUILD_SHARED_LIBS=OFF
+
+    # Dependency providers
+    -DgRPC_ZLIB_PROVIDER=package
+    -DgRPC_ABSL_PROVIDER=module
+    -DgRPC_CARES_PROVIDER=module
+    -DgRPC_RE2_PROVIDER=module
+    -DgRPC_SSL_PROVIDER=module
+    -DgRPC_PROTOBUF_PROVIDER=module
+
+    # Disable tests and unnecessary language plugins
+    -DgRPC_BUILD_TESTS=OFF
+    -DgRPC_BUILD_CSHARP_EXT=OFF
+    -DgRPC_BUILD_GRPC_CSHARP_PLUGIN=OFF
+    -DgRPC_BUILD_GRPC_NODE_PLUGIN=OFF
+    -DgRPC_BUILD_GRPC_OBJECTIVE_C_PLUGIN=OFF
+    -DgRPC_BUILD_GRPC_PHP_PLUGIN=OFF
+    -DgRPC_BUILD_GRPC_PYTHON_PLUGIN=OFF
+    -DgRPC_BUILD_GRPC_RUBY_PLUGIN=OFF
+
+    # Installation configuration
+    -DgRPC_INSTALL=ON
+    -DCMAKE_INSTALL_LIBDIR=lib
+    "-DCMAKE_INSTALL_RPATH=\\$ORIGIN:\\$ORIGIN/../lib"
+    -DCMAKE_BUILD_RPATH_USE_ORIGIN=ON
+    -DCMAKE_SKIP_BUILD_RPATH=OFF
+
+  RUNTIME_DEPS
+    therock-zlib
+)
+therock_cmake_subproject_provide_package(therock-grpc gRPC lib/cmake/grpc)
+therock_cmake_subproject_provide_package(therock-grpc protobuf lib/cmake/protobuf)
+therock_cmake_subproject_activate(therock-grpc)
+add_dependencies(therock-third-party therock-grpc)
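
The `URL_HASH` entry above pins the mirrored tarball to a SHA256 digest. As a rough sketch of the same check done by hand (for example before uploading a regenerated tarball), the hypothetical helper `verify_sha256` below compares a file's digest with an expected value. It assumes `sha256sum` from GNU coreutils is available.

```shell
#!/bin/sh
# Hypothetical helper: succeed only if the SHA256 digest of FILE equals EXPECTED.
verify_sha256() {
  file="$1"
  expected="$2"
  actual=$(sha256sum "$file" | awk '{print $1}')
  [ -n "$actual" ] && [ "$actual" = "$expected" ]
}

# Example invocation (commented out because it needs the real tarball on disk):
# verify_sha256 grpc-v1.67.1.tar.gz \
#   ed0051771a1ed0ec6583be4ddb0fe9ee4473bc2f8a59d214ff49290ed5edb4ef
```

A non-zero exit status here would mean the local file is not the one the fetch step accepts.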
Lines changed: 69 additions & 0 deletions
@@ -0,0 +1,69 @@
+#!/bin/bash
+set -euo pipefail
+
+# gRPC source tarball generation script
+
+GRPC_VERSION="${GRPC_VERSION:-v1.67.1}"
+GRPC_TAG="${GRPC_VERSION}"
+OUTPUT_DIR="${PWD}/tarballs"
+TEMP_DIR=$(mktemp -d)
+
+echo "===================================="
+echo "gRPC Tarball Generation"
+echo "===================================="
+echo "Version: ${GRPC_VERSION}"
+echo "Tag: ${GRPC_TAG}"
+echo "Output: ${OUTPUT_DIR}"
+echo "Temp: ${TEMP_DIR}"
+echo ""
+
+# Create output directory
+mkdir -p "${OUTPUT_DIR}"
+
+# Clone gRPC repository
+echo "Step 1: Cloning gRPC repository..."
+cd "${TEMP_DIR}"
+git clone --depth 1 --branch "${GRPC_TAG}" --recurse-submodules \
+  https://github.com/grpc/grpc.git grpc-${GRPC_VERSION}
+
+# Enter the cloned directory
+cd "grpc-${GRPC_VERSION}"
+
+# Ensure all submodules are initialized and updated
+echo ""
+echo "Step 2: Syncing submodules..."
+git submodule sync --recursive
+git submodule update --init --recursive --depth 1
+
+# Remove all .git directories and files
+echo ""
+echo "Step 3: Removing .git directories..."
+find . -name ".git" -exec rm -rf {} + 2>/dev/null || true
+find . -name ".gitignore" -delete
+find . -name ".gitmodules" -delete
+find . -name ".gitattributes" -delete
+
+# Return to temp directory
+cd "${TEMP_DIR}"
+
+# Create tarball
+echo ""
+echo "Step 4: Creating tarball..."
+TARBALL_NAME="grpc-${GRPC_VERSION}.tar.gz"
+tar -czf "${OUTPUT_DIR}/${TARBALL_NAME}" "grpc-${GRPC_VERSION}"
+
+# Calculate SHA256 hash
+echo ""
+echo "Step 5: Calculating SHA256 hash..."
+cd "${OUTPUT_DIR}"
+SHA256=$(sha256sum "${TARBALL_NAME}" | awk '{print $1}')
+
+# Display results
+echo ""
+echo "===================================="
+echo "Tarball created successfully!"
+echo "===================================="
+echo "File: ${OUTPUT_DIR}/${TARBALL_NAME}"
+echo "Size: $(du -h "${TARBALL_NAME}" | cut -f1)"
+echo "SHA256: ${SHA256}"
+echo ""
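
The script above creates `TEMP_DIR` with `mktemp -d` but, as written, does not remove it, so the full gRPC checkout stays on disk after the tarball is produced. One possible way to handle this, sketched under the assumption that the clone is not needed afterwards, is an `EXIT` trap. The helper name `with_temp_dir` is hypothetical and not part of the commit.

```shell
#!/bin/sh
# Hypothetical helper: run a command inside a fresh temporary directory and
# delete that directory when the subshell exits, on success or on failure.
with_temp_dir() {
  (
    tmp=$(mktemp -d) || exit 1
    trap 'rm -rf "$tmp"' EXIT
    cd "$tmp" || exit 1
    "$@"
  )
}

# Example: print the temporary directory; it is already gone once this returns.
with_temp_dir pwd
```

In the tarball script itself, the equivalent would be a single `trap 'rm -rf "${TEMP_DIR}"' EXIT` line placed right after `TEMP_DIR` is assigned.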
