@rapids-bot rapids-bot bot commented Nov 20, 2025

Forward-merge triggered by push to release/25.12 that creates a PR to keep main up-to-date. If this PR is unable to be immediately merged due to conflicts, it will remain open for the team to manually merge. See forward-merger docs for more info.

## Description
RAPIDS has deployed an autoscaling cloud build cluster that can be used
to accelerate building large RAPIDS projects.

This PR updates the conda and wheel builds to use the build cluster.

This contributes to
rapidsai/build-planning#228.
@rapids-bot rapids-bot bot requested review from a team as code owners November 20, 2025 17:31
@rapids-bot rapids-bot bot requested a review from gforsyth November 20, 2025 17:31
@GPUtester GPUtester merged commit 2e44f5a into main Nov 20, 2025

rapids-bot bot commented Nov 20, 2025

SUCCESS - forward-merge complete.


greptile-apps bot commented Nov 20, 2025

Greptile Overview

Greptile Summary

This forward-merge from release/25.12 integrates sccache-dist distributed build cluster support across the entire CI/CD pipeline. The changes enable faster compilation by distributing builds across a cluster and leveraging preprocessor-level caching in S3.

Key Changes:

  • Added node_type: cpu8 and authentication configuration to all GitHub Actions workflows
  • Updated build scripts to use sccache --stop-server instead of --zero-stats for better distributed mode handling
  • Configured preprocessor cache mode (SCCACHE_S3_USE_PREPROCESSOR_CACHE_MODE=true) across all build contexts
  • Added comprehensive sccache-dist environment variables with sensible defaults to conda recipes
  • Disabled distributed compilation for CMake compiler tests to avoid overhead (SCCACHE_NO_DIST_COMPILE=1)
  • Set appropriate timeouts (7140 seconds), retry policies (infinite retries), and error logging
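
The build-script side of these changes can be sketched roughly as follows. This is a minimal illustration, not the PR's actual script: the function names `reset_sccache_server` and `configure_sccache_env` are hypothetical, and only the `sccache --stop-server` call, the `2>/dev/null || true` guard, and the `SCCACHE_S3_USE_PREPROCESSOR_CACHE_MODE=true` setting are taken from the summary.

```shell
#!/usr/bin/env bash
# Sketch only: function names are hypothetical, not taken from the PR.
set -euo pipefail

# Stop any running sccache server so the next compile starts a fresh one
# that picks up the environment exported below. Failures are ignored
# because there may be no server running (or no sccache on PATH).
reset_sccache_server() {
  sccache --stop-server 2>/dev/null || true
}

# Export the cache setting quoted in the review summary.
configure_sccache_env() {
  export SCCACHE_S3_USE_PREPROCESSOR_CACHE_MODE=true
}

reset_sccache_server
configure_sccache_env
echo "preprocessor cache mode: ${SCCACHE_S3_USE_PREPROCESSOR_CACHE_MODE}"
```

Stopping the server rather than only zeroing its statistics presumably matters because a long-lived server keeps the configuration it was started with; the summary itself only says the change gives "better distributed mode handling".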

Impact:
This is an infrastructure optimization that should significantly reduce build times without affecting functionality. All changes are configuration-only, with no logic modifications to the codebase itself.

Confidence Score: 5/5

  • This PR is safe to merge with minimal risk - it only adds build infrastructure optimizations without changing application logic
  • Score reflects that all changes are purely configuration-related for sccache-dist integration. The changes are consistent across all files, follow established patterns, include proper error handling (e.g., 2>/dev/null || true), and are standard forward-merge changes from a release branch. No application code or logic is modified.
  • No files require special attention - this is a clean forward-merge of infrastructure improvements

Important Files Changed

File Analysis

| Filename | Score | Overview |
| --- | --- | --- |
| .github/workflows/build.yaml | 5/5 | Added node_type: cpu8 and sccache-dist-token-secret-name to all build jobs for distributed sccache support |
| .github/workflows/pr.yaml | 5/5 | Added node_type: cpu8 and sccache-dist-token-secret-name to all build and test jobs for distributed sccache support |
| ci/build_cpp.sh | 5/5 | Changed from sccache --zero-stats to sccache --stop-server pattern for better sccache-dist integration |
| ci/build_python.sh | 5/5 | Changed from sccache --zero-stats to sccache --stop-server pattern for better sccache-dist integration |
| ci/build_wheel.sh | 5/5 | Changed from sccache --zero-stats to sccache --stop-server and added preprocessor cache configuration for wheels |
| cmake/rapids_config.cmake | 5/5 | Added SCCACHE_NO_DIST_COMPILE=1 to disable distributed compilation for CMake compiler tests |
| conda/recipes/libwholegraph/recipe.yaml | 5/5 | Added comprehensive sccache-dist environment variables with defaults and preprocessor cache configuration |
| conda/recipes/pylibwholegraph/recipe.yaml | 5/5 | Added comprehensive sccache-dist environment variables with defaults and preprocessor cache configuration |
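
The `SCCACHE_NO_DIST_COMPILE=1` entry above is applied in `cmake/rapids_config.cmake`, whose contents are not shown on this page. As a rough shell analogue of the same idea, the sketch below scopes that variable to a single command so that later commands are unaffected. The helper name `run_without_dist` is hypothetical.

```shell
#!/usr/bin/env bash
# Sketch only: a shell analogue of scoping SCCACHE_NO_DIST_COMPILE to one step.
# The real change lives in CMake; this helper name is hypothetical.
set -euo pipefail

# Run the given command with distributed compilation disabled.
# The variable is set only in that command's environment.
run_without_dist() {
  SCCACHE_NO_DIST_COMPILE=1 "$@"
}

# The child process sees the variable...
run_without_dist sh -c 'echo "inside: ${SCCACHE_NO_DIST_COMPILE:-unset}"'
# ...while the calling shell's environment is left as it was.
echo "outside: ${SCCACHE_NO_DIST_COMPILE:-unset}"
```

Small probe compilations, such as compiler checks, are short enough that sending them to a remote cluster would likely cost more than it saves, which matches the "avoid overhead" rationale in the summary.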

Sequence Diagram

sequenceDiagram
    participant GH as GitHub Actions
    participant Build as Build Job
    participant Sccache as sccache-dist
    participant S3 as S3 Cache
    participant Rattler as rattler-build

    Note over GH,Rattler: sccache-dist Build Cluster Integration

    GH->>Build: Start with node_type: cpu8
    GH->>Build: Configure authentication
    
    Build->>Build: Set sccache-dist environment<br/>(timeouts, scheduler, auth type)
    Build->>Sccache: Stop existing server
    Note right of Sccache: Clean slate for new build

    Build->>Build: Enable preprocessor cache mode
    Note right of Build: SCCACHE_S3_USE_PREPROCESSOR_CACHE_MODE=true

    Build->>Rattler: Start build

    loop For each compilation
        Rattler->>Sccache: Request compilation
        Sccache->>S3: Check preprocessor cache
        alt Cache Hit
            S3-->>Sccache: Return cached result
        else Cache Miss
            Sccache->>Sccache: Compile on distributed cluster
            Sccache->>S3: Store in preprocessor cache
        end
        Sccache-->>Rattler: Return compiled object
    end

    Rattler-->>Build: Build complete
    Build->>Sccache: Show advanced stats
    Build->>Sccache: Stop server gracefully
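
The last two steps of the diagram (show stats, then stop the server) might look like the sketch below in a build script. It assumes an sccache version that provides `--show-adv-stats`; the helper name `finish_sccache` is hypothetical, and the guard for a missing `sccache` binary is an addition for illustration, not something the PR is known to contain.

```shell
#!/usr/bin/env bash
# Sketch only: end-of-build cleanup; helper name is hypothetical.
set -euo pipefail

finish_sccache() {
  if command -v sccache >/dev/null 2>&1; then
    # Print cache statistics for the build log; never fail the build on this.
    sccache --show-adv-stats || true
    # Shut the server down so the next job starts from a clean state.
    sccache --stop-server >/dev/null 2>&1 || true
  else
    echo "sccache not found; skipping stats" >&2
  fi
}

finish_sccache
```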

@greptile-apps greptile-apps bot left a comment

10 files reviewed, no comments
