[Issue]: Build fails with rocm 6.0.0 - undeclared identifier 'ncclFloat8e4m3' #133

@codambro

Problem Description

Building the latest rccl-tests against ROCm 6.0.0 and rccl-rocm-6.0.0 fails.
The build command and the resulting errors are:

make MPI=1 HIP_HOME=$ROCM_PATH MPI_HOME=/opt/cray/pe/mpich/default/ofi/cray/18.0 NCCL_HOME=../rccl-rocm-6.0.0/build

/scratch2/jenkins/qa/packages/x86_64/SLES-15.5/rccl-rocm-6.0.0/rccl-tests-aac5f2b56c1de08570152e0e457f234a5c1cc307/build/hipify/verifiable.cu.cpp
/scratch2/jenkins/qa/packages/x86_64/SLES-15.5/rccl-rocm-6.0.0/rccl-tests-aac5f2b56c1de08570152e0e457f234a5c1cc307/build/hipify/verifiable.cu.cpp:1326:8: error: use of undeclared identifier 'ncclFloat8e4m3'; did you mean 'ncclFloat64'?
  case ncclFloat8e4m3: CASE_TY(rccl_float8, uint8_t)
       ^~~~~~~~~~~~~~
       ncclFloat64
/scratch2/jenkins/qa/packages/x86_64/SLES-15.5/rccl-rocm-6.0.0/build/include/rccl/rccl.h:353:16: note: 'ncclFloat64' declared here
               ncclFloat64    = 8, ncclDouble     = 8,
               ^
/scratch2/jenkins/qa/packages/x86_64/SLES-15.5/rccl-rocm-6.0.0/rccl-tests-aac5f2b56c1de08570152e0e457f234a5c1cc307/build/hipify/verifiable.cu.cpp:1327:8: error: use of undeclared identifier 'ncclFloat8e5m2'; did you mean 'ncclFloat32'?
  case ncclFloat8e5m2: CASE_TY(rccl_bfloat8, uint8_t)
       ^~~~~~~~~~~~~~
       ncclFloat32
/scratch2/jenkins/qa/packages/x86_64/SLES-15.5/rccl-rocm-6.0.0/build/include/rccl/rccl.h:352:16: note: 'ncclFloat32' declared here
               ncclFloat32    = 7, ncclFloat      = 7,
               ^
/scratch2/jenkins/qa/packages/x86_64/SLES-15.5/rccl-rocm-6.0.0/rccl-tests-aac5f2b56c1de08570152e0e457f234a5c1cc307/build/hipify/verifiable.cu.cpp:1334:8: error: duplicate case value 'ncclFloat32'
  case ncclFloat32: CASE_TY(float, uint32_t)
       ^
/scratch2/jenkins/qa/packages/x86_64/SLES-15.5/rccl-rocm-6.0.0/rccl-tests-aac5f2b56c1de08570152e0e457f234a5c1cc307/build/hipify/verifiable.cu.cpp:1327:8: note: previous case defined here
  case ncclFloat8e5m2: CASE_TY(rccl_bfloat8, uint8_t)
       ^
/scratch2/jenkins/qa/packages/x86_64/SLES-15.5/rccl-rocm-6.0.0/rccl-tests-aac5f2b56c1de08570152e0e457f234a5c1cc307/build/hipify/verifiable.cu.cpp:1335:8: error: duplicate case value 'ncclFloat64'
  case ncclFloat64: CASE_TY(double, uint64_t)
       ^
/scratch2/jenkins/qa/packages/x86_64/SLES-15.5/rccl-rocm-6.0.0/rccl-tests-aac5f2b56c1de08570152e0e457f234a5c1cc307/build/hipify/verifiable.cu.cpp:1326:8: note: previous case defined here
  case ncclFloat8e4m3: CASE_TY(rccl_float8, uint8_t)
       ^
/scratch2/jenkins/qa/packages/x86_64/SLES-15.5/rccl-rocm-6.0.0/rccl-tests-aac5f2b56c1de08570152e0e457f234a5c1cc307/build/hipify/verifiable.cu.cpp:769:66: error: no member named 'mantissa_bits' in '(anonymous namespace)::FloatLayout<rccl_float8>'
  constexpr uint64_t mant_mask = (uint64_t(1) << FloatLayout<T>::mantissa_bits)-1;
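
The first two errors suggest that the rccl.h picked up through NCCL_HOME does not declare the FP8 datatype enumerators that this rccl-tests revision expects. The "duplicate case value" errors then look like a follow-on effect of the compiler recovering with its suggested identifiers (ncclFloat64, ncclFloat32). A quick way to check the header is sketched below. The helper name check_rccl_enums is hypothetical, and the example path is an assumption based on the NCCL_HOME value in the make command above.

```shell
#!/bin/sh
# Hypothetical helper (not part of rccl or rccl-tests): reports whether the
# datatype enumerators named in the errors above appear in a given rccl.h.
check_rccl_enums() {
  header="$1"
  for name in ncclFloat8e4m3 ncclFloat8e5m2; do
    # A plain substring match is enough here; it does not parse the enum.
    if grep -q -- "$name" "$header"; then
      echo "$name: present"
    else
      echo "$name: missing"
    fi
  done
}

# Example invocation (path assumed from NCCL_HOME in the make command):
# check_rccl_enums ../rccl-rocm-6.0.0/build/include/rccl/rccl.h
```

If both names are reported as missing, the header and the test sources are out of step, which would be consistent with the errors shown.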

Operating System

SLES-15.5

CPU

AMD

GPU

AMD

ROCm Version

ROCm 6.0.0

ROCm Component

No response

Steps to Reproduce

No response

(Optional for Linux users) Output of /opt/rocm/bin/rocminfo --support

No response

Additional Information

No response
