
Issues with cudaHostAlloc - Pinned Memory in Container #226

@tstrutz

Description

1. Issue or feature description

I am trying to figure out why my container cannot allocate a block of pinned memory larger than about 1GB of RAM. One of our algorithms uses a fixed 2GB pool of pinned memory that it pulls blocks from and returns them to. This code works natively on every system I have tested it on, but inside Docker it fails. I am using Docker Desktop for Windows with WSL2.
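
For context, the pool in question works roughly like the sketch below (illustrative only; the class name, block size, and error handling are my assumptions, not the production code): a single pinned slab is allocated up front and carved into fixed-size blocks that are handed out and returned through a free list.

#include <cuda_runtime_api.h>

#include <cstddef>
#include <stdexcept>
#include <vector>

class PinnedPool
{
public:
    PinnedPool ( std::size_t block_size, std::size_t block_count )
        : block_size_ ( block_size )
    {
        // One up-front pinned allocation for the whole pool.
        if ( cudaMallocHost( &base_, block_size * block_count ) != cudaSuccess )
            throw std::runtime_error ( "cudaMallocHost failed" );
        for ( std::size_t i = 0; i < block_count; ++i )
            free_list_.push_back ( static_cast<char *>( base_ ) + i * block_size );
    }
    ~PinnedPool () { cudaFreeHost ( base_ ); }

    PinnedPool ( const PinnedPool & ) = delete;
    PinnedPool & operator= ( const PinnedPool & ) = delete;

    // Pull a block from the pool; returns nullptr when exhausted.
    void * pull ()
    {
        if ( free_list_.empty () ) return nullptr;
        void * p = free_list_.back ();
        free_list_.pop_back ();
        return p;
    }

    // Return a block to the pool.
    void push ( void * p ) { free_list_.push_back ( p ); }

private:
    std::size_t block_size_;
    void * base_ = nullptr;
    std::vector<void *> free_list_;
};

Constructing, for example, PinnedPool pool( 1ull << 20, 2048 ) pins 2GiB in one shot, which is exactly the allocation that fails below.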

Here is the code I am using. I made it as simple as possible to just test the memory allocation issues:

#include <cuda_runtime_api.h>

#include <iostream>
#include <cstdio>
#include <stdexcept>
#include <string>

/**
 * @def checkCudaError( cudaError )
 * @brief Macro to call the cuda check function. Simply wrap this macro around any cuda runtime api library calls being made.
 * @param cudaError result from a cuda runtime api function call.
 * @ingroup CUDA
 */
#define checkCudaError( cudaError ) __checkCudaError( cudaError, __FILE__, __LINE__ )

/**
 * @brief Checks the return value of any cuda function for errors.
 * If an error occurs, the line number and error type are displayed for debugging purposes. Additionally,
 * the application is terminated when an error is encountered. This is preferred to the older approach of
 * checking the last error after the fact, as it isolates the error to a single line and file.
 * NOTE: This function is not called directly; invoke it via the checkCudaError macro above.
 *
 * This can be used with kernel launches as well, by calling cudaPeekAtLastError and cudaDeviceSynchronize and
 * wrapping both functions with the macro above.
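 *
 * Example (illustrative; my_kernel, grid, and block are placeholders):
 *   my_kernel<<<grid, block>>>( args );
 *   checkCudaError( cudaPeekAtLastError() );
 *   checkCudaError( cudaDeviceSynchronize() );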
 * @ingroup CUDA
 * @param result_t result from a cuda runtime api library function
 * @param file File that the cuda check was used in
 * @param line Line number of the cuda check
 */
inline void __checkCudaError ( cudaError_t result_t, const char * file, const int line )
{
    // Ignore both success and driver shutting down when throwing exceptions.
    if ( cudaSuccess != result_t && cudaErrorCudartUnloading != result_t )
    {
        fprintf ( stderr, "\x1B[31m CUDA error encountered in file '%s', line %d\n Error %d: %s\n Terminating FIRE!\n \x1B[0m", file, line, result_t,
               cudaGetErrorString ( result_t ) );
        std::cerr << "CUDA error encountered: " + std::string( cudaGetErrorString ( result_t ) ) + ". Terminating application." << std::endl;
        throw std::runtime_error ( "checkCUDAError : ERROR: CUDA Error" );
    }
}


int main( int argc, char * argv[] )
{
    void * data;
    void * gpu;
    checkCudaError(cudaSetDevice(0));
    checkCudaError(cudaMalloc(&gpu, 2147483648ull));
    int attr;
    checkCudaError(cudaDeviceGetAttribute(&attr, cudaDevAttrHostRegisterSupported, 0));
    std::cout << "Host Register supported: " << attr << std::endl;
    checkCudaError(cudaFree(gpu));

    checkCudaError(cudaMallocHost(&data, 1024ull*1024ull*1024ull*2ull));
    checkCudaError(cudaFreeHost(data));
}
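
To pin down where the ceiling actually sits inside the container, a small probe like the one below (my addition, not part of the original report) binary-searches for the largest cudaMallocHost allocation that succeeds:

#include <cuda_runtime_api.h>

#include <cstddef>
#include <iostream>

int main()
{
    // Binary-search the largest pinned allocation between 0 and 4 GiB,
    // to 1 MiB resolution.
    std::size_t lo = 0, hi = 4ull << 30;
    while ( hi - lo > ( 1ull << 20 ) )
    {
        std::size_t mid = lo + ( hi - lo ) / 2;
        void * p = nullptr;
        if ( cudaMallocHost( &p, mid ) == cudaSuccess )
        {
            cudaFreeHost( p );
            lo = mid;               // mid bytes fit; try larger
        }
        else
        {
            cudaGetLastError();     // clear the error before retrying
            hi = mid;               // mid bytes failed; try smaller
        }
    }
    std::cout << "Largest pinned allocation: ~" << ( lo >> 20 ) << " MiB" << std::endl;
}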

My Dockerfile:

FROM nvidia/cuda:11.8.0-devel-ubuntu22.04
ARG DEBIAN_FRONTEND=noninteractive

RUN apt-get update

ENV CMAKE_VERSION 3.18.4
ENV CMAKE_SH cmake-${CMAKE_VERSION}-Linux-x86_64.sh
ENV CMAKE_URL https://github.com/Kitware/CMake/releases/download/v$CMAKE_VERSION/$CMAKE_SH

RUN apt-get install -y wget

RUN mkdir /cmake && cd /cmake \
    && wget --no-check-certificate $CMAKE_URL \
    && chmod +x ${CMAKE_SH} \
    && ./${CMAKE_SH} --prefix=/usr/local --skip-license \
    && cmake --version

ADD . /test
RUN mkdir -p /test/build

RUN cd /test/build && \
    cmake .. && make

WORKDIR /test/build/

ENTRYPOINT ["./test_cuda"]

CMake build script used by the Dockerfile:

cmake_minimum_required(VERSION 3.18)

project(TEST LANGUAGES CUDA CXX)

find_package( CUDAToolkit REQUIRED)

add_executable( test_cuda test_cuda.cpp)
target_link_libraries( test_cuda PRIVATE CUDA::cudart)
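
For reference, the same test can also be built without CMake (a convenience sketch; the include and library paths assume the CUDA toolkit's default location inside the nvidia/cuda image):

g++ test_cuda.cpp -I/usr/local/cuda/include -L/usr/local/cuda/lib64 -lcudart -o test_cuda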

I am using a T1000 GPU with 4GB of dedicated VRAM. All code runs natively on the system in question without issue.

2. Steps to reproduce the issue

  1. Create a folder with the following files:
    a. Dockerfile - Contents of Docker code above.
    b. test_cuda.cpp - Contents of C++/CUDA source code.
    c. CMakeLists.txt - Contents of CMake code above.
  2. Go to this folder in a terminal.
  3. Build the image: docker build -t cuda_test:latest .
  4. Run the container. This is the command I am currently using: docker run --gpus=all --ulimit memlock=-1 --rm cuda_test:latest

Resulting output:

 CUDA error encountered in file '/test/test_cuda.cpp', line 60
 Error 2: out of memory
 Terminating FIRE!
 CUDA error encountered: out of memory. Terminating application.
terminate called after throwing an instance of 'std::runtime_error'
  what():  checkCUDAError : ERROR: CUDA Error
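
The out of memory error here is consistent with the cudaMallocHost call failing, matching the description above. One environment-level factor worth ruling out (my suggestion, an assumption rather than a confirmed cause) is the memory cap of the WSL2 VM that Docker Desktop runs on, since pinned host memory must come out of that VM's RAM. The cap can be raised in %UserProfile%\.wslconfig on the Windows host, for example:

[wsl2]
# Assumed value; size this to the physical RAM of the host.
memory=16GB

followed by wsl --shutdown so the new limit takes effect on the next start.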

3. Information to attach (optional if deemed irrelevant)

  • Some nvidia-container information: nvidia-container-cli -k -d /dev/tty info
  • Kernel version from uname -a
  • Any relevant kernel output lines from dmesg
  • Driver information from nvidia-smi -a
  • Docker version from docker version
  • NVIDIA packages version from dpkg -l '*nvidia*' or rpm -qa '*nvidia*'
  • NVIDIA container library version from nvidia-container-cli -V
  • NVIDIA container library logs (see troubleshooting)
  • Docker command, image and tag used

Labels: lifecycle/stale (an issue or PR that has remained open with no activity and has become stale)