[REVIEW] Feat: PR #2 decoding with nvImageCodec v0.6.0 #978
base: release/25.12
- Fix libjpeg-turbo cmake configuration for both cuslide and cuslide2
- Update nvimgcodec cmake dependency configuration
- Update examples CMakeLists
- Update build scripts and documentation
Signed-off-by: cdinea <[email protected]>
Co-authored-by: jakirkham <[email protected]>
    fmt::print(" ℹ️ Using CPU buffer for ROI decoding\n");
    #endif
    }
    else if (gpu_available)
Can the GPU be unavailable here? Is that a valid case if the target is the GPU? Maybe you should raise an error instead. The user is probably expecting a device buffer in this case, and returning a host buffer could break things.
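A minimal sketch of the fail-fast behaviour suggested here, under stated assumptions: `TargetDevice`, `BufferKind`, and `select_output_buffer` are hypothetical names for illustration and are not the cuCIM types or the PR's code. The point is only that a CUDA request with no usable GPU raises an error rather than quietly returning a host buffer.

```cpp
#include <stdexcept>

// Hypothetical stand-ins for illustration; these are not the cuCIM types.
enum class TargetDevice { Cpu, Cuda };
enum class BufferKind { Host, Device };

// Pick the output buffer kind. A CUDA request is never silently downgraded
// to a host buffer: if no GPU is usable, the caller gets an exception.
BufferKind select_output_buffer(TargetDevice target, bool gpu_available)
{
    if (target == TargetDevice::Cpu)
    {
        return BufferKind::Host;
    }
    if (!gpu_available)
    {
        throw std::runtime_error("CUDA output requested but no GPU is available");
    }
    return BufferKind::Device;
}
```

With this shape, a caller that asked for a device buffer either gets one or sees an explicit failure, which matches the expectation described in the comment.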
    #endif
    return true; // roi_stream, image, decode_future cleaned up by RAII
    }
    catch (const std::exception& e)
This catch, and the others in the nvimgcodec/ directory, are probably not needed. What can throw here? The CUDA and nvImageCodec APIs certainly will not throw.
Thank you for the feedback, @mkepa-nv - CUDA and nvImageCodec APIs don't throw; they return error codes. However, the catch blocks might still be necessary because:
- we throw std::runtime_error for error conditions (e.g., the GPU availability check)
- fmt::format() can throw on formatting errors or memory allocation failures
- string operations, vector resizing, etc. can throw std::bad_alloc
    {
    switch (kind)
    {
    case 1: // NVIMGCODEC_METADATA_KIND_MED_APERIO
I would use enum named values from nvImageCodec header
Already addressed, @mkepa-nv - I've replaced all hardcoded integer values with the proper enum constants.
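As a sketch of the change being discussed, the snippet below switches on named constants instead of bare integers. `MetadataKindSketch` and `vendor_for_kind` are hypothetical and exist only so the example compiles without the nvImageCodec header; the values 1 (MED_APERIO) and 2 (MED_PHILIPS) are the ones mentioned in this PR, and real code would use the constants from the nvImageCodec header directly.

```cpp
#include <string>

// Local stand-in for the nvImageCodec metadata-kind enum (assumption: only
// the two values mentioned in this PR are listed; the real header is
// authoritative for names and values).
enum MetadataKindSketch : int
{
    kMedAperio = 1,  // NVIMGCODEC_METADATA_KIND_MED_APERIO per the diff comment
    kMedPhilips = 2  // MED_PHILIPS per the PR description
};

// Map a metadata kind to a vendor label using named cases.
std::string vendor_for_kind(int kind)
{
    switch (kind)
    {
    case kMedAperio:
        return "Aperio";
    case kMedPhilips:
        return "Philips";
    default:
        return "unknown";
    }
}
```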
    {
    loader->enqueue(std::move(decode_func),
    cucim::loader::TileInfo{ location_index, index, tiledata_offset, tiledata_size });
    fmt::print("🔍 Executing decode_func directly (FORCED SINGLE-THREADED)\n");
Are all those new prints temporary?
Thank you for the feedback, @jantonguirao - I wrapped the debug prints in #ifdef DEBUG guards in this commit.
    if (TARGET CUDA::nvjpeg_static)
        target_link_libraries(${CUCIM_PLUGIN_NAME}
            PRIVATE
                # Add nvjpeg before cudart so that nvjpeg.h in static library takes precedence.
                CUDA::nvjpeg_static
                # Add CUDA::culibos to link necessary methods for 'deps::nvjpeg_static'
                CUDA::culibos
                CUDA::cudart
        )
    else()
        target_link_libraries(${CUCIM_PLUGIN_NAME}
            PRIVATE
                CUDA::nvjpeg
                CUDA::cudart
        )
    endif()
nvImageCodec uses dlopen to load the nvjpeg shared-object library from system. Are we using nvjpeg directly anywhere?
Thank you for the feedback, @jantonguirao - this is a good observation. cuslide2 currently reuses ifd.cpp from the original cuslide plugin, which uses NvJpegProcessor for batch JPEG decoding, so the nvjpeg linking is still needed. We could refactor ifd.cpp to use pure nvImageCodec and remove the nvjpeg dependency.
    @@ -0,0 +1,256 @@
    /*
     * Copyright (c) 2020-2022, NVIDIA CORPORATION.
nit:
    -  * Copyright (c) 2020-2022, NVIDIA CORPORATION.
    +  * Copyright (c) 2020-2025, NVIDIA CORPORATION.
thank you @jantonguirao - updated the year in the header in this commit
    uint32_t width, uint32_t height,
    uint8_t** output_buffer,
    const cucim::io::Device& out_device);
    #endif // CUCIM_HAS_NVIMGCODEC
You defined a dummy version of decode_ifd_region_nvimgcodec for the case where CUCIM_HAS_NVIMGCODEC is false (nvimgcodec_decoder.cpp:379). Shouldn't you expose this declaration unconditionally, then?
Thank you for the feedback, @jantonguirao - good catch. I addressed this feedback in this commit.
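The pattern under discussion can be sketched in a single file as follows. This is an illustration under assumptions, not the PR's code: `decode_region_sketch` is a simplified hypothetical signature, and `SKETCH_HAS_NVIMGCODEC` stands in for `CUCIM_HAS_NVIMGCODEC`. The declaration sits outside the conditional so callers compile in both configurations, and only the definition varies.

```cpp
#include <cstdint>

// Declaration is visible unconditionally, so callers compile either way.
bool decode_region_sketch(uint32_t x, uint32_t y,
                          uint32_t width, uint32_t height,
                          uint8_t** output_buffer);

#ifdef SKETCH_HAS_NVIMGCODEC  // hypothetical macro, stand-in for CUCIM_HAS_NVIMGCODEC
// Placeholder for the enabled build: a real implementation would decode the
// region through nvImageCodec here. This sketch only validates its inputs.
bool decode_region_sketch(uint32_t, uint32_t,
                          uint32_t width, uint32_t height,
                          uint8_t** output_buffer)
{
    return width != 0 && height != 0 && output_buffer != nullptr;
}
#else
// Stub for builds without nvImageCodec: report failure and hand back no buffer.
bool decode_region_sketch(uint32_t, uint32_t, uint32_t, uint32_t,
                          uint8_t** output_buffer)
{
    if (output_buffer != nullptr)
    {
        *output_buffer = nullptr;
    }
    return false;
}
#endif
```

In a real header/source split, the declaration would live in the header and the two definitions in the decoder's translation unit.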
    @@ -0,0 +1,95 @@
    /*
     * Copyright (c) 2021, NVIDIA CORPORATION.
2025
Thank you for the feedback, @jantonguirao - I updated the year in the copyright header in this commit.

    uint32_t ThreadBatchDataLoader::request(uint32_t load_size)
    {
    #ifdef DEBUG
If those debug messages are meant to stay, I'd recommend a preprocessor macro that would wrap fmt::print with the ifdef DEBUG condition. This way we avoid cluttering the codebase with many #ifdef DEBUG statements.
This is a good point, @jantonguirao - I made a note to address this in a subsequent PR, as it might affect the first cuslide plugin and we need to test it properly.
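One possible shape for the suggested macro, as a sketch only: `CUCIM_DEBUG_PRINT`, `debug_prints_enabled`, and `request_sketch` are hypothetical names, and `std::fprintf` is used instead of `fmt::print` purely to keep the example free of external dependencies. When `DEBUG` is not defined the macro expands to a no-op, so its arguments are not evaluated.

```cpp
#include <cstdio>

// Hypothetical macro name. The codebase would wrap fmt::print; std::fprintf
// is used here only so the sketch has no external dependencies.
#ifdef DEBUG
#define CUCIM_DEBUG_PRINT(...) std::fprintf(stderr, __VA_ARGS__)
#else
#define CUCIM_DEBUG_PRINT(...) ((void)0)
#endif

// Reports whether debug prints are compiled in for this translation unit.
constexpr bool debug_prints_enabled()
{
#ifdef DEBUG
    return true;
#else
    return false;
#endif
}

// Example call site: no #ifdef is needed around the print itself.
unsigned request_sketch(unsigned load_size)
{
    CUCIM_DEBUG_PRINT("request(load_size=%u)\n", load_size);
    return load_size;
}
```

Because the arguments disappear in non-debug builds, call sites should avoid putting side effects inside the macro invocation.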
scripts/test_aperio_svs.py
Outdated
    plugin_lib = repo_root / "cpp/plugins/cucim.kit.cuslide2/build-release/lib"

    if not plugin_lib.exists():
        plugin_lib = repo_root / "install/lib"
nit:
    - plugin_lib = repo_root / "install/lib"
    + plugin_lib = repo_root / "install" / "lib"
Thank you, @jantonguirao - good suggestion, addressed in this commit.
    }
    }

    TEST_CASE("Verify raw tiff read", "[test_read_rawtiff.cpp]")
I would like to see some comment explaining what this test does.
Thank you, @jantonguirao - I added a comment in [this commit](https://github.com//pull/978/commits/bab2aaadaa32032d42c8c9327efb838b489d6c00). @gigony FYI
/ok to test bab2aaa
/ok to test 9a0e8f5
cuslide2: GPU-Accelerated decoding via nvImageCodec v0.6.0
Overview
cucim.kit.cuslide2 plugin implementing GPU-accelerated whole-slide imaging (WSI) decoding using nvImageCodec v0.6.0. Replaces CPU-based decoders (libjpeg-turbo, OpenJPEG, libtiff) with GPU equivalents.
Vendor Support:
Key Features
cucim.kit.cuslide
Build Instructions
Create Conda Environment:
Build cuslide2 Plugin:
Install Python Package:
    # Install in editable mode for development
    python -m pip install --editable python/cucim
Testing
Unit Tests (26 tests)
    cd python/cucim
    python -m pytest tests/unit/clara/test_tiff_read_region.py -v
Expected Output:
    ================================== test session starts ==================================
    platform linux -- Python 3.13.9, pytest-8.4.2, pluggy-1.6.0 -- /home/cdinea/miniconda3/envs/cucimcuda/bin/python
    cachedir: .pytest_cache
    rootdir: /home/cdinea/Downloads/cucim_pr2/branchremote/cucim/python/cucim
    configfile: pyproject.toml
    plugins: cov-7.0.0, xdist-3.8.0, lazy-fixtures-1.4.0
    collected 26 items

    tests/unit/clara/test_tiff_read_region.py::test_tiff_stripe_inner[testimg_tiff_stripe_32x24_16_jpeg] PASSED [ 3%]
    tests/unit/clara/test_tiff_read_region.py::test_tiff_stripe_boundary[testimg_tiff_stripe_32x24_16_jpeg] PASSED [ 7%]
    tests/unit/clara/test_tiff_read_region.py::test_tiff_stripe_outside[testimg_tiff_stripe_32x24_16_jpeg] PASSED [ 11%]
    tests/unit/clara/test_tiff_read_region.py::test_tiff_stripe_inner[testimg_tiff_stripe_32x24_16_deflate] PASSED [ 15%]
    tests/unit/clara/test_tiff_read_region.py::test_tiff_stripe_boundary[testimg_tiff_stripe_32x24_16_deflate] PASSED [ 19%]
    tests/unit/clara/test_tiff_read_region.py::test_tiff_stripe_outside[testimg_tiff_stripe_32x24_16_deflate] PASSED [ 23%]
    tests/unit/clara/test_tiff_read_region.py::test_tiff_stripe_inner[testimg_tiff_stripe_32x24_16_raw] PASSED [ 26%]
    tests/unit/clara/test_tiff_read_region.py::test_tiff_stripe_boundary[testimg_tiff_stripe_32x24_16_raw] PASSED [ 30%]
    tests/unit/clara/test_tiff_read_region.py::test_tiff_stripe_outside[testimg_tiff_stripe_32x24_16_raw] PASSED [ 34%]
    tests/unit/clara/test_tiff_read_region.py::test_tiff_outside_of_resolution_level[testimg_tiff_stripe_4096x4096_256_jpeg] PASSED [ 38%]
    tests/unit/clara/test_tiff_read_region.py::test_tiff_stripe_multiresolution[testimg_tiff_stripe_4096x4096_256_jpeg] PASSED [ 42%]
    tests/unit/clara/test_tiff_read_region.py::test_region_image_level_data[testimg_tiff_stripe_4096x4096_256_jpeg] PASSED [ 46%]
    tests/unit/clara/test_tiff_read_region.py::test_region_image_dtype[testimg_tiff_stripe_4096x4096_256_jpeg] PASSED [ 50%]
    tests/unit/clara/test_tiff_read_region.py::test_tiff_iterator[testimg_tiff_stripe_4096x4096_256_jpeg] PASSED [ 53%]
    tests/unit/clara/test_tiff_read_region.py::test_tiff_outside_of_resolution_level[testimg_tiff_stripe_4096x4096_256_deflate] PASSED [ 57%]
    tests/unit/clara/test_tiff_read_region.py::test_tiff_stripe_multiresolution[testimg_tiff_stripe_4096x4096_256_deflate] PASSED [ 61%]
    tests/unit/clara/test_tiff_read_region.py::test_region_image_level_data[testimg_tiff_stripe_4096x4096_256_deflate] PASSED [ 65%]
    tests/unit/clara/test_tiff_read_region.py::test_region_image_dtype[testimg_tiff_stripe_4096x4096_256_deflate] PASSED [ 69%]
    tests/unit/clara/test_tiff_read_region.py::test_tiff_iterator[testimg_tiff_stripe_4096x4096_256_deflate] PASSED [ 73%]
    tests/unit/clara/test_tiff_read_region.py::test_tiff_outside_of_resolution_level[testimg_tiff_stripe_4096x4096_256_raw] PASSED [ 76%]
    tests/unit/clara/test_tiff_read_region.py::test_tiff_stripe_multiresolution[testimg_tiff_stripe_4096x4096_256_raw] PASSED [ 80%]
    tests/unit/clara/test_tiff_read_region.py::test_region_image_level_data[testimg_tiff_stripe_4096x4096_256_raw] PASSED [ 84%]
    tests/unit/clara/test_tiff_read_region.py::test_region_image_dtype[testimg_tiff_stripe_4096x4096_256_raw] PASSED [ 88%]
    tests/unit/clara/test_tiff_read_region.py::test_tiff_iterator[testimg_tiff_stripe_4096x4096_256_raw] PASSED [ 92%]
    tests/unit/clara/test_tiff_read_region.py::test_array_interface_support PASSED [ 96%]
    tests/unit/clara/test_tiff_read_region.py::test_cuda_array_interface_support PASSED [100%]

    ================================== 26 passed in 4.04s ===================================
Aperio SVS Validation
Tests:
Key Results:
`📥 Downloading Aperio SVS test file...
✅ Test file already exists: /tmp/CMU-1-Small-Region.svs
✅ Plugin configuration: /tmp/.cucim_aperio_test.json
🔬 Testing cuslide2 plugin with Aperio SVS
📁 File: /tmp/CMU-1-Small-Region.svs
📂 Loading SVS file...
🔧 Creating IFD[0] from nvImageCodec metadata
Dimensions: 2220x2967, 3 channels, 8 bits/sample
Codec: jpeg (compression=7)
✅ IFD[0] initialization complete
🔧 Creating IFD[1] from nvImageCodec metadata
Dimensions: 574x768, 3 channels, 8 bits/sample
✅ IFD[1] initialization complete
🔧 Creating IFD[2] from nvImageCodec metadata
Dimensions: 387x463, 3 channels, 8 bits/sample
✅ IFD[2] initialization complete
🔧 Creating IFD[3] from nvImageCodec metadata
Dimensions: 1280x431, 3 channels, 8 bits/sample
✅ IFD[3] initialization complete
✅ Loaded in 0.380s
📊 Image Information:
Dimensions: [2967, 2220, 3]
Levels: 3
Device: cpu
🔍 Resolution Levels:
Level 0: 2220x2967 (downsample: 1.0x)
Level 1: 1280x431 (downsample: 4.3x)
Level 2: 387x463 (downsample: 6.1x)
🚀 Testing GPU decode (nvImageCodec)...
✅ GPU decode successful!
Time: 0.5288s
Shape: [512, 512, 3]
🖥️ Testing CPU decode (baseline)...
✅ CPU decode successful!
Time: 0.0029s
📏 Testing larger tile (2048x2048)...
GPU: 0.0092s
CPU: 0.0174s
🎯 Speedup: 1.90x
✅ Test completed successfully!
`
Philips TIFF Validation
Tests:
Key Results:
`✅ Plugin configuration: /tmp/.cucim_philips_test.json
🔬 Testing Philips TIFF with cuslide2
📁 File: /tmp/Philips-1.tiff
📂 Loading Philips TIFF file...
🔧 Creating IFD[0] from nvImageCodec metadata
Dimensions: 45056x35840, 3 channels, 8 bits/sample
✅ IFD[0] initialization complete
[... 7 more IFDs ...]
✅ Loaded in 0.379s
📊 Image Information:
Format: Philips TIFF
Dimensions: [35840, 45056, 3]
Levels: 8
🔍 Resolution Levels:
Level 0: 45056x35840 (downsample: 1.0x)
Level 1: 22528x17920 (downsample: 2.0x)
Level 2: 11264x8960 (downsample: 4.0x)
Level 3: 5632x4480 (downsample: 8.0x)
Level 4: 2816x2240 (downsample: 16.0x)
Level 5: 1408x1120 (downsample: 32.0x)
Level 6: 704x560 (downsample: 64.0x)
Level 7: 352x280 (downsample: 128.0x)
📋 Philips Metadata:
✅ Found 22 Philips metadata entries
DICOM_PIXEL_SPACING: [0.000226891, 0.000226907]
DICOM_MANUFACTURER: Hamamatsu
PIM_DP_IMAGE_TYPE: WSI
... and 16 more entries
📏 Pixel Spacing:
DICOM Pixel Spacing: 0.2269 x 0.2269 μm/pixel
🚀 Testing GPU decode (nvImageCodec)...
✅ GPU decode successful!
Time: 0.5250s
Shape: [512, 512, 3]
🖥️ Testing CPU decode...
✅ CPU decode successful:
Time: 0.0014s
Pixel sum: 189181125, mean: 240.56
📏 Testing larger tile (2048x2048)...
✅ GPU: 0.0168s
🔀 Testing multi-level reads...
✅ Level 0: 0.0010s ([512, 512, 3])
✅ Level 1: 0.0009s ([512, 512, 3])
✅ Level 2: 0.0007s ([512, 512, 3])
✅ Philips TIFF test completed!
`
Technical Implementation
Decoding Pipeline:
- NvImageCodecTiffParser extracts IFD metadata using nvimgcodecCodeStreamCreateFromFile
- nvimgcodecCodeStreamCreateFromCodeStreamByIndex
- nvimgcodecDecoderDecode with region parameters (x, y, width, height)
- cudaMalloc for GPU or cucim_malloc for CPU
Metadata Workarounds (nvImageCodec v0.6.0):
<DataObject ObjectType="DPUfsImport">) OR metadata blob kind=2 (MED_PHILIPS)
Known Limitations
Migration Guide
No code changes required for existing cuCIM users: