@The-Obstacle-Is-The-Way

Summary

This PR adds support for NIfTI (.nii, .nii.gz) neuroimaging files in the Dataset Viewer, enabling viewing of brain imaging datasets such as BIDS-formatted neuroimaging studies.

Changes:

  • Bump datasets dependency from 4.1.1 to ^4.4.1 (includes native Nifti type)
  • Add nibabel ^5.0.0 dependency for NIfTI file handling
  • Implement full NIfTI processing pipeline: asset creation, URL signing, and row handling
  • Handle both encoded (bytes/dict) and decoded (nibabel) Nifti cells
  • Preserve original file extension (.nii vs .nii.gz)
  • Support all nibabel image types (Nifti1Image, Nifti2Image, etc.)
  • Add test fixtures with minimal 2x2x2 NIfTI test file
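As a rough illustration of the "preserve original file extension" item above, the sketch below picks `.nii.gz` or `.nii` for a stored asset. The function name `nifti_extension` is hypothetical and is not taken from this PR, and the fallback to gzip magic bytes when no path is available is an assumption rather than something the PR states.

```python
import gzip
from typing import Optional

# First two bytes of every gzip stream (RFC 1952).
GZIP_MAGIC = b"\x1f\x8b"


def nifti_extension(path: Optional[str], data: Optional[bytes] = None) -> str:
    """Return ".nii.gz" or ".nii" for a NIfTI asset (hypothetical helper).

    The original file name wins when it is known. Otherwise the payload is
    sniffed for the gzip magic number (an assumption of this sketch), and
    ".nii" is used as the default.
    """
    if path:
        lowered = path.lower()
        # Check the compound suffix first so ".nii.gz" is not reduced to ".gz".
        if lowered.endswith(".nii.gz"):
            return ".nii.gz"
        if lowered.endswith(".nii"):
            return ".nii"
    if data is not None and data[:2] == GZIP_MAGIC:
        return ".nii.gz"
    return ".nii"


if __name__ == "__main__":
    print(nifti_extension("sub-01_T1w.nii.gz"))
    print(nifti_extension(None, gzip.compress(b"payload")))
```

Checking the compound suffix explicitly avoids `os.path.splitext`, which would report only `.gz` for a `.nii.gz` file.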

Files Modified

| File | Change |
| --- | --- |
| libs/libcommon/pyproject.toml | Bump datasets, add nibabel |
| libs/libcommon/src/libcommon/viewer_utils/features.py | Add nifti() handler and dispatch |
| libs/libcommon/src/libcommon/viewer_utils/asset.py | Add NiftiSource and create_nifti_file() |
| libs/libcommon/src/libcommon/url_preparator.py | Add Nifti URL signing |
| libs/libcommon/src/libcommon/viewer_utils/rows.py | Prevent Nifti column truncation |
| libs/libapi/src/libapi/rows_utils.py | Enable multithreaded Nifti uploads |
| libs/libcommon/tests/ | Add Nifti test fixtures |

Test Plan

  • All 3 new Nifti tests pass (test_get_cell_value_value[nifti-True/False], test_to_features_list[nifti])
  • All existing tests pass
  • Ruff linting passes on all modified source files

Affected Dataset

This fixes the Dataset Viewer for: https://huggingface.co/datasets/hugging-science/arc-aphasia-bids

Closes #3272

Add support for NIfTI (.nii, .nii.gz) neuroimaging files in the
Dataset Viewer. This enables viewing of brain imaging datasets
such as BIDS-formatted neuroimaging studies.

Changes:
- Bump datasets dependency from 4.1.1 to ^4.4.1 (includes Nifti type)
- Add nibabel ^5.0.0 dependency for NIfTI file handling
- Add NiftiSource TypedDict and create_nifti_file() in asset.py
- Add nifti() handler and dispatch in features.py
- Handle both encoded (bytes/dict) and decoded (nibabel) Nifti cells
- Update url_preparator.py for Nifti URL signing
- Update rows.py to prevent truncation of Nifti columns
- Update rows_utils.py for multithreaded Nifti uploads
- Add test fixtures with minimal 2x2x2 NIfTI test file

Closes #1 (Nifti Support for Neuroimaging Datasets)

- Use hasattr(value, 'to_bytes') instead of isinstance check
  to support Nifti1Image, Nifti2Image, and other nibabel types
- Preserve original file extension (.nii vs .nii.gz)
- Remove unused nibabel import
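
The duck-typing change described above can be sketched roughly as follows. This is not the code from `features.py`: `cell_to_bytes` and `FakeNiftiImage` are hypothetical names, the stand-in class only mimics the `to_bytes()` method that nibabel images expose, and the `{"bytes": ...}` dict layout for encoded cells is an assumption.

```python
from typing import Any


class FakeNiftiImage:
    """Stand-in for a nibabel image; only mimics its to_bytes() method."""

    def __init__(self, payload: bytes) -> None:
        self._payload = payload

    def to_bytes(self) -> bytes:
        return self._payload


def cell_to_bytes(value: Any) -> bytes:
    """Normalize an encoded or decoded Nifti cell to raw bytes (sketch)."""
    # Encoded cell: raw bytes.
    if isinstance(value, (bytes, bytearray)):
        return bytes(value)
    # Encoded cell: dict carrying the payload (assumed layout).
    if isinstance(value, dict) and value.get("bytes") is not None:
        return bytes(value["bytes"])
    # Decoded cell: any object that can serialize itself, which covers
    # Nifti1Image, Nifti2Image, and similar types without naming them.
    # int also has a to_bytes method, so it is excluded explicitly.
    if hasattr(value, "to_bytes") and not isinstance(value, int):
        return value.to_bytes()
    raise TypeError(f"Unsupported Nifti cell type: {type(value).__name__}")


if __name__ == "__main__":
    print(cell_to_bytes(FakeNiftiImage(b"nifti-payload")))
```

The explicit `int` guard is an addition of this sketch; a bare `hasattr` check would also accept integers, whose `to_bytes` has a different meaning.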

Regenerate all service and job lockfiles to pull in datasets 4.4.1
which includes the Nifti feature type. This fixes the CI failures
where services couldn't import Nifti from datasets.

Updated lockfiles:
- libs/libapi/poetry.lock
- services/admin/poetry.lock
- services/api/poetry.lock
- services/rows/poetry.lock
- services/search/poetry.lock
- services/sse-api/poetry.lock
- services/webhook/poetry.lock
- services/worker/poetry.lock
- jobs/cache_maintenance/poetry.lock
- jobs/mongodb_migration/poetry.lock