@darshdinger
Contributor

Short description of the changes:

This PR adds support for cataloging FITS and TIFF image files for the VENUS instrument, addressing the 30-minute delay when cataloging 1500+ images. The implementation uses ONCat's batch API to catalog images in groups of 50, and discovers the image files through the PV metadata.entry.daslogs.bl10:exp:im:imagefilepath.value.

Long description of the changes:

The VENUS instrument produces 1500+ image files per experiment, and cataloging them one at a time previously took 30 minutes. This PR moves image cataloging from the DAQ side to the post-processing agent and catalogs the images in batches instead.

Key changes:

  • Added image_files() function to discover FITS and TIFF images from metadata paths
  • Added batches() helper function to split image lists into chunks of 50
  • Updated ingest() method in ONCatProcessor to catalog images using oncat.Datafile.batch()
  • The feature only activates when the specific PV exists in the datafile metadata, ensuring no impact on other instruments
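
As a rough illustration of the two helpers listed above, the following is a minimal sketch and not the code merged in this PR. It assumes the PV value resolves to a directory on disk, that image files are recognized by the extensions .fits, .tif and .tiff, and that results are returned as sorted path strings; the bodies and signatures shown here are hypothetical and only the function names come from the description.

```python
from pathlib import Path
from typing import List, Sequence

# Assumption: these extensions identify FITS and TIFF images.
IMAGE_EXTENSIONS = {".fits", ".tif", ".tiff"}


def batches(items: Sequence[str], size: int = 50) -> List[List[str]]:
    """Split items into consecutive chunks of at most `size` elements."""
    if size <= 0:
        raise ValueError("size must be a positive integer")
    return [list(items[start:start + size]) for start in range(0, len(items), size)]


def image_files(directory: str) -> List[str]:
    """Return sorted paths of FITS/TIFF files found directly in `directory`.

    Assumption: a missing directory yields an empty list rather than an error,
    so that cataloging the main datafile is never blocked by missing images.
    """
    root = Path(directory)
    if not root.is_dir():
        return []
    matches = [
        entry
        for entry in root.iterdir()
        if entry.is_file() and entry.suffix.lower() in IMAGE_EXTENSIONS
    ]
    return [str(entry) for entry in sorted(matches)]
```

With a chunk size of 50, a list of 120 paths is split into chunks of 50, 50 and 20, and an empty list produces no chunks at all.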

Testing:

  • Added 12 comprehensive unit tests covering all new functions and edge cases
  • Added integration test test_oncat_catalog_venus_images() to verify end-to-end workflow
  • Updated fake ONCat server to support /api/datafiles/batch endpoint
  • All existing tests continue to pass with no regressions

The implementation closely follows the prototype script ingest-datafile and is isolated to the ONCat cataloging processor.

Check list for the pull request

  • I have read the [CONTRIBUTING]
  • I have read the [CODE_OF_CONDUCT]
  • I have added tests for my changes
  • I have updated the documentation accordingly

Check list for the reviewer

  • I have read the [CONTRIBUTING]
  • I have verified the proposed changes
  • best software practices
    • all internal functions have a leading underscore, as is the Python convention
    • clearly named variables (better to be verbose in variable names)
    • code comments explaining the intent of code blocks
  • All the tests are passing
  • The documentation is up to date
  • code comments added when explaining intent

Manual test for the reviewer

Prerequisites:

  • Docker and docker-compose installed
  • Access to integration test environment

Testing steps:

  1. Run unit tests to verify new functionality:

    pixi run test-unit

    Expected: All 55 tests pass (12 new tests for oncat_processor)

  2. Run integration tests (requires Docker):

    cd tests/integration
    docker-compose up -d
    pixi run test-integration
    docker-compose down

    Expected: The new test_oncat_catalog_venus_images test passes, verifying:

    • Main NeXus file is cataloged
    • Batch API is called with 3 image files
    • All image files are logged in the fake ONCat server
  3. Verify backward compatibility:

    • Existing CORELLI and EQSANS tests should pass unchanged
    • Instruments without the image filepath PV should work as before

Expected behavior:

  • VENUS datafiles with the PV metadata.entry.daslogs.bl10:exp:im:imagefilepath.value will have their image files discovered and batch-cataloged
  • Other instruments continue to work as before with no changes
  • Large numbers of images (1500+) are efficiently cataloged in batches of 50
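
The expected behavior above can be sketched as follows. This is an illustration under stated assumptions rather than the processor's actual code: the metadata is modeled as a plain nested dict, `lookup` and `catalog_in_batches` are hypothetical names, and `send_batch` is a stand-in callable for whatever performs the real batch request (the description names oncat.Datafile.batch() for that role, whose exact signature is not shown here).

```python
from typing import Any, Callable, List, Optional

# Dotted key from the PR description; the PV name itself contains no dots.
IMAGE_PATH_KEY = "metadata.entry.daslogs.bl10:exp:im:imagefilepath.value"


def lookup(document: dict, dotted_key: str) -> Optional[Any]:
    """Walk a nested dict along a dotted key; return None if any part is missing."""
    current: Any = document
    for part in dotted_key.split("."):
        if not isinstance(current, dict) or part not in current:
            return None
        current = current[part]
    return current


def catalog_in_batches(
    paths: List[str],
    send_batch: Callable[[List[str]], None],
    batch_size: int = 50,
) -> int:
    """Send paths in chunks of `batch_size`; return the number of calls made."""
    calls = 0
    for start in range(0, len(paths), batch_size):
        send_batch(paths[start:start + batch_size])
        calls += 1
    return calls
```

Under these assumptions, a datafile without the PV makes `lookup` return None, so the image step is skipped entirely, while 1500 image paths result in 1500 / 50 = 30 batch calls instead of 1500 individual ones.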

References

  • Prototype script: /home/dzj/Downloads/ingest-datafile (attached to original issue)
  • Related to IBM EWM work item: EWM # 14429

@codecov

codecov bot commented Jan 15, 2026

Codecov Report

✅ All modified and coverable lines are covered by tests.
✅ Project coverage is 81.88%. Comparing base (767c088) to head (6062c28).
⚠️ Report is 1 commit behind head on main.

Additional details and impacted files
@@            Coverage Diff             @@
##             main     #101      +/-   ##
==========================================
+ Coverage   81.40%   81.88%   +0.48%     
==========================================
  Files          16       16              
  Lines        1264     1292      +28     
==========================================
+ Hits         1029     1058      +29     
+ Misses        235      234       -1     

@darshdinger darshdinger marked this pull request as ready for review January 20, 2026 17:58
@darshdinger darshdinger force-pushed the ewm14429_add_catalogging_of_images_for_VENUS branch from 6d2397b to 6062c28 Compare January 20, 2026 18:06
@darshdinger darshdinger merged commit 7243643 into main Jan 21, 2026
8 checks passed
@darshdinger darshdinger deleted the ewm14429_add_catalogging_of_images_for_VENUS branch January 21, 2026 13:36