Data compression for screen data #81
schewskone wants to merge 38 commits into sensorium-competition:main from
Conversation
…ges and mp4 videos
… for compatibility with allen-exporter and future datasets to avoid duplicate files
Adding a small fix for the ToTensor issue to the decoder PR because a separate PR is unnecessary
Pull Request Overview
This PR adds support for compressed screen data by extending trials to handle encoded images and videos, updating tests and data generation to work with JPEG/MP4, and adjusting interpolation interfaces and CI.
- Introduce EncodedImageTrial and EncodedVideoTrial with compressed data loaders
- Enhance test utilities and screen interpolation tests for both encoded and raw formats
- Update interpolation API to return only data, adjust downstream consumers, and add FFmpeg to CI
Reviewed Changes
Copilot reviewed 8 out of 9 changed files in this pull request and generated 2 comments.
Show a summary per file
| File | Description |
|---|---|
| tests/test_sequence_interpolator.py | Update tests for new single-array return value of interpolate |
| tests/test_screen_interpolator.py | Add parameterized tests for encoded vs. raw screen data |
| tests/create_screen_data.py | Extend data generation to save JPEG/MP4 and emit metadata |
| experanto/interpolators.py | Remove valid returns, introduce image_names flag, add encoded trials |
| experanto/experiment.py | Adjust interpolate to drop valid output |
| experanto/datasets.py | Update dataset pipelines to match new return signature |
| configs/default.yaml | Add image_names configuration |
| .github/workflows/test.yml | Install FFmpeg for encoding dependencies |
Comments suppressed due to low confidence (4)
experanto/interpolators.py:152
- The return type annotation still indicates a tuple, but the method now returns only a single array. Update the type hint and docstring to reflect the new signature.
def interpolate(self, times: np.ndarray) -> tuple[np.ndarray, np.ndarray]:
experanto/interpolators.py:425
- [nitpick] Using `format` shadows the built-in Python function. Consider renaming this variable to `file_format` to avoid confusion.
format = metadata.get("file_format")
tests/create_screen_data.py:14
- The new `encoded` parameter isn't documented. Please update the function docstring to explain what `encoded` does and what formats it controls.
def create_screen_data(
tests/create_screen_data.py:118
- The code calls `shutil.rmtree` but `shutil` is not imported. Add `import shutil` at the top of the module.
shutil.rmtree(SCREEN_ROOT)
```diff
-frames = np.random.rand(n_frames, *frame_shape).astype(np.float32)
+# Generate frames with values in [0, 255] for better encoding
+frames = (np.random.rand(n_frames, *frame_shape) * 255).astype(np.uint8)
```
@schewskone why is it now np.uint8 and not float32?
I don't think all of the asserts for valid should be removed here.
@schewskone could you please take a look and plug some of them back, e.g. lines 37 and 41-43 probably should stay, same for 71 and 75-77
(I merged it with the current main - check if you disagree with sth)
```python
    file_ext = ".jpg"
else:
    # Save as numpy array (original behavior)
    np.save(data_dir / f"{i:05d}.npy", frames[i].astype(np.uint8))
```
@schewskone it used to be np.float32 here in the test as well - why have you changed it, and does it match the real exported data?
Also, why are there no grayscale .jpeg files saved (e.g. 1-channel ones)?
```python
# Initialize the decoder only once per unique file
# Assuming ScreenTrial.create or a helper can return just a decoder
if self.device == "cuda":
    with set_cuda_backend("beta"):
```
@schewskone could you please add a comment here explaining what set_cuda_backend("beta") does? And is it necessary here because moving decoded data from CPU to GPU after interpolating is much slower? Have you profiled it?
```python
# 2. Logic to handle shared video decoders
decoder_to_use = None
if file_format in [".mp4", ".avi", ".mov"]:  # Add your video formats here
```
@schewskone should we check that file_format is ".npy" otherwise and raise a NotImplementedError if it is not?
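A minimal sketch of the suggested check; the helper name and the set of video extensions are assumed for illustration, not taken from the PR:

```python
# Hypothetical helper illustrating the suggested dispatch: fall through to
# the numpy path only for .npy files and fail loudly on anything unknown.
VIDEO_FORMATS = {".mp4", ".avi", ".mov"}

def classify_screen_format(file_format: str) -> str:
    if file_format in VIDEO_FORMATS:
        return "video"
    if file_format == ".npy":
        return "numpy"
    raise NotImplementedError(f"Unsupported screen data format: {file_format}")
```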
```python
if data_file_name not in shared_decoders:
    # Initialize the decoder only once per unique file
    # Assuming ScreenTrial.create or a helper can return just a decoder
    if self.device == "cuda":
```
What about device "cuda:0"? I would suggest checking whether the device string contains "cuda".
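A small sketch of that check (the helper name is hypothetical); comparing the device type rather than the full string also matches "cuda:0", "cuda:1", etc.:

```python
def is_cuda_device(device) -> bool:
    # "cuda", "cuda:0", "cuda:1" all have device type "cuda";
    # a plain equality check against "cuda" misses the indexed forms.
    return str(device).split(":")[0] == "cuda"
```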
```python
def _initialize_decoder(self, data_file_name):
    decoder = VideoDecoder(
        str(data_file_name),
```
@schewskone why do you need to cast it to str? Could it be something else?
Suggested change:

```diff
-def _initialize_decoder(self, data_file_name):
-    decoder = VideoDecoder(
-        str(data_file_name),
+def _initialize_decoder(self, data_file_name: str):
+    decoder = VideoDecoder(
+        data_file_name,
```
```python
)

class EncodedImageTrial(ScreenTrial):
```
@schewskone where is it actually used? I cannot find it being called explicitly, but I might be missing something (the only place I see it could be called now is lines 782-783).
```python
cls = globals()[class_name]

# Pass shared_decoder only for EncodedVideoTrials
if cls is EncodedVideoTrial:
```
@schewskone could we please add a small comment here explaining why EncodedImageTrial doesn't need a shared decoder?
```python
def get_data_(self) -> np.array:
    """Override base implementation to load compressed images"""
    img = cv2.imread(str(self.data_file_name))  # returns BGR
```
I guess here even if the original .jpg is a 1-channel (grayscale) image, cv2.imread will load it as 3 duplicated channels. Do we want to catch that somehow, @schewskone (a warning? or a flag to cast to 1 channel if the user knows their .jpg files are grayscale)?
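One hedged option (the helper name and behavior are hypothetical, not part of the PR): detect the duplicated channels that cv2.imread produces for grayscale JPEGs and collapse them back to one channel:

```python
import numpy as np

def maybe_collapse_grayscale(img: np.ndarray) -> np.ndarray:
    """If all three channels are identical (as cv2.imread yields for a
    grayscale JPEG), return a single-channel array instead."""
    if (
        img.ndim == 3
        and img.shape[2] == 3
        and np.array_equal(img[..., 0], img[..., 1])
        and np.array_equal(img[..., 1], img[..., 2])
    ):
        return img[..., :1]
    return img
```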
```python
if img is None:
    raise ValueError(f"Could not read image file: {self.data_file_name}")
# Convert BGR to RGB
img = img[:, :, [2, 1, 0]]
```
@schewskone why not cv2.cvtColor(img, cv2.COLOR_BGR2RGB)? I'm not 100% sure, but quick googling says cv2 should be faster (both create a contiguous copy of the image, though).
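For reference, the fancy-indexed version and a reversed slice give the same result; cv2.cvtColor would be a third equivalent option and is typically the fastest since it runs in optimized C. This demo sticks to NumPy to stay dependency-free:

```python
import numpy as np

bgr = np.arange(24, dtype=np.uint8).reshape(2, 4, 3)  # fake 2x4 BGR image
rgb_fancy = bgr[:, :, [2, 1, 0]]                   # fancy indexing: returns a copy
rgb_slice = np.ascontiguousarray(bgr[:, :, ::-1])  # reversed slice, made contiguous
assert np.array_equal(rgb_fancy, rgb_slice)
```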
```python
def get_data_(self, frame_indices=None) -> np.array:
    """Override base implementation to load compressed videos"""
    # frame_indices is only ever None for caching purposes; we then decode the entire video and cache it
```
Should we maybe add an assert then, that if self._cached_data is None then frame_indices should not be None?
```python
frames = frames[:, [2, 1, 0], ...]
# Reorder dimensions
frames = frames.permute(0, 2, 3, 1).contiguous()  # T,H,W,C
```
Why not do both reorderings together? It looks like some dimensions end up being moved twice.
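To illustrate the point with NumPy (the PR code uses torch tensors, but the transpose/permute semantics match): reversing the channels first and permuting after gives the same result as permuting once and reversing the now-last axis:

```python
import numpy as np

frames = np.arange(2 * 3 * 4 * 5, dtype=np.uint8).reshape(2, 3, 4, 5)  # T,C,H,W
two_step = np.transpose(frames[:, [2, 1, 0], ...], (0, 2, 3, 1))  # as in the PR
one_step = np.transpose(frames, (0, 2, 3, 1))[..., [2, 1, 0]]     # fused variant
assert np.array_equal(two_step, one_step)
assert one_step.shape == (2, 4, 5, 3)  # T,H,W,C
```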
experanto/interpolators.py
Outdated
```python
    "EncodedVideoTrial requires a shared_decoder to be provided."
)

def get_data(self, frame_indices) -> np.array:
```
Suggested change:

```diff
-def get_data(self, frame_indices) -> np.array:
+def get_data(self, frame_indices) -> np.ndarray:
```
```python
out = out.transpose(
    0, 3, 1, 2
)  # transform into (T, C, H, W) after finishing with Cv2 operation
```
@schewskone should this reordering really be here? It has not been here before. Is it backward compatible with the numpy files?
pollytur
left a comment
Please look at the comments - at the very least the tests need the valid checks plugged back in.
requirements.txt
Outdated
```
# Core ML CPU only version
torch==2.9.0
torchvision==0.24.0
torchaudio==2.9.0
```
@schewskone do we really need torchaudio for this PR? I don't think it's used anywhere, is it?
requirements.txt
Outdated
```
torch==2.9.0
torchvision==0.24.0
```
Do we really need to upgrade torch from 2.7 and torchvision from 0.22?
requirements.txt
Outdated
```diff
 hydra-core==1.3.2
-numpy==2.2.5
+numpy==2.2.6
```
Also, does numpy 2.2.6 support Python 3.9 or do we need to upgrade to 3.10?
torch 2.9 and torchcodec 0.9 require Python 3.10 and above: https://github.com/meta-pytorch/torchcodec?tab=readme-ov-file#installing-torchcodec
pollytur
left a comment
@schewskone I merged main into it again and made sure that the CI/CD passes (not sure why it's not displayed in the PR, but it's here).
Please address the rest of the comments.
Implemented support for compressed screen formats

- … `get_data_()` methods that support encoded formats.

Right now the VideoDecoder decodes the entire video and returns it to the interpolation function. This could be optimized with sliced decoding, which would require changes to the interpolation method. @pollytur and I decided we should discuss this first and implement it later on if deemed necessary.
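As a rough sketch of the sliced-decoding direction mentioned above (the class and the `decode_frame` callback are hypothetical stand-ins, not the PR's API): decode only the requested frames and memoize them per index instead of decoding the whole video up front:

```python
class SlicedFrameCache:
    """Decode frames lazily, one index at a time, caching each result."""

    def __init__(self, decode_frame):
        self._decode_frame = decode_frame  # e.g. wraps a real video decoder call
        self._cache = {}

    def get_frames(self, frame_indices):
        out = []
        for i in frame_indices:
            if i not in self._cache:
                self._cache[i] = self._decode_frame(i)  # decode only on first request
            out.append(self._cache[i])
        return out
```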