Skip to content

Conversation

@ntjohnson1
Copy link
Member

I'll link a PR from the main rerun repo. Basically when loading a lerobot dataset with a single frame for a video I got an error about the wrong number of rows for a chunk.

The STSZ format uses sample_sizes when the are not uniform but sample_size when they are. We weren't accounting for the uniform case and just skipped the samples.

Comment on lines 168 to 171
// Could probably just always use sample count
while (sample_n < stsz.sample_sizes.len() && sample_n == 0)
|| sample_n < stsz.sample_count as usize
{
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This looks a bit weird 👀

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It should be && stsz. sample_size == 0 oops

Either sample_sizes should exist as a non-empty vector and sample_size is 0 (showing unused) or sample_size is a constant across all samples. I think we can just always use sample_count but I couldn't find a sufficiently convincing spec to reference (didn't look that hard)

Copy link
Member

@Wumpf Wumpf left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

ongoing internal discussion - Rerun is (supposed to) handle this internally today, see e.g. this discussion https://github.com/rerun-io/rerun/blob/4546c13cf1c152981b1d5636fc29a009f8529fb3/crates/utils/re_video/src/demux/mod.rs#L898-L899

putting 'request for changes' until we're sure this is the right fix :)

@Wumpf Wumpf self-requested a review September 16, 2025 11:43
Copy link
Member

@Wumpf Wumpf left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I got this totally wrong 😳 . After talking to Nick I agree that this is the right fix
Let's add a changelog entry and publish a new re_mp4 - suggesting to not make it a patch release for the sole reason of explicitely opting in to the new version (if we make it a patch, cargo installs of old Rerun versions will pick it up, let's not risk that :))

@ntjohnson1 ntjohnson1 mentioned this pull request Sep 16, 2025
@ntjohnson1 ntjohnson1 merged commit de3346f into main Sep 16, 2025
7 checks passed
ntjohnson1 added a commit to rerun-io/rerun that referenced this pull request Sep 16, 2025
### Related

rerun-io/re_mp4#20

### What
For mp4 videos of 1 frame rerun fails to load them at all with a weird
error about indexes needing to be the same length
* I fixed something that caused us to skip frames when they are all the
same size (not sure how common that is outside of single frame samples).
Right now this points to that branch to demo nothing breaks. Will need
to update to latest version once we update the crate.

Pointing at the same directory containing episode 1 and episode 58
Before (Where we both couldn't load episode 58 ever but also couldn't
load it because our vector was of length 2):
<img width="2032" height="1167" alt="Screenshot 2025-09-15 at 5 47
46 PM"
src="https://github.com/user-attachments/assets/db33ec58-7432-4069-bb5b-a5f02f4ba356"
/>

After (where things appear to work):
<img width="2032" height="1167" alt="Screenshot 2025-09-15 at 5 52
05 PM"
src="https://github.com/user-attachments/assets/b24c823c-4051-4d5f-a1bc-8a5cc39f9a67"
/>

NOTE: I put a link to the re_mp4 branch for validation but I assume we
will land that then update the version here.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

bug Something isn't working include in changelog Will appear in CHANGELOG.md

Projects

None yet

Development

Successfully merging this pull request may close these issues.

4 participants