Add intan reader for concatenated files #4070

h-mayorquin · 2025-07-17T06:05:17Z

This is yet another modality of intan where files are partitioned by the acquisition system. This reader is a convenience for the users so they can readily load their data as it is.

alejoe91 · 2025-07-17T06:45:22Z

src/spikeinterface/extractors/extractor_classes.py

@@ -200,5 +201,6 @@
        "read_binary",  # convenience function for binary formats
        "read_zarr",
        "read_neuroscope",  # convenience function for neuroscope
+        "read_intan_segmented",  # convenience function for segmented intan files


Awesome! Can you add this to the extractors docs and the API as well?

Should be done.

zm711 · 2025-07-17T12:24:06Z

I think the PR is in good shape but I have a few motivation questions.

1. Why a separate extractor? Could we add a mode to the normal Intan extractor?
2. "Segment" is a loaded term in spikeinterface. What this actually produces is a concatenated version, so maybe we should think about the naming.
3. Why not give the user the option to append or concatenate? Maybe they did the splits on purpose (i.e., they recorded baseline, experiment, and post time periods), so they would prefer to append instead. Why should we create a convenience function for only one of the two options and then tell users who want the other one to do it themselves?
4. I could imagine other readers also have these format styles (though I don't have an example off the top of my head), so why give preference to Intan (other than that you and I work on it a lot)?
5. Why not do a dir mode at the Neo level where you make each file a Neo segment instead? Why here?

h-mayorquin · 2025-07-17T14:43:11Z

> I think the PR is in good shape but I have a few motivation questions.

> Why a separate extractor? Could we add a mode into the normal Intan?

My preference is to keep the extractor simple. Otherwise, I would need to add logic for which file to select, more arguments, a more complicated docstring, etc. This way I highlight that this is a special case by construction, and I think it makes the implementation simpler.

> Segment is a loaded term in spikeinterface. What this actually is is a concatenated version, so maybe we should think about the naming.

Yes, open to other naming suggestions.

> Why not give the user the option to append or concatenate? Maybe they did the splits on purpose (i.e., they did baseline, experiment, post time period), so they would prefer to append instead. Why should we create a convenience function for only one of the two options and then tell the user that if they want the other they should do it themselves?

I think that the splits are always continuous on Intan, aren't they? The idea is to load this data as the acquisition system produced it. If there is an option that indeed produces files with gaps in between, then I think we should have an option as you suggest.

> I could imagine other readers also have these format styles (though I don't have an example off the top of my head), so why give preference to Intan (other than that you and I work on it a lot)?

Yes, if other readers also produce segmented files with equivalent logic, there is a clear path for appending (like here, by filename), and we have testing data, I think we should offer something similar. There is nothing special about Intan other than it being on my to-do list, as I know the format and I have data. (I don't work with Intan that much, by the way. I used it in a project at the start of the year and that's it, but I would say I know the format well now.)

> Why not do a dir mode at the Neo level where you make each file a Neo segment instead? Why here?

This is the quickest way to provide the feature to users. I am not sure how to do it in neo and, now that I think about it, the implementation in neo is already too complicated for my liking. This is how I feel.

zm711 · 2025-07-17T14:45:48Z

> I think that the splits are always continuous on Intan, aren't they? The idea is to load this data as the acquisition system produced it. If you have an option that indeed produces the files with gaps in between then I think we should have an option.

True, they are continuous in time. But the experimenter could want them in the same segment or in separate segments for their own provenance. Here you are forcing them to keep everything in one segment. Why? Concatenate vs. append is not based on time; it is based on whether you want the files in separate segments, for whatever reason.

h-mayorquin · 2025-07-17T14:50:23Z

> True, they are continuous in time. But the experimenter could want them in the same segment or in separate segments for their own provenance. Here you are forcing them to keep everything in one segment. Why? Concatenate vs. append is not based on time; it is based on whether you want the files in separate segments, for whatever reason.

I am not forcing them; they can still load the individual files with the Intan reader and use append themselves.

I am not providing a utility for them to do this easily because I come from a different place: The idea for me is that we should have recordings that extract the data in the way that the acquisition system "intended" it and tools for allowing the users to modify that on top of it to whatever representation they like. Loading data as a continuous segment is the former, loading data as different segments is the latter.
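
For readers following the append-vs-concatenate point, here is a minimal sketch of the manual route mentioned above: load each split file on its own, then combine the results either way. It is not the code in this PR. The helper names `order_split_files` and `load_split_intan` are hypothetical, the sketch assumes the split files carry a sortable date/time stamp in their names so that filename order matches acquisition order, and it assumes `stream_id="0"` selects the amplifier stream. `read_intan`, `concatenate_recordings`, and `append_recordings` are existing spikeinterface functions.

```python
from pathlib import Path


def order_split_files(paths, suffix=".rhd"):
    """Keep only files with `suffix` and order them by filename.

    Assumption: filenames end in a sortable date/time stamp, so
    lexicographic order matches acquisition order.
    """
    files = [Path(p) for p in paths if Path(p).suffix == suffix]
    return sorted(files, key=lambda p: p.name)


def load_split_intan(folder, mode="concatenate", stream_id="0"):
    """Load every split file in `folder` and combine them.

    mode="concatenate" -> a single segment with all files back to back
    mode="append"      -> one segment per file
    """
    if mode not in ("concatenate", "append"):
        raise ValueError(f"Unknown mode: {mode!r}")

    # Imported lazily so the ordering helper works without spikeinterface.
    from spikeinterface.core import append_recordings, concatenate_recordings
    from spikeinterface.extractors import read_intan

    files = order_split_files(Path(folder).iterdir())
    # stream_id="0" is assumed to be the amplifier stream.
    recordings = [read_intan(path, stream_id=stream_id) for path in files]

    if mode == "concatenate":
        return concatenate_recordings(recordings)
    return append_recordings(recordings)
```

With this shape, the choice between one segment and many is a single argument, which is roughly what question (3) asks for.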

zm711 · 2025-07-17T14:55:57Z

> I am not providing a utility for them to do this easily because I come from a different place: The idea for me is that we should have recordings that extract the data in the way that the acquisition system "intended" it

I think we don't know what the acquisition system "intended" in this case. The software lets the user choose the cutoff length, which means it could be allowing the user to intend these to be individual segments. I think the argument in your favor is that Intan provides post-hoc stitching software that can turn a series of continuous recordings into one giant recording (which would be read as a neo mono-segment). That is, I don't think there is an initial intention about concatenate vs. append, but there is a post-hoc intention for concatenate.

I think we addressed (1), (4), and (5). We disagree about (3), but I'm willing to be overruled. Let me think about naming for (2).

zm711 · 2025-07-17T14:58:15Z

src/spikeinterface/extractors/neoextractors/intan.py

+        stream_name=None,
+        all_annotations=False,
+        use_names_as_ids=False,
+        ignore_integrity_checks: bool = False,


If we go through with this, I think we have to remove ignore_integrity_checks. This PR would need to ensure that the recordings are truly continuous, no? So a two-step process:

1. Make sure no individual file is broken.
2. Ensure that the timestamp files are actually continuous.

Otherwise, aren't you running into your issue with the sampling rates? As it stands, this PR could be used to stitch together any arbitrary Intan files, when we should have a way to do this like Intan does it.

Why? I think it is fine if they want to load some of the files even if they have gaps within. I don't see a strong reason to limit user choice.

One corrupt file could mess up everything right? How would the user know which file was corrupt?

I guess what I was really proposing was: shouldn't we check that the files we are stitching are actually continuous? Or do you want people to be able to just pile in a bunch of random Intan files and try to stitch them together with this?
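
The continuity check proposed in this thread could look roughly like the sketch below. It is an illustration only, not code from the PR: `FileSummary` and `find_continuity_problems` are hypothetical names, timestamps are assumed to be integer sample indices, and how the per-file values are read from the Intan files is left out. Reporting the offending filename also speaks to the question of how a user would know which file is the problem.

```python
from typing import NamedTuple


class FileSummary(NamedTuple):
    """Per-file values needed for the check (hypothetical container)."""

    name: str
    sampling_rate: float
    first_timestamp: int  # sample index of the first sample in the file
    last_timestamp: int   # sample index of the last sample in the file


def find_continuity_problems(files):
    """Compare each file with its predecessor and describe any mismatch."""
    problems = []
    for previous, current in zip(files, files[1:]):
        if current.sampling_rate != previous.sampling_rate:
            problems.append(
                f"{current.name}: sampling rate differs from {previous.name}"
            )
        expected = previous.last_timestamp + 1
        if current.first_timestamp != expected:
            problems.append(
                f"{current.name}: starts at sample {current.first_timestamp}, "
                f"expected {expected}"
            )
    return problems


# Made-up values: the third file starts 500 samples late.
files = [
    FileSummary("a.rhd", 30000.0, 0, 59999),
    FileSummary("b.rhd", 30000.0, 60000, 119999),
    FileSummary("c.rhd", 30000.0, 120500, 180499),
]
problems = find_continuity_problems(files)
```

A reader could raise on a non-empty list by default and downgrade it to a warning when the user explicitly opts out, which would keep the user choice argued for above.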

h-mayorquin · 2025-07-17T15:02:51Z

I don't feel that strongly about (3); I was just explaining why I decided to do it this way. I am fine with adding that functionality (automatically creating a mega-segmented file), but not in this PR, as the implementation would become more complicated and I don't have time to do it at the moment.

h-mayorquin · 2025-07-17T15:04:01Z

As for the naming, what about read_intan_split or something like that?

zm711 · 2025-07-17T15:08:36Z

src/spikeinterface/extractors/neoextractors/intan.py

+            recording_list.append(recording)
+
+        # Initialize the parent class with the recording list
+        ConcatenateSegmentRecording.__init__(self, recording_list)


Wouldn't it just be a boolean check here with an AppendSegmentRecording?

No, it inherits from the respective class.

I missed that. Yeah, I was thinking you were just calling concatenate_recordings. For my own learning, what is the benefit of doing it this way?

We need a class for _kwargs and pickling.
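
To make the `_kwargs` point concrete, here is a toy model of the idea, not SpikeInterface's actual implementation: a class that records its own constructor arguments can be described as plain data and rebuilt later, whereas the object returned by a bare helper function has no record of the folder it came from. `ToyRecording` and its methods are hypothetical names used only for this illustration.

```python
import json


class ToyRecording:
    """Stand-in for a recording class that remembers how it was built."""

    def __init__(self, folder_path, stream_id="0"):
        self.folder_path = folder_path
        self.stream_id = stream_id
        # Everything needed to call __init__ again with the same result.
        self._kwargs = {"folder_path": folder_path, "stream_id": stream_id}

    def to_dict(self):
        """Describe the object as plain data: which class, which arguments."""
        return {"class": type(self).__name__, "kwargs": self._kwargs}

    @classmethod
    def from_dict(cls, data):
        """Rebuild an equivalent object from the stored arguments."""
        return cls(**data["kwargs"])


original = ToyRecording("/data/session1", stream_id="0")
serialized = json.dumps(original.to_dict())  # safe to store or send
rebuilt = ToyRecording.from_dict(json.loads(serialized))
```

The same record of constructor arguments is what lets such an object be saved, reloaded, or handed to worker processes without copying the underlying data.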

zm711 · 2025-07-17T15:11:53Z

> As for the naming, what about read_intan_split or something like that?

Yeah, we could riff on that, like read_split_intan_files or read_split_intan. The issue is that the GUI software isn't super clear about a naming convention; it is just a setting. So for official Intan terminology we have "Traditional" with splits?

h-mayorquin · 2025-07-17T16:40:09Z

Changed the naming.

add new read intan

72267ee

h-mayorquin requested a review from zm711 July 17, 2025 06:05

pre-commit-ci bot and others added 3 commits July 17, 2025 06:05

[pre-commit.ci] auto fixes from pre-commit.com hooks

5ce7e7b

for more information, see https://pre-commit.ci

intan

9159c05

use class approach

b36d287

h-mayorquin marked this pull request as ready for review July 17, 2025 06:37

alejoe91 reviewed Jul 17, 2025

alejoe91 added the extractors Related to extractors module label Jul 17, 2025

docs

a9f8f0c

zm711 reviewed Jul 17, 2025

naming

576953d

Merge branch 'main' into add_intan_for_concatenated_files

46f8001

zm711 mentioned this pull request Jul 18, 2025

Add append mode to IntanSplitFile building on #4070 #4075

Open

h-mayorquin closed this Jul 24, 2025

h-mayorquin deleted the add_intan_for_concatenated_files branch July 24, 2025 16:09
