Conversation

chrishalcrow
Member

Adds the ability to export a SortingAnalyzer or Sorting to a TsGroup object from pynapple (https://pynapple.org/user_guide/01_introduction_to_pynapple.html#nap-tsgroup-group-of-timestamps). Made with advice from @gviejo at NeuroDataReHack 2025!

Tests and docs included.

Some comments:

  • The user can specify metadata, but we decided to include some basic metadata if it is available (metrics and unit locations).
  • Pynapple always uses seconds as the time unit. I've implemented this using the return_times argument in get_unit_spike_train. This can be a bit confusing for the user - what happens with multi-segments etc? We really need to finish our time doc!!
  • We put the burden of installing pynapple on the user's side.
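
To make the time-unit point concrete, here is a minimal sketch of converting spike sample indices to seconds. It assumes times come only from a sampling frequency and a segment start time; the helper name `frames_to_seconds` and the `t_start` parameter are hypothetical. The PR itself relies on `return_times` in `get_unit_spike_train`, which may use additional timing information from the recording.

```python
def frames_to_seconds(spike_frames, sampling_frequency, t_start=0.0):
    """Convert spike sample indices to times in seconds (hypothetical helper)."""
    return [t_start + frame / sampling_frequency for frame in spike_frames]


# Spikes at samples 0, 15000 and 30000 of a 30 kHz recording.
spike_times = frames_to_seconds([0, 15000, 30000], sampling_frequency=30000.0)
print(spike_times)  # [0.0, 0.5, 1.0]
```

With several segments, each segment would need its own start time, which is part of why the multi-segment case is confusing.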

Member

@zm711 zm711 left a comment

Two initial comments and then let's see about the tests passing :)

    elif isinstance(sorting_analyzer_or_sorting, BaseSorting):
        sorting = sorting_analyzer_or_sorting
    else:
        raise TypeError("The function `to_pynapple_tsgroup` only accepts a SortingAnalyzer or Sorting object.")
Member

Could we return the type of the object that was accidentally given? I sometimes confuse the order of rec, sorting in generate_ground_truth_recording, so it would be nice to know that I put in a recording instead of a sorting/analyzer.

Member Author

I had a go, what do you think of the new message?

Member

Sorry, I never responded. This is great, thanks!
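
For readers following this thread, a minimal sketch of an error message that reports the offending type. The classes below are stand-ins so the example runs on its own, the helper name `extract_sorting` is hypothetical, and the exact wording of the message in the merged PR is not shown in this conversation, so the text here is an assumption.

```python
class BaseSorting:
    """Stand-in for spikeinterface's BaseSorting (illustration only)."""


class BaseRecording:
    """Stand-in for spikeinterface's BaseRecording (illustration only)."""


class SortingAnalyzer:
    """Stand-in for spikeinterface's SortingAnalyzer (illustration only)."""

    def __init__(self, sorting):
        self.sorting = sorting


def extract_sorting(sorting_analyzer_or_sorting):
    """Return the sorting, or raise a TypeError naming the type that was passed."""
    if isinstance(sorting_analyzer_or_sorting, SortingAnalyzer):
        return sorting_analyzer_or_sorting.sorting
    elif isinstance(sorting_analyzer_or_sorting, BaseSorting):
        return sorting_analyzer_or_sorting
    else:
        given_type = type(sorting_analyzer_or_sorting).__name__
        raise TypeError(
            "The function `to_pynapple_tsgroup` only accepts a SortingAnalyzer "
            f"or Sorting object, but an object of type {given_type} was given."
        )


try:
    extract_sorting(BaseRecording())  # a recording passed by mistake
except TypeError as err:
    print(err)
```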

Member

@zm711 zm711 left a comment

Forgot to do the docs really quick :)


def to_pynapple_tsgroup(
    sorting_analyzer_or_sorting: SortingAnalyzer | BaseSorting,
    metadata: pd.DataFrame | dict | None = None,
Member

Also, what if we let the user choose a list of metadata and then we organize it instead? Each unit could have location + qm + tm all in one big dataframe for Pynapple, right?

Member Author

I don't understand. Like, they specify "unit_locations", "quality_metrics", etc?

Member

@zm711 zm711 Jul 18, 2025

I missed what you were doing. I see it now. Maybe that is a little too complicated. If they wanted to build the dataframe on their own, could they add metadata to the pynapple structure later? Basically, if they want to use the export correctly they have to put a None in for this argument. So passing a dict or dataframe is just hijacking it.

Member Author

Yeah, there is a set_info method for the TsGroup. So we could ignore the metadata and get the user to attach it after creation. Maybe we can discuss at the next meeting.

Member

That sounds good. I don't think this comment style is letting my point come across clearly.

I like the idea of exporting metadata, but I don't like the idea of letting the user do a million different styles of metadata. I would prefer them to give a list of spikeinterface metadata they want that we could export for them. If they just want to set their own metadata then I don't think this function is for that, so I think the finer point there is better to talk about than write about.

@zm711 zm711 added the exporters Related to exporters module label Jul 18, 2025
@zm711
Member

zm711 commented Jul 18, 2025

We put the burden of installing pynapple on the users side.

Although fair, this complicates the definition of pip install spikeinterface[full] even more for my rewrite... Maybe we should really think about what this means.
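
One common way to keep a dependency optional is to import it lazily and fail with an actionable message. The sketch below shows that pattern in general; the helper name `import_optional` and the hint text are hypothetical and are not taken from the PR.

```python
import importlib


def import_optional(module_name, install_hint):
    """Import a module on demand, raising a helpful error if it is missing."""
    try:
        return importlib.import_module(module_name)
    except ImportError as err:
        raise ImportError(
            f"'{module_name}' is required for this export but is not installed. "
            f"{install_hint}"
        ) from err


# Works for any installed module; shown here with a standard-library one.
json_module = import_optional("json", "It ships with Python.")

# For the exporter, the call would look like this (raises if pynapple is absent):
# nap = import_optional("pynapple", "Install it with `pip install pynapple`.")
```

The import only happens when the export function is called, so users who never export to pynapple are unaffected.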

Member

@zm711 zm711 left a comment

Few more typos found. I still wonder about the metadata bit. But definitely better to just discuss in person.

@chrishalcrow
Member Author

Hello, I met with the Pynapple team (@gviejo, @sjvenditto, @wulfdewolf) at FlatIron. We loosely talked about the PR, with two main discussion points:

  1. What to do with multi-segment sortings. We decided that each segment should output one TsGroup. We had agreed that a multi-segment sorting should then output a list or tuple of TsGroups. On implementation, I think it works better if the user has to input the segment_index, e.g. my_tsgroup = si.to_pynapple_tsgroup(analyzer, segment_index=2). This matches some other functions in spikeinterface (e.g. get_unit_spike_train), forces multi-segment users to realise there is one TsGroup per segment, keeps the return value of to_pynapple_tsgroup to a single type, and keeps the most common use case (one segment) simple.
  2. What metadata to pass to the TsGroup by default. The Pynapple team were in favour of “if it’s there, pass it on”. So that’s one vote for attaching any computed extension info that contains unit information (metrics + locations).
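
The segment_index behaviour in point 1 can be sketched as a small validation step. This is an illustration under assumptions: the helper name `resolve_segment_index` is hypothetical, and whether the merged function defaults to `None` for single-segment sortings is not stated in this conversation.

```python
def resolve_segment_index(num_segments, segment_index=None):
    """Pick the segment to export (hypothetical helper).

    Single-segment sortings need no argument; multi-segment sortings
    must say which segment they want, so the return type stays one TsGroup.
    """
    if segment_index is None:
        if num_segments != 1:
            raise ValueError(
                f"The sorting has {num_segments} segments; "
                "please specify `segment_index` to choose one."
            )
        return 0
    if not 0 <= segment_index < num_segments:
        raise ValueError(
            f"`segment_index` must be between 0 and {num_segments - 1}, "
            f"got {segment_index}."
        )
    return segment_index


print(resolve_segment_index(num_segments=1))                   # 0
print(resolve_segment_index(num_segments=3, segment_index=2))  # 2
```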

@zm711
Member

zm711 commented Jul 24, 2025

What to do with multi-segment sortings. We decided that each segment should output one TsGroup. We had agreed that a multi-segment sorting should then output a list or tuple of TsGroups. On implementation, I think it works better if the user has to input the segment_index e.g. my_tsgroup = si.to_pynapple_tsgroup(analyzer, segment_index=2). This matches some other functions in spikeinterface (e.g. get_unit_spike_train), forces multi-segment users to realise there is one TsGroup per segment and keeps the return value of the to_pynapple_tsgroup to a single type. And keeps the most-common use-case (one segment) simple.

I support selecting the segment_index since it fits with the rest of the codebase.

What metadata to pass to the TsGroup by default. The Pynapple team were in favour of “if it’s there, pass it on”. So that’s one vote for attaching any computed extension info that contain unit information (metrics + locations).

I support this. My point here is I think it should either be a bool (export all possible spikeinterface stuff or export nothing) or a list of things you want to export (quality metrics, template metrics, etc). I just don't think we should allow the user to feed in an arbitrary dataframe that we will then export for them. So based on what the pynapple team is saying (hi to all of you!) I would make it an export bool.
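
A rough sketch of the "export bool" idea, using plain dictionaries in place of the real extension data. The function name `build_unit_metadata`, the parameter name `attach_unit_metadata` and the table layout are all hypothetical; the merged function's signature is not shown in this conversation.

```python
def build_unit_metadata(unit_ids, extension_tables, attach_unit_metadata=True):
    """Merge all available per-unit tables, or return None if disabled.

    `extension_tables` maps a table name to {unit_id: {column: value}}.
    """
    if not attach_unit_metadata:
        return None
    metadata = {unit_id: {} for unit_id in unit_ids}
    for table in extension_tables.values():
        for unit_id in unit_ids:
            metadata[unit_id].update(table.get(unit_id, {}))
    return metadata


tables = {
    "quality_metrics": {0: {"snr": 5.2}, 1: {"snr": 3.1}},
    "unit_locations": {0: {"x": 10.0, "y": 40.0}, 1: {"x": 12.0, "y": 80.0}},
}
print(build_unit_metadata([0, 1], tables))
print(build_unit_metadata([0, 1], tables, attach_unit_metadata=False))  # None
```

With the flag off, the user gets a bare TsGroup and can attach their own metadata afterwards with pynapple's own tools.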

@samuelgarcia samuelgarcia added this to the 0.103.0 milestone Jul 25, 2025
@samuelgarcia
Member

It would be cool to have this for the release, no? And we can improve it a bit later.

@chrishalcrow
Member Author

I support this. My point here is I think it should either be a bool (export all possible spikeinterface stuff or export nothing) or a list of things you want to export (quality metrics, template metrics, etc). I just don't think we should allow the user to feed in an arbitrary dataframe that we will then export for them. So based on what the pynapple team is saying (hi to all of you!) I would make it an export bool.

Ahhh, ok, got you! So if bool=False, the user should make the TsGroup, then add their own metadata to that using pynapple functionality? I like that!

@zm711
Member

zm711 commented Jul 25, 2025

Not for this PR, but if we are pursuing this strategy in general we should add a note in the documentation that we would support export layers for projects that want to create their own export-specific code. It might be worth discussing at the maintenance meeting, just like we have a section in the sorter section explaining how to get your sorter added to the code base.

@zm711
Member

zm711 commented Jul 25, 2025

Done dirty by hugging face :P

pyproject.toml Outdated
@@ -141,6 +141,7 @@ test_extractors = [
# Commenting out for release
"probeinterface @ git+https://github.com/SpikeInterface/probeinterface.git",
"neo @ git+https://github.com/NeuralEnsemble/python-neo.git",
"pynapple",
Member Author

REMOVE!!!

Member

@zm711 zm711 left a comment

1 typo
1 ? style

2 questions regarding the int try-except


unit_ids_castable = True
try:
    unit_ids_ints = [int(unit_id) for unit_id in unit_ids]
Member

Another strategy that wouldn't use the try-except would be:

unit_ids_castable = all([unit_id.isdigit() for unit_id in unit_ids])
if unit_ids_castable:
    unit_ids_ints = [int(unit_id) for unit_id in unit_ids]
else:
    xx

Member

@zm711 zm711 Jul 28, 2025

Never mind. This only works if the ids are strings, which maybe in the future they will be ;P
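
A short demonstration of that limitation: `str.isdigit` exists only on strings, so the check fails as soon as an id is an integer. The function name `all_digit_strings` is hypothetical.

```python
def all_digit_strings(unit_ids):
    """The isdigit-based check from the suggestion above."""
    return all(unit_id.isdigit() for unit_id in unit_ids)


print(all_digit_strings(["0", "1", "2"]))  # True
print(all_digit_strings(["0", "unit_a"]))  # False

try:
    all_digit_strings([0, 1, 2])  # integer ids have no isdigit method
except AttributeError as err:
    print(f"Check failed for integer ids: {err}")
```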

Member Author

@chrishalcrow chrishalcrow Jul 29, 2025

Yeah, I thought about this too and ended up try/excepting.
all([isinstance(unit_id, int) or unit_id.isdigit() for unit_id in unit_ids])
works but is a bit gross.

I'd vote to keep the try/except for now

Member

The costs are minimal, but based on my reading a try-except is slightly faster than an if-else if you succeed most of the time, and quite a bit slower if you hit the except often. That being said, even a 10x slowdown of one step isn't really that meaningful, so now that you added in the specific except I'm okay with this. Thanks for humoring me :)
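
A minimal sketch of the try/except approach with a specific except clause. The helper name `cast_unit_ids_to_int` is hypothetical, and which exception types the PR actually catches is not shown here; `ValueError` and `TypeError` are an assumption covering non-numeric strings and non-castable objects such as `None`.

```python
def cast_unit_ids_to_int(unit_ids):
    """Return the unit ids as ints, or None if any id cannot be cast."""
    try:
        return [int(unit_id) for unit_id in unit_ids]
    except (ValueError, TypeError):
        # ValueError: e.g. int("unit_a"); TypeError: e.g. int(None)
        return None


print(cast_unit_ids_to_int(["0", "1", "2"]))  # [0, 1, 2]
print(cast_unit_ids_to_int([0, 1, 2]))        # [0, 1, 2]
print(cast_unit_ids_to_int(["0", "unit_a"]))  # None
```

One caveat of casting with `int`: a float id such as 1.7 would be truncated to 1 without raising.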

Member

@zm711 zm711 left a comment

After putting you through the wringer, this is good by me. :)

@samuelgarcia samuelgarcia merged commit f258428 into SpikeInterface:main Jul 29, 2025
15 checks passed
@gviejo

gviejo commented Jul 29, 2025

🙌 🎉

@chrishalcrow chrishalcrow deleted the export-to-pynapple branch July 29, 2025 15:39