Skip to content

Conversation

@rcannood
Copy link
Member

@rcannood rcannood commented Sep 9, 2025

Related to:

Description

Checklist

Before review

  • Update and regenerate man pages
  • Add/update tests
  • Add/update examples in vignettes
  • Pass CI checks

Before merge

  • Update NEWS
  • Bump devel version

@rcannood rcannood changed the title Fix as_SCE Fix mapping of embeddings during as_SingleCellExperiment() conversion Sep 9, 2025
@rcannood rcannood changed the title Fix mapping of embeddings during as_SingleCellExperiment() conversion [WIP] Fix mapping of embeddings during as_SingleCellExperiment() conversion Sep 9, 2025
@rcannood rcannood changed the title [WIP] Fix mapping of embeddings during as_SingleCellExperiment() conversion Improve mapping of dimred slots Sep 15, 2025
@rcannood rcannood changed the title Improve mapping of dimred slots Improve guessing the mapping of dimred slots Sep 15, 2025
@rcannood rcannood requested review from Copilot and lazappi September 15, 2025 20:20
Copy link

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull Request Overview

This PR improves the automatic mapping of dimensionality reduction slots when converting between AnnData, SingleCellExperiment, and Seurat objects. It introduces a centralized mapping system that standardizes naming conventions across the three frameworks.

  • Adds a centralized dimensionality reduction mapping system via common_dimred_mappings.R
  • Updates conversion functions to use standard naming conventions (e.g., "PCA"/"UMAP" for SCE, "pca"/"umap" for Seurat, "X_pca"/"X_umap" for AnnData)
  • Fixes test cases and documentation to reflect the new naming conventions

Reviewed Changes

Copilot reviewed 14 out of 14 changed files in this pull request and generated no comments.

Show a summary per file
File Description
R/common_dimred_mappings.R New centralized mapping system for dimensionality reduction naming conventions
R/as_SingleCellExperiment.R Updated to use centralized mappings and proper SCE naming conventions
R/as_Seurat.R Updated to use centralized mappings and proper Seurat naming conventions
R/from_SingleCellExperiment.R Updated conversion logic to use centralized reverse mappings
R/from_Seurat.R Updated conversion logic to use centralized reverse mappings
R/known_issues.R Added guard for empty known_issues data frame
inst/known_issues.yaml Removed resolved PCA conversion issue
tests/testthat/test-*.R Updated test expectations to match new naming conventions
vignettes/usage_singlecellexperiment.Rmd Updated examples to use proper naming conventions
NEWS.md Added entry documenting the improvement

Tip: Customize your code reviews with copilot-instructions.md. Create the file or learn how to get started.

@lazappi
Copy link
Collaborator

lazappi commented Sep 16, 2025

I'm not sure how I feel about this. I can see that it's nice to do this automatically but I don't like the idea of making up rules for specific cases. We already have a mechanism to let the user set how things are names so do we need this as well (also, I should check how those two interact)?

@LouiseDck What do you think?

@rcannood
Copy link
Member Author

Indeed -- this code only applies to when we are guessing the mapping (i.e. when the user didn't provide a manual mapping).

There was already some hardcoded code for PCA to make sure the right fields end up in Seurat in the right place; this PR just abstracts it out a little for the most common DRs.

@rcannood rcannood changed the title Improve guessing the mapping of dimred slots Improve guessing of the mapping of dimred slots Sep 16, 2025
@LouiseDck
Copy link
Collaborator

I dislike that this is a special case where we guess the mapping, we do not do this for other cases (such as nearest neighbors graphs)?
However, I do think it might really make users life easier as for example, it is still not easy to just plot any dimred using scanpy if it does not have the "X_" prepended.

So I'm also a bit conflicted.

Was there a specific reason why you initiated this PR @rcannood?

@rcannood
Copy link
Member Author

rcannood commented Sep 16, 2025

Was there a specific reason why you initiated this PR @rcannood?

Good question!

Yes, because I was trying to fix a known issue related to SCE conversion ^^

https://github.com/scverse/anndataR/pull/333/files#diff-1589871b950e902a7f2ead4bd87e8c6f2c6979f50fd334851f521b1d180c6ca5

@lazappi
Copy link
Collaborator

lazappi commented Sep 17, 2025

I dislike that this is a special case where we guess the mapping, we do not do this for other cases (such as nearest neighbors graphs)? However, I do think it might really make users life easier as for example, it is still not easy to just plot any dimred using scanpy if it does not have the "X_" prepended.

Can't you just do scanpy.pl.embedding(adata, basis="whatever"). I don't think it checks for "X_*".

Yes, because I was trying to fix a known issue related to SCE conversion

Isn't this related to the dimnames though, not the key?

@LouiseDck
Copy link
Collaborator

I dislike that this is a special case where we guess the mapping, we do not do this for other cases (such as nearest neighbors graphs)? However, I do think it might really make users life easier as for example, it is still not easy to just plot any dimred using scanpy if it does not have the "X_" prepended.

Can't you just do scanpy.pl.embedding(adata, basis="whatever"). I don't think it checks for "X_*".

Huh indeed, thanks! It's the basis argument in scanpy.pl.scatter that prepends the X_.

@lazappi
Copy link
Collaborator

lazappi commented Sep 18, 2025

Huh indeed, thanks! It's the basis argument in scanpy.pl.scatter that prepends the X_.

You're right. That doesn't seem ideal so I've opened an issue scverse/scanpy#3803. I think that's more a scanpy problem though so I don't think it should affect how we name things.

@LouiseDck
Copy link
Collaborator

Huh indeed, thanks! It's the basis argument in scanpy.pl.scatter that prepends the X_.

You're right. That doesn't seem ideal so I've opened an issue scverse/scanpy#3803. I think that's more a scanpy problem though so I don't think it should affect how we name things.

Thanks, I probably should've done this a while ago 😅

This discussion hinges on how we want to approach conversion, and how much semantics we want to guess, right?

One the one hand I do think it makes sense: reducing the barriers for people to work with different data structures is in general a good idea, and this allows people to be less familiar with certain intricacies and conventions of different data structures. But, people might be surprised to see that their dimreds changed name?
On the other hand, is it desirable that people are unaware of how their data is stored? Also, with more of this "magic" conversion, it gets increasingly difficult for the user to keep a mental model of what happens during the conversion?

I think I might be slightly in favor of doing the automatic conversion? But I can see upsides and downsides for sure.

@lazappi
Copy link
Collaborator

lazappi commented Sep 19, 2025

I can also see both sides but I guess I lean the other way. My feeling is that once you start doing this you have to make a lot of decisions about how things get mapped which I would prefer to avoid. You also have to make sure that everything is obvious to the user, it can be overridden and it works in both directions.

Probably the only thing that would convince me is if a core package required specific names so that users always have to set the mapping.

@rcannood
Copy link
Member Author

To be clear, this PR only affects the code that tries to "guess" how AnnData's should be mapped to SCE/Seurat and back. If the user manually chooses how DRs should be mapped to obsm, the code in this PR doesn't trigger.

Maybe it's somewhat related to the feedback we got related to the Bioconductor submission (#342) -- the most important part is probably that the user knows how certain things are mapped to each other and how to override it

@lazappi lazappi added design Discussion about how things should be designed question Further information is requested labels Nov 20, 2025
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

design Discussion about how things should be designed question Further information is requested

Projects

None yet

Development

Successfully merging this pull request may close these issues.

4 participants