-
-
Notifications
You must be signed in to change notification settings - Fork 14
Improve guessing of the mapping of dimred slots #333
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: devel
Are you sure you want to change the base?
Conversation
as_SingleCellExperiment() conversion
7fabfd0 to
8572eb9
Compare
as_SingleCellExperiment() conversionas_SingleCellExperiment() conversion
as_SingleCellExperiment() conversionThere was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Pull Request Overview
This PR improves the automatic mapping of dimensionality reduction slots when converting between AnnData, SingleCellExperiment, and Seurat objects. It introduces a centralized mapping system that standardizes naming conventions across the three frameworks.
- Adds a centralized dimensionality reduction mapping system via
common_dimred_mappings.R - Updates conversion functions to use standard naming conventions (e.g., "PCA"/"UMAP" for SCE, "pca"/"umap" for Seurat, "X_pca"/"X_umap" for AnnData)
- Fixes test cases and documentation to reflect the new naming conventions
Reviewed Changes
Copilot reviewed 14 out of 14 changed files in this pull request and generated no comments.
Show a summary per file
| File | Description |
|---|---|
| R/common_dimred_mappings.R | New centralized mapping system for dimensionality reduction naming conventions |
| R/as_SingleCellExperiment.R | Updated to use centralized mappings and proper SCE naming conventions |
| R/as_Seurat.R | Updated to use centralized mappings and proper Seurat naming conventions |
| R/from_SingleCellExperiment.R | Updated conversion logic to use centralized reverse mappings |
| R/from_Seurat.R | Updated conversion logic to use centralized reverse mappings |
| R/known_issues.R | Added guard for empty known_issues data frame |
| inst/known_issues.yaml | Removed resolved PCA conversion issue |
| tests/testthat/test-*.R | Updated test expectations to match new naming conventions |
| vignettes/usage_singlecellexperiment.Rmd | Updated examples to use proper naming conventions |
| NEWS.md | Added entry documenting the improvement |
Tip: Customize your code reviews with copilot-instructions.md. Create the file or learn how to get started.
|
I'm not sure how I feel about this. I can see that it's nice to do this automatically but I don't like the idea of making up rules for specific cases. We already have a mechanism to let the user set how things are names so do we need this as well (also, I should check how those two interact)? @LouiseDck What do you think? |
|
Indeed -- this code only applies to when we are guessing the mapping (i.e. when the user didn't provide a manual mapping). There was already some hardcoded code for PCA to make sure the right fields end up in Seurat in the right place; this PR just abstracts it out a little for the most common DRs. |
|
I dislike that this is a special case where we guess the mapping, we do not do this for other cases (such as nearest neighbors graphs)? So I'm also a bit conflicted. Was there a specific reason why you initiated this PR @rcannood? |
Good question! Yes, because I was trying to fix a known issue related to SCE conversion ^^ |
Can't you just do
Isn't this related to the dimnames though, not the key? |
Huh indeed, thanks! It's the |
You're right. That doesn't seem ideal so I've opened an issue scverse/scanpy#3803. I think that's more a scanpy problem though so I don't think it should affect how we name things. |
Thanks, I probably should've done this a while ago 😅 This discussion hinges on how we want to approach conversion, and how much semantics we want to guess, right? One the one hand I do think it makes sense: reducing the barriers for people to work with different data structures is in general a good idea, and this allows people to be less familiar with certain intricacies and conventions of different data structures. But, people might be surprised to see that their dimreds changed name? I think I might be slightly in favor of doing the automatic conversion? But I can see upsides and downsides for sure. |
|
I can also see both sides but I guess I lean the other way. My feeling is that once you start doing this you have to make a lot of decisions about how things get mapped which I would prefer to avoid. You also have to make sure that everything is obvious to the user, it can be overridden and it works in both directions. Probably the only thing that would convince me is if a core package required specific names so that users always have to set the mapping. |
|
To be clear, this PR only affects the code that tries to "guess" how AnnData's should be mapped to SCE/Seurat and back. If the user manually chooses how DRs should be mapped to obsm, the code in this PR doesn't trigger. Maybe it's somewhat related to the feedback we got related to the Bioconductor submission (#342) -- the most important part is probably that the user knows how certain things are mapped to each other and how to override it |
Related to:
Description
Checklist
Before review
Before merge
NEWS