Zarr support #190
…dataR into keller-mark/zarr
…milar to HDF5AnnData
Simplify how obs and var names handled in ZarrAnnData
More zarr-related changes
Update comments
rcannood
left a comment
Fantastic work @keller-mark and @Artur-man !
I went through the PR for a first time and left some minor comments. I will review the code by running it a couple of times next :)
R/read_zarr_helpers.R
Outdated
attrs <- g$get_attrs()$to_list()

if (!all(c("encoding-type", "encoding-version") %in% names(attrs))) {
  path <- name
There are a lot of linting issues in this file. Could you run lintr::lint_package() and fix any issues that pop up?
done ... did a full lint_package check and corrected some R check issues too!
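As a rough illustration of the attribute check quoted from R/read_zarr_helpers.R above, here is a minimal base-R sketch. The helper name `has_encoding_attrs` and the example attribute values are assumptions for illustration only; a plain named list stands in for whatever `g$get_attrs()$to_list()` returns in the PR.

```r
# Hypothetical helper (not taken from the PR): checks that a named list of
# group attributes carries both encoding keys used in the quoted excerpt.
has_encoding_attrs <- function(attrs) {
  required <- c("encoding-type", "encoding-version")
  all(required %in% names(attrs))
}

# Illustrative attribute lists; the values are assumptions.
encoded <- list(`encoding-type` = "anndata", `encoding-version` = "0.1.0")
plain <- list(foo = 1)

has_encoding_attrs(encoded) # both keys present
has_encoding_attrs(plain)   # neither key present
has_encoding_attrs(list())  # an empty list has no names at all
```

A group that fails this check would take the fallback branch shown in the excerpt.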
inst/extdata/example2.zarr/.zgroup
Outdated
This file should probably be removed
done
I have quickly checked if
I will be in touch to see whether these are resolved in the future. Otherwise, no Zarr R package is currently both on CRAN/BioC and functionally complete.
There is some progress in the Rarr package. Would you prefer a clean PR (since there have been so many updates since), or is continuing here fine?
Probably whatever is easiest for you and @keller-mark, or whatever makes the PR easiest to understand. There have been a lot of changes to the package since this was opened, so we would need to make sure those are included here. I saw that {Rarr} is planning to have Zarr v3 support in the next release, so I think that makes sense in terms of which backend package to use.
I adapted the changes; it was fairly easy. I will continue here and refer back to Hugo again if needed.
I'm going to make this a draft for now, just to help with our organisation. Please let us know when it's ready to review.
Guys, I am close .... I will add a new anndataR/inst/scripts/example_h5ad.py Lines 1 to 7 in 5ffd79a
We tried to cover as many cases as possible in the example dataset so I would probably modify the script to also output a (tarred/zipped) Zarr as well as an H5AD (and maybe rename it to something like I noted down the versions just in case it became important later but maybe it makes sense to store them in a separate file. If you are regenerating the dataset I would probably also update the environment. I think there is also a version number in the script that should be bumped and a changelog that should be updated.
I have tested this in a couple of anndata examples as well as test datasets from https://github.com/HelenaLC/SpatialData and https://github.com/HelenaLC/SpatialData.data, and it seems to be working. We are currently waiting for structure data and scalar support from Rarr: |
Fixes #91
These changes are from both me and @Artur-man
The main public-facing changes here are:
- ZarrAnnData class
- read_zarr and write_zarr top-level functions
- from_Seurat(output_class="ZarrAnnData")
- from_SingleCellExperiment(output_class="ZarrAnnData")
Internally:
- read_zarr_helpers.R is the zarr analog of read_h5ad_helpers.R
- write_zarr_helpers.R is the zarr analog of write_h5ad_helpers.R
- inst/extdata/example.zarr (this makes the diff noisy, apologies)
- test-Zarr-read.R (35 new tests)
- test-Zarr-write.R (70)
- test-ZarrAnnData.R (26)
- test-h5ad-zarr.R (17)
A number of these functions generate warnings in the R console that are intended to be followed up on to improve the code (and should probably be resolved as end users may not appreciate them), but the tests still pass despite these warnings.
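One possible way to follow up on such warnings is to collect them while still getting the result of the call, which also shows why a test that only checks the value keeps passing. This is a minimal base-R sketch; `collect_warnings` and `noisy_read` are hypothetical names that are not part of anndataR.

```r
# Hypothetical helper: evaluates an expression, records every warning message
# it raises, and still returns the value the expression produces.
collect_warnings <- function(expr) {
  messages <- character()
  value <- withCallingHandlers(
    expr,
    warning = function(w) {
      messages <<- c(messages, conditionMessage(w))
      invokeRestart("muffleWarning") # resume evaluation after recording
    }
  )
  list(value = value, warnings = messages)
}

# Stand-in for a reader that warns but still returns a usable result.
noisy_read <- function() {
  warning("first thing to follow up on")
  warning("second thing to follow up on")
  42
}

res <- collect_warnings(noisy_read())
res$value    # the value is unaffected by the warnings
res$warnings # the messages that would otherwise reach the console
```

The collected messages could then be reviewed one by one, or turned into explicit expectations in the tests.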
Known things that are not implemented here:
- recarrays
- mode = c("r", "r+", "a", "w", "w-", "x") parameter value
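For reference, a `mode` argument of this shape could be validated with `match.arg()`. The sketch below is only an illustration: `plan_open` is a hypothetical function, and the behaviour attached to each value follows the convention used by zarr-python and h5py, which anndataR may or may not adopt.

```r
# Hypothetical sketch: decide what to do with a store given `mode` and
# whether the path already exists. Not part of anndataR.
plan_open <- function(path_exists, mode = c("r", "r+", "a", "w", "w-", "x")) {
  mode <- match.arg(mode) # defaults to "r"; unknown values raise an error
  switch(mode,
    # "r" (read-only) and "r+" (read/write) both require an existing store
    "r" = ,
    "r+" = if (path_exists) "open" else stop("store does not exist"),
    # "a": read/write, creating the store if it is missing
    "a" = if (path_exists) "open" else "create",
    # "w": always start fresh, overwriting an existing store
    "w" = if (path_exists) "overwrite" else "create",
    # "w-" and "x": create only, refusing to touch an existing store
    "w-" = ,
    "x" = if (path_exists) stop("store already exists") else "create"
  )
}

plan_open(TRUE)        # default mode "r" on an existing store
plan_open(FALSE, "a")  # missing store is created
plan_open(TRUE, "w")   # existing store is overwritten
```

Returning a plain action string keeps the sketch small; a real implementation would also need to track whether the opened store is writable.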