scArches is a package to integrate newly produced single-cell datasets into integrated reference atlases. Our method can facilitate large collaborative projects with decentralised training and integration of multiple datasets by different groups. scArches is compatible with `scanpy <https://scanpy.readthedocs.io/en/stable/>`_, and hosts efficient implementations of all conditional generative models for single-cell data.
9
+
This is a PyTorch version of scArches, which can be found `here <https://github.com/theislab/scArches/>`_. scArches is a package to integrate newly produced single-cell datasets into integrated reference atlases. Our method can facilitate large collaborative projects with decentralised training and integration of multiple datasets by different groups. scArches is compatible with `scanpy <https://scanpy.readthedocs.io/en/stable/>`_, and hosts efficient implementations of all conditional generative models for single-cell data.
10
+
11
+
10
12
11
13
What can you do with scArches?
12
14
-------------------------------
13
-
- Integrate many single-cell datasets and share the trained model and the data (if possible).
14
-
- Download a pre-trained model for your atlas of interest, update it with new datasets and share with your collaborators.
15
-
- Construct a customized reference by downloading a reference atlas, adding a few pre-trained adaptors (datasets), and projecting your own data into this customized reference atlas.
16
-
- Project and integrate query datasets on top of a reference and use the latent representation for downstream tasks, e.g. differential testing and clustering.
15
+
- Construct single or multi-modal (CITE-seq) reference atlases and share the trained model and the data (if possible).
16
+
- Download a pre-trained model for your atlas of interest, update it with new datasets and share with your collaborators.
17
+
- Project and integrate query datasets on top of a reference and use the latent representation for downstream tasks, e.g. differential testing, clustering, and classification.
18
+
19
+
20
+
What are the different models?
21
+
------------------------------
22
+
scArches is itself an algorithm for projecting query datasets on top of a reference and is applicable
23
+
to different models. Here we provide a short explanation and hints on when to use which model. Our models are divided into
24
+
three categories:
25
+
26
+
32
+
33
+
Unsupervised
34
+
This class of algorithms needs no `cell type` labels, meaning that you can create a reference and project a query without having access to cell type labels.
35
+
We implemented two algorithms:
36
+
37
+
- **scVI** (`Lopez et al., 2018 <https://www.nature.com/articles/s41592-018-0229-2>`_): Requires raw count values for data integration and assumes
38
+
  a count distribution on the data (NB, ZINB, or Poisson).
39
+
40
+
- **trVAE** (`Lotfollahi et al., 2019 <https://arxiv.org/abs/1910.01791>`_): Supports either normalized, log-transformed data or count data as input and applies an additional MMD loss to achieve better merging in the latent space.
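To make the MMD (maximum mean discrepancy) term mentioned for trVAE more concrete, here is a minimal sketch of the generic biased MMD estimator with a single RBF kernel, written in plain Python. It is an illustration under assumptions, not the trVAE implementation: the function names, the ``gamma`` bandwidth, and the choice of a single kernel are all hypothetical, and the README does not specify how the loss is actually computed or weighted.

```python
import math


def rbf_kernel(a, b, gamma=1.0):
    """RBF kernel between two equal-length vectors (gamma is an assumed bandwidth)."""
    sq_dist = sum((ai - bi) ** 2 for ai, bi in zip(a, b))
    return math.exp(-gamma * sq_dist)


def mean_kernel(xs, ys, gamma=1.0):
    """Average kernel value over all pairs drawn from xs and ys."""
    total = sum(rbf_kernel(x, y, gamma) for x in xs for y in ys)
    return total / (len(xs) * len(ys))


def mmd_squared(xs, ys, gamma=1.0):
    """Biased estimate of squared MMD between two samples of latent vectors."""
    return (
        mean_kernel(xs, xs, gamma)
        + mean_kernel(ys, ys, gamma)
        - 2.0 * mean_kernel(xs, ys, gamma)
    )


# Hypothetical latent codes for cells from two batches (conditions).
batch_a = [[0.0, 0.1], [0.2, -0.1], [-0.1, 0.0]]
batch_b = [[2.0, 2.1], [1.9, 2.2], [2.1, 1.8]]

gap_same = mmd_squared(batch_a, batch_a)       # identical samples
gap_different = mmd_squared(batch_a, batch_b)  # well-separated samples
```

A value near zero means the two batches are hard to tell apart in the latent space, so penalizing this quantity during training pushes batches to overlap.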
41
+
42
+
Supervised and Semi-supervised
43
+
This class of algorithms assumes the user has access to `cell type` labels when creating the reference data and usually performs better integration
44
+
compared to unsupervised methods. However, the query data can still be unlabeled. In addition to integration, you can classify your query cells using
45
+
these methods.
46
+
47
+
- **scANVI** (`Xu et al., 2019 <https://www.biorxiv.org/content/10.1101/532895v1>`_): Needs cell type labels for the reference data. Your query data can be either unlabeled or labeled. With unlabeled query data, you can also use this method to classify your query cells using the reference labels.
48
+
49
+
Multi-modal
50
+
These algorithms can be used to construct multi-modal reference atlases and map query data from either modality on top of the reference.
51
+
52
+
- **totalVI** (`Gayoso et al., 2019 <https://www.biorxiv.org/content/10.1101/532895v1>`_): This model can be used to build multi-modal CITE-seq reference atlases.
53
+
  Query datasets can be either scRNA-seq or CITE-seq. In addition to integrating the query with the reference, one can use this model to impute the proteins
54
+
  in the query datasets.
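The hints above can be condensed into a small decision helper. This is only a sketch of the guidance in this section, not part of the scArches API: the function ``suggest_models`` and its arguments are hypothetical, and the priority order (multi-modal first, then labels, then counts) is an assumption made for illustration.

```python
def suggest_models(has_reference_labels, has_protein_data, has_raw_counts):
    """Return candidate model names following the hints in this section.

    Hypothetical helper: the ordering of the checks is an assumption,
    not something the scArches documentation prescribes.
    """
    if has_protein_data:
        # Multi-modal (CITE-seq) reference atlases.
        return ["totalVI"]
    if has_reference_labels:
        # Supervised / semi-supervised: reference cell type labels are available.
        return ["scANVI"]
    if has_raw_counts:
        # Unsupervised: scVI requires raw counts, trVAE also accepts them.
        return ["scVI", "trVAE"]
    # Unsupervised with only normalized, log-transformed data.
    return ["trVAE"]


unlabeled_counts = suggest_models(
    has_reference_labels=False, has_protein_data=False, has_raw_counts=True
)
labeled_reference = suggest_models(
    has_reference_labels=True, has_protein_data=False, has_raw_counts=True
)
```

Treat the result as a starting point only; the tutorials linked from the documentation remain the authoritative guide for each model.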
17
55
18
56
Usage and installation
19
57
-------------------------------
@@ -22,7 +60,7 @@ See `here <https://scarches.readthedocs.io/>`_ for documentation and tutorials.
22
60
Support and contribute
23
61
-------------------------------
24
62
If you have a question, a new architecture, or a model that could be integrated into our pipeline, you can
25
-
post an `issue <https://github.com/theislab/scarches/issues/new>`__. Our package currently supports tf/keras; a PyTorch version will be added very soon.
scArches is a package to integrate newly produced single-cell datasets into integrated reference atlases. Our method can facilitate large collaborative projects with decentralised training and integration of multiple datasets by different groups. scArches is compatible with `scanpy <https://scanpy.readthedocs.io/en/stable/>`_, and hosts efficient implementations of all conditional generative models for single-cell data.
9
10
10
11
11
-
12
-
scArches is a package to integrate newly produced single-cell datasets into integrated reference atlases. Our method can facilitate large collaborative projects with decentralised training and integration of multiple datasets by different groups. scArches is compatible with `scanpy <https://scanpy.readthedocs.io/en/stable/>`_, and hosts efficient implementations of all conditional generative models for single-cell data.
13
-
14
12
What can you do with scArches?
15
-
--------------------------------
16
-
- Integrate many single-cell datasets and share the trained model and the data (if possible).
13
+
-------------------------------
14
+
- Construct single or multi-modal (CITE-seq) reference atlases and share the trained model and the data (if possible).
17
15
- Download a pre-trained model for your atlas of interest, update it with new datasets and share with your collaborators.
18
-
- Construct a customized reference by downloading a reference atlas, adding a few pre-trained adaptors (datasets), and projecting your own data into this customized reference atlas.
19
-
- Project and integrate query datasets on top of a reference and use the latent representation for downstream tasks, e.g. differential testing and clustering.
16
+
- Project and integrate query datasets on top of a reference and use the latent representation for downstream tasks, e.g. differential testing, clustering, and classification.
20
17
21
-
Where to start?
22
-
--------------------------------
23
18
19
+
What are the different models?
20
+
------------------------------
21
+
scArches is itself an algorithm for projecting query datasets on top of a reference and is applicable
22
+
to different models. Here we provide a short explanation and hints on when to use which model. Our models are divided into
23
+
three categories:
24
24
25
-
To get a sense of how the model works please go through `this <https://scarches.readthedocs.io/en/latest/pancreas_pipeline.html>`_ example.
26
-
For examples of how to construct, share, or use pre-trained models, check the examples.
27
25
28
-
What is an adaptor?
29
-
--------------------------------
30
-
.. raw:: html
31
+
32
+
Unsupervised
33
+
This class of algorithms needs no `cell type` labels, meaning that you can create a reference and project a query without having access to cell type labels.
34
+
We implemented two algorithms:
35
+
36
+
- **scVI** (`Lopez et al., 2018 <https://www.nature.com/articles/s41592-018-0229-2>`_): Requires raw count values for data integration and assumes
37
+
  a count distribution on the data (NB, ZINB, or Poisson).
- **trVAE** (`Lotfollahi et al., 2019 <https://arxiv.org/abs/1910.01791>`_): Supports either normalized, log-transformed data or count data as input and applies an additional MMD loss to achieve better merging in the latent space.
33
40
34
-
In scArches, each query dataset is added to the reference model by training a set of weights called an `adaptor`.
35
-
Each `adaptor` is a sharable object. This enables users to download a reference model, customise
36
-
that reference model with a set of `adaptors` (datasets), and finally add their own data as a new
37
-
`adaptor` and share this adaptor with others.
41
+
Supervised and Semi-supervised
42
+
This class of algorithms assumes the user has access to `cell type` labels when creating the reference data and usually performs better integration
43
+
compared to unsupervised methods. However, the query data can still be unlabeled. In addition to integration, you can classify your query cells using
44
+
these methods.
38
45
46
+
- **scANVI** (`Xu et al., 2019 <https://www.biorxiv.org/content/10.1101/532895v1>`_): Needs cell type labels for the reference data. Your query data can be either unlabeled or labeled. With unlabeled query data, you can also use this method to classify your query cells using the reference labels.
39
47
48
+
Multi-modal
49
+
These algorithms can be used to construct multi-modal reference atlases and map query data from either modality on top of the reference.
40
50
51
+
- **totalVI** (`Gayoso et al., 2019 <https://www.biorxiv.org/content/10.1101/532895v1>`_): This model can be used to build multi-modal CITE-seq reference atlases.
52
+
  Query datasets can be either scRNA-seq or CITE-seq. In addition to integrating the query with the reference, one can use this model to impute the proteins
53
+
  in the query datasets.
54
+
55
+
56
+
Where to start?
57
+
---------------
58
+
To get a sense of how the model works, please go through `this <https://scarches.readthedocs.io/en/latest/pancreas_pipeline.html>`__ tutorial.
59
+
To find out how to construct, share, or use pre-trained models, see the example sections. Check `this <https://scarches.readthedocs.io/en/latest/zenodo_intestine.html>`__ example to learn how to start from raw data and pre-process it for the model.
60
+
61
+
Reference
62
+
-------------------------------
63
+
If scArches is useful in your research, please consider citing the `preprint <https://www.biorxiv.org/content/10.1101/2020.07.16.205997v1/>`_.
docs/api/index.rst (4 additions & 6 deletions)
@@ -9,17 +9,15 @@ The API reference contains detailed descriptions of the different end-user class
9
9
This API reference only contains end-user documentation.
10
10
If you are looking to hack away at scArches' internals, you will find more detailed comments in the source code.
11
11
12
-
Import scArches as::
12
+
Import scarches as::
13
13
14
14
import scarches as sca
15
15
16
-
After reading the data (``sca.data.read``), you can normalize your data with our ``sca.data.normalize_hvg`` function.
17
-
Then, you can instantiate one of the implemented models from ``sca.models`` module (currently we support ``scArches``,
18
-
``scArches``, ``scArchesNB``, and ``scArchesZINB``) and train it on your dataset. Finally, after training a model on your task, you can
19
-
share your trained model via ``sca.zenodo`` functions. Multiple examples are provided `here`.
16
+
After reading the data (``sca.data.read``), you can instantiate one of the implemented models from the ``sca.models`` module (currently we support ``trVAE``,
17
+
``scVI``, ``scANVI``, and ``TotalVI``) and train it on your dataset.