Commit dc8801e

committed
pytorch version
1 parent 51f9ef4 commit dc8801e

89 files changed (+7122, -6240 lines)

.gitignore

Lines changed: 5 additions & 27 deletions
@@ -1,28 +1,6 @@
1-
2-
#developing directory
3-
/models/
4-
/checkpoint
5-
/dist/
6-
/test/data/
7-
/examples/
8-
/results/
9-
/data/
10-
11-
#OS or Editor files and folders
12-
.DS_Store
13-
Thumbs.db
14-
.ipynb_checkpoints/
15-
.directory
16-
/.idea/
17-
18-
# Python / Byte-compiled / optimized / DLL
1+
datasets/
2+
.idea/
193
__pycache__/
20-
*.py[cod]
21-
*.so
22-
.cache
23-
*.h5ad
24-
#others
25-
*.pdf
26-
*.zip
27-
#testing modules
28-
.pytest_cache/
4+
docs/_build
5+
own_tests/
6+
*.egg-info

.readthedocs.yml

Lines changed: 2 additions & 2 deletions
@@ -3,7 +3,7 @@
33
version: 2
44

55
python:
6-
version: 3.8
6+
version: 3.6
77
install:
88
- requirements: docs/requirements.txt
99
- method: setuptools
@@ -17,4 +17,4 @@ sphinx:
1717

1818
formats:
1919
- epub
20-
- pdf
20+
- pdf

.travis.yml

Lines changed: 3 additions & 1 deletion
@@ -2,11 +2,13 @@ language: python
22
dist: xenial
33
cache: pip
44
python:
5+
- "3.6"
6+
- "3.7"
57
- "3.8"
68

79
install:
810
- pip install -r requirements.txt
911
- python setup.py install
1012

1113
script:
12-
- PYTHONPATH=. pytest
14+
- PYTHONPATH=. pytest

README.rst

Lines changed: 46 additions & 8 deletions
@@ -1,19 +1,57 @@
1-
|PyPI| |travis| |Docs| |PyPIDownloads|
1+
|PyPI| |PyPIDownloads| |Docs| |travis|
22

3-
scArches - single-cell architecture surgery
3+
scArches (PyTorch) - single-cell architecture surgery
44
=========================================================================
55
.. raw:: html
66

77
<img src="https://user-images.githubusercontent.com/33202701/89729020-15f7c200-da32-11ea-989b-1b9a3283f642.png" width="900px" align="center">
88

9-
scArches is a package to integrate newly produced single-cell datasets into integrated reference atlases. Our method can facilitate large collaborative projects with decentralise training and integration of multiple datasets by different groups. scArches is compatible with `scanpy <https://scanpy.readthedocs.io/en/stable/>`_, and hosts efficient implementations of all conditional generative models for single-cell data.
9+
This is a PyTorch version of scArches, which can be found `here <https://github.com/theislab/scArches/>`_. scArches is a package to integrate newly produced single-cell datasets into integrated reference atlases. Our method can facilitate large collaborative projects with decentralised training and integration of multiple datasets by different groups. scArches is compatible with `scanpy <https://scanpy.readthedocs.io/en/stable/>`_, and hosts efficient implementations of all conditional generative models for single-cell data.
10+
11+
1012

1113
What can you do with scArches?
1214
-------------------------------
13-
- Integrate many single-cell datasets and share the trained model and the data (if possible).
14-
- Download a pre-trained model for your atlas of interest, update it with new datasets and share with your collaborators.
15-
- Construct a customized reference by downloading a reference atlas, add a few pre-trained adaptors (datasets) and project your own data in to this customized reference atlas.
16-
- Project and integrate query datasets on the top of a reference and use latent representation for downstream tasks, e.g.: diff testing, clustering.
15+
- Construct single or multi-modal (CITE-seq) reference atlases and share the trained model and the data (if possible).
16+
- Download a pre-trained model for your atlas of interest, update it with new datasets and share with your collaborators.
17+
- Project and integrate query datasets on top of a reference and use the latent representation for downstream tasks, e.g. differential testing, clustering, and classification.
18+
19+
20+
What are different models?
21+
--------------------------
22+
scArches is itself an algorithm to project query datasets on top of reference datasets and is applicable
23+
to different models. Here we provide a short explanation and hints on when to use which model. Our models are divided into
24+
three categories:
25+
26+
27+
What are different models?
28+
--------------------------
29+
scArches is itself an algorithm to project query datasets on top of reference datasets and is applicable
30+
to different models. Here we provide a short explanation and hints on when to use which model. Our models are divided into
31+
three categories:
32+
33+
Unsupervised
34+
This class of algorithms needs no `cell type` labels, meaning that you can create a reference and project a query without having access to cell type labels.
35+
We implemented two algorithms:
36+
37+
- **scVI** (`Lopez et al., 2018 <https://www.nature.com/articles/s41592-018-0229-2>`_): Requires access to raw count values for data integration and assumes
38+
a count distribution on the data (NB, ZINB, Poisson).
39+
40+
- **trVAE** (`Lotfollahi et al., 2019 <https://arxiv.org/abs/1910.01791>`_): Supports either normalized, log-transformed data or count data as input and applies an additional MMD loss to achieve better merging in the latent space.
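
The MMD (maximum mean discrepancy) loss mentioned for trVAE can be illustrated with a small, generic sketch. This is not the scArches implementation: the RBF kernel, the ``gamma`` parameter, and the function names below are assumptions chosen for illustration, and the kernel and weighting that trVAE actually uses are not stated here. The sketch computes the biased squared-MMD estimate between two sets of latent vectors; a value near zero suggests the two sets are well mixed.

```python
import math
from typing import Sequence

Vector = Sequence[float]


def rbf_kernel(x: Vector, y: Vector, gamma: float = 1.0) -> float:
    """Gaussian (RBF) kernel between two vectors of equal length."""
    squared_distance = sum((a - b) ** 2 for a, b in zip(x, y))
    return math.exp(-gamma * squared_distance)


def mean_kernel(xs: Sequence[Vector], ys: Sequence[Vector], gamma: float) -> float:
    """Average kernel value over all pairs drawn from xs and ys."""
    total = sum(rbf_kernel(x, y, gamma) for x in xs for y in ys)
    return total / (len(xs) * len(ys))


def mmd_squared(xs: Sequence[Vector], ys: Sequence[Vector], gamma: float = 1.0) -> float:
    """Biased estimate of squared MMD: E[k(x,x')] + E[k(y,y')] - 2 E[k(x,y)]."""
    return (
        mean_kernel(xs, xs, gamma)
        + mean_kernel(ys, ys, gamma)
        - 2.0 * mean_kernel(xs, ys, gamma)
    )


if __name__ == "__main__":
    # Hypothetical 2-D latent codes for two batches (made-up numbers).
    batch_a = [[0.0, 0.0], [0.1, 0.0]]
    batch_b = [[5.0, 5.0], [5.1, 5.0]]
    print("same batch:", mmd_squared(batch_a, batch_a))
    print("different batches:", mmd_squared(batch_a, batch_b))
```

Used as a penalty during training, a term of this kind pushes the latent distributions of different batches toward each other, which is the "better merging" the bullet above refers to.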
41+
42+
Supervised and Semi-supervised
43+
This class of algorithms assumes the user has access to `cell type` labels when creating the reference data and usually performs better integration
44+
compared to unsupervised methods. However, the query data can still be unlabeled. In addition to integration, you can classify your query cells using
45+
these methods.
46+
47+
- **scANVI** (`Xu et al., 2019 <https://www.biorxiv.org/content/10.1101/532895v1>`_): Needs cell type labels for the reference data. Your query data can be either unlabeled or labeled. If the query data is unlabeled, you can also use this method to classify your query cells using the reference labels.
48+
49+
Multi-modal
50+
These algorithms can be used to construct multi-modal reference atlases and map query data from either modality on top of the reference.
51+
52+
- **totalVI** (`Gayoso et al., 2019 <https://www.biorxiv.org/content/10.1101/532895v1>`_): This model can be used to build multi-modal CITE-seq reference atlases.
53+
Query datasets can be either from scRNA-seq or CITE-seq. In addition to integrating the query with the reference, one can use this model to impute the proteins
54+
in the query datasets.
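
The hints above on when to use which model can be summarised as a small decision helper. This is only a sketch of the README's guidance, not part of the scArches API: the function ``suggest_models`` and its parameters are hypothetical, and the returned strings are simply the model names used in the README.

```python
from typing import List


def suggest_models(
    has_cell_type_labels: bool,
    has_protein_data: bool,
    has_raw_counts: bool,
) -> List[str]:
    """Map properties of the reference data to the README's model categories.

    Hypothetical helper: it encodes only the hints given in the README.
    """
    if has_protein_data:
        # Multi-modal category: CITE-seq reference atlases.
        return ["totalVI"]
    if has_cell_type_labels:
        # Supervised and semi-supervised category: labels on the reference.
        return ["scANVI"]
    if has_raw_counts:
        # Unsupervised category: scVI needs raw counts; trVAE also accepts them.
        return ["scVI", "trVAE"]
    # Unsupervised category with normalized, log-transformed data only.
    return ["trVAE"]


if __name__ == "__main__":
    print(suggest_models(has_cell_type_labels=False,
                         has_protein_data=False,
                         has_raw_counts=True))
```

The order of the checks is a design choice for this sketch (modality first, then labels, then input type); real projects may weigh these criteria differently.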
1755

1856
Usage and installation
1957
-------------------------------
@@ -22,7 +60,7 @@ See `here <https://scarches.readthedocs.io/>`_ for documentation and tutorials.
2260
Support and contribute
2361
-------------------------------
2462
If you have a question or new architecture or a model that could be integrated into our pipeline, you can
25-
post an `issue <https://github.com/theislab/scarches/issues/new>`__. Our package supports tf/keras now but pytorch version will be added very soon.
63+
post an `issue <https://github.com/theislab/scarches/issues/new>`__ or reach us by `email <mailto:[email protected],[email protected],[email protected]>`_.
2664

2765

2866
Reference

__init__.py

Lines changed: 1 addition & 0 deletions
@@ -0,0 +1 @@
1+
from . import scarches

docs/about.rst

Lines changed: 44 additions & 20 deletions
@@ -1,43 +1,66 @@
11
|PyPI| |travis| |Docs|
22

3-
scArches - single-cell architecture surgery
3+
scArches (PyTorch) - single-cell architecture surgery
44
=========================================================================
55
.. raw:: html
66

77
<img src="https://user-images.githubusercontent.com/33202701/89729020-15f7c200-da32-11ea-989b-1b9a3283f642.png" width="700px" align="center">
88

9+
scArches is a package to integrate newly produced single-cell datasets into integrated reference atlases. Our method can facilitate large collaborative projects with decentralised training and integration of multiple datasets by different groups. scArches is compatible with `scanpy <https://scanpy.readthedocs.io/en/stable/>`_ and hosts efficient implementations of all conditional generative models for single-cell data.
910

1011

11-
12-
scArches is a package to integrate newly produced single-cell datasets into integrated references atlases. Our method can facilitate large collaborative projects with decentralise training and integration of multiple datasets by different groups. scArches is compatible with `scanpy <https://scanpy.readthedocs.io/en/stable/>`_. and hosts efficient implementations of all conditional generative models for single-cell data.
13-
1412
What can you do with scArches?
15-
--------------------------------
16-
- Integrate many single-cell datasets and share the trained model and the data (if possible).
13+
-------------------------------
14+
- Construct single or multi-modal (CITE-seq) reference atlases and share the trained model and the data (if possible).
1715
- Download a pre-trained model for your atlas of interest, update it with new datasets and share with your collaborators.
18-
- Construct a customized reference by downloading a reference atlas, add a few pre-trained adaptors (datasets) and project your own data in to this customized reference atlas.
19-
- Project and integrate query datasets on the top of a reference and use latent representation for downstream tasks, e.g.:diff testing, clustering.
16+
- Project and integrate query datasets on top of a reference and use the latent representation for downstream tasks, e.g. differential testing, clustering, and classification.
2017

21-
Where to start?
22-
--------------------------------
2318

19+
What are different models?
20+
--------------------------
21+
scArches is itself an algorithm to project query datasets on top of reference datasets and is applicable
22+
to different models. Here we provide a short explanation and hints on when to use which model. Our models are divided into
23+
three categories:
2424

25-
To get a sense of how the model works please go through `this <https://scarches.readthedocs.io/en/latest/pancreas_pipeline.html>`_ example.
26-
For examples on how to use or construct and share pre-trained models check examples.
2725

28-
What is an adaptor?
29-
--------------------------------
30-
.. raw:: html
26+
What are different models?
27+
--------------------------
28+
scArches is itself an algorithm to project query datasets on top of reference datasets and is applicable
29+
to different models. Here we provide a short explanation and hints on when to use which model. Our models are divided into
30+
three categories:
31+
32+
Unsupervised
33+
This class of algorithms needs no `cell type` labels, meaning that you can create a reference and project a query without having access to cell type labels.
34+
We implemented two algorithms:
35+
36+
- **scVI** (`Lopez et al., 2018 <https://www.nature.com/articles/s41592-018-0229-2>`_): Requires access to raw count values for data integration and assumes
37+
a count distribution on the data (NB, ZINB, Poisson).
3138

32-
<img src="https://user-images.githubusercontent.com/33202701/89730296-bdc6bd00-da3d-11ea-9012-410e22fa200a.png" width="200px" align="right">
39+
- **trVAE** (`Lotfollahi et al., 2019 <https://arxiv.org/abs/1910.01791>`_): Supports either normalized, log-transformed data or count data as input and applies an additional MMD loss to achieve better merging in the latent space.
3340

34-
In scArche, each query datasets is added to the reference model by training a set of weights called `adaptor`.
35-
Each `adaptor` is a sharable object. This will enable users to download a reference model, customise
36-
that reference model with a set of `adaptors` (datasets) and finally add user data as a new
37-
`adaptor` and also share this adaptor for others.
41+
Supervised and Semi-supervised
42+
This class of algorithms assumes the user has access to `cell type` labels when creating the reference data and usually performs better integration
43+
compared to unsupervised methods. However, the query data can still be unlabeled. In addition to integration, you can classify your query cells using
44+
these methods.
3845

46+
- **scANVI** (`Xu et al., 2019 <https://www.biorxiv.org/content/10.1101/532895v1>`_): Needs cell type labels for the reference data. Your query data can be either unlabeled or labeled. If the query data is unlabeled, you can also use this method to classify your query cells using the reference labels.
3947

48+
Multi-modal
49+
These algorithms can be used to construct multi-modal reference atlases and map query data from either modality on top of the reference.
4050

51+
- **totalVI** (`Gayoso et al., 2019 <https://www.biorxiv.org/content/10.1101/532895v1>`_): This model can be used to build multi-modal CITE-seq reference atlases.
52+
Query datasets can be either from scRNA-seq or CITE-seq. In addition to integrating the query with the reference, one can use this model to impute the proteins
53+
in the query datasets.
54+
55+
56+
Where to start?
57+
---------------
58+
To get a sense of how the model works please go through `this <https://scarches.readthedocs.io/en/latest/pancreas_pipeline.html>`__ tutorial.
59+
To find out how to construct, share, or use pre-trained models, see the example sections. Check `this <https://scarches.readthedocs.io/en/latest/zenodo_intestine.html>`__ example to learn how to start with raw data and pre-process it for the model.
60+
61+
Reference
62+
-------------------------------
63+
If scArches is useful in your research, please consider citing the `preprint <https://www.biorxiv.org/content/10.1101/2020.07.16.205997v1/>`_.
4164

4265

4366
.. |PyPI| image:: https://img.shields.io/pypi/v/scarches.svg
@@ -51,3 +74,4 @@ that reference model with a set of `adaptors` (datasets) and finally add user da
5174

5275
.. |travis| image:: https://travis-ci.com/theislab/scarches.svg?branch=master
5376
:target: https://travis-ci.com/theislab/scarches
77+
Lines changed: 1 addition & 1 deletion
@@ -1,7 +1,7 @@
11
Data Processing
22
===============
33

4-
.. automodule:: scarches.data
4+
.. automodule:: scarches.dataset
55
:members:
66
:undoc-members:
77
:show-inheritance:

docs/api/datasets.rst

Lines changed: 0 additions & 7 deletions
This file was deleted.

docs/api/index.rst

Lines changed: 4 additions & 6 deletions
@@ -9,17 +9,15 @@ The API reference contains detailed descriptions of the different end-user class
99
This API reference only contains end-user documentation.
1010
If you are looking to hack away at scArches' internals, you will find more detailed comments in the source code.
1111

12-
Import scArches as::
12+
Import scarches as::
1313

1414
import scarches as sca
1515

16-
After reading the data (``sca.data.read``), you can normalize your data with our ``sca.data.normalize_hvg`` function.
17-
Then, you can instantiate one of the implemented models from ``sca.models`` module (currently we support ``scArches``,
18-
``scArches``, ``scArchesNB``, and ``scArchesZINB``) and train it on your dataset. Finally, after training a model on your task, You can
19-
share your trained model via ``sca.zenodo`` functions. Multiple examples are provided in `here`.
16+
After reading the data (``sca.data.read``), you can instantiate one of the implemented models from the ``sca.models`` module (currently we support ``trVAE``,
17+
``scVI``, ``scANVI``, and ``TotalVI``) and train it on your dataset.
2018

2119
.. toctree::
2220
:glob:
2321
:maxdepth: 2
2422

25-
*
23+
*

docs/api/models.rst

Lines changed: 30 additions & 3 deletions
@@ -1,12 +1,39 @@
11
Models
22
======
33

4-
* `scArches`_
4+
* `trVAE`_
5+
* `scVI`_
6+
* `scANVI`_
7+
* `TotalVI`_
58

6-
scArches
9+
trVAE
10+
-----
11+
12+
.. autoclass:: scarches.models.TRVAE
13+
:members:
14+
:undoc-members:
15+
:show-inheritance:
16+
17+
scVI
18+
----
19+
20+
.. autoclass:: scarches.models.SCVI
21+
:members:
22+
:undoc-members:
23+
:show-inheritance:
24+
25+
scANVI
26+
--------
27+
28+
.. autoclass:: scarches.models.SCANVI
29+
:members:
30+
:undoc-members:
31+
:show-inheritance:
32+
33+
TotalVI
734
--------
835

9-
.. autoclass:: scarches.models.scArches
36+
.. autoclass:: scarches.models.TOTALVI
1037
:members:
1138
:undoc-members:
1239
:show-inheritance:
