|
1918 | 1918 | "source": [ |
1919 | 1919 | "The `setup_anndata()` function itself is quite simple since any complexity in preprocessing is contained within the `AnnDataField` functions. By factorizing the preprocessing steps into each subclass, model developers can easily extend and reuse logic across models and fields." |
1920 | 1920 | ] |
| 1921 | + }, |
| 1922 | + { |
| 1923 | + "cell_type": "markdown", |
| 1924 | + "id": "872b95f9-f17d-49c7-956e-d7b9005a4f3a", |
| 1925 | + "metadata": {}, |
| 1926 | + "source": [ |
| 1927 | + "## Custom DataLoaders\n", |
| 1928 | + "\n", |
| 1929 | + "In scvi-tools, custom dataloaders let you build a tailored data pipeline for formats or datasets that the default loaders do not cover. A custom dataloader is useful when your data has a specific structure or needs particular preprocessing before it is fed into the model.\n", |
| 1930 | + "See more [here](https://docs.scvi-tools.org/en/stable/user_guide/use_case/custom_dataloaders.html).\n", |
| 1931 | + "\n", |
| 1932 | + "In scvi-tools, a custom dataloader is a class that inherits from `LightningDataModule`. It creates batches of data from an external source and feeds them into an scvi-tools PyTorch model during training and inference.\n", |
| 1933 | + "\n", |
| 1934 | + "Because each one is tailor-made for a specific data source, custom dataloaders differ from one another.\n", |
| 1935 | + "Nevertheless, some common building blocks are required to create one:\n", |
| 1936 | + "- a 'linkage' to the data source that the custom dataloader needs to query.\n", |
| 1937 | + "- `batch_key` is the key for batch information.\n", |
| 1938 | + "- `labels_key` is the key for label information.\n", |
| 1939 | + "- `unlabeled_category` is the category used for the unlabeled group.\n", |
| 1940 | + "- `train_dataloader` is a function that creates the training-set PyTorch DataLoader.\n", |
| 1941 | + "- `val_dataloader` is a function that creates the validation-set PyTorch DataLoader.\n", |
| 1942 | + "- `registry` is a manual implementation of the scvi-tools registry: a dict filled with information taken from the datamodule itself.\n", |
| 1943 | + " Note that each datamodule has its own registry implementation. Currently only SCVI and SCANVI are supported, but the registry should be generic enough to be extended to other models.\n" |
| 1944 | + ] |
| 1945 | + }, |
| 1946 | + { |
| 1947 | + "cell_type": "code", |
| 1948 | + "execution_count": null, |
| 1949 | + "id": "1f31df4c-58d2-403b-8f87-d39f9e9cd5d8", |
| 1950 | + "metadata": {}, |
| 1951 | + "outputs": [], |
| 1952 | + "source": [] |
1921 | 1953 | } |
1922 | 1954 | ], |
1923 | 1955 | "metadata": { |
|
1926 | 1958 | "provenance": [] |
1927 | 1959 | }, |
1928 | 1960 | "kernelspec": { |
1929 | | - "display_name": "scvi-tools-dev", |
| 1961 | + "display_name": "Python 3 (ipykernel)", |
1930 | 1962 | "language": "python", |
1931 | 1963 | "name": "python3" |
1932 | 1964 | }, |
|
1940 | 1972 | "name": "python", |
1941 | 1973 | "nbconvert_exporter": "python", |
1942 | 1974 | "pygments_lexer": "ipython3", |
1943 | | - "version": "3.12.6" |
| 1975 | + "version": "3.12.8" |
1944 | 1976 | }, |
1945 | 1977 | "vscode": { |
1946 | 1978 | "interpreter": { |
|
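The building blocks listed in the new "Custom DataLoaders" cell can be illustrated with a small framework-free sketch. Everything here is an assumption made for illustration: `ToyDataModule`, the `RECORDS` list, and the registry layout are hypothetical and do not reproduce the scvi-tools registry schema. A real datamodule would inherit from `LightningDataModule` and return PyTorch DataLoader objects rather than plain generators.

```python
from dataclasses import dataclass
from typing import Iterator

# Hypothetical stand-in for an external data source: a list of records.
RECORDS = [
    {"counts": [1, 0, 3], "batch": "b1", "cell_type": "T"},
    {"counts": [0, 2, 1], "batch": "b2", "cell_type": "B"},
    {"counts": [4, 1, 0], "batch": "b1", "cell_type": "Unknown"},
    {"counts": [2, 2, 2], "batch": "b2", "cell_type": "T"},
]


@dataclass
class ToyDataModule:
    """Framework-free sketch; a real one would subclass LightningDataModule."""

    source: list                          # the 'linkage' to the data source
    batch_key: str = "batch"              # key for batch information
    labels_key: str = "cell_type"         # key for label information
    unlabeled_category: str = "Unknown"   # category used for unlabeled cells
    batch_size: int = 2
    train_fraction: float = 0.75

    def _split(self) -> tuple:
        # Deterministic head/tail split into training and validation rows.
        n_train = int(len(self.source) * self.train_fraction)
        return self.source[:n_train], self.source[n_train:]

    def _batches(self, rows: list) -> Iterator[list]:
        # Yield consecutive mini-batches of at most `batch_size` rows.
        for start in range(0, len(rows), self.batch_size):
            yield rows[start:start + self.batch_size]

    def train_dataloader(self) -> Iterator[list]:
        return self._batches(self._split()[0])

    def val_dataloader(self) -> Iterator[list]:
        return self._batches(self._split()[1])

    @property
    def registry(self) -> dict:
        # Hypothetical layout, NOT the real scvi-tools registry schema:
        # a dict filled with information taken from the datamodule itself.
        return {
            "batch_key": self.batch_key,
            "labels_key": self.labels_key,
            "unlabeled_category": self.unlabeled_category,
            "batch_categories": sorted({r[self.batch_key] for r in self.source}),
            "label_categories": sorted({r[self.labels_key] for r in self.source}),
            "n_vars": len(self.source[0]["counts"]),
            "n_obs": len(self.source),
        }


dm = ToyDataModule(RECORDS)
train_batches = list(dm.train_dataloader())
val_batches = list(dm.val_dataloader())
registry = dm.registry
```

The registry is derived entirely from the datamodule's own attributes and data source, which mirrors the point made in the cell: each datamodule carries its own registry implementation, so a model can be configured without an AnnData object.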