infercna aims to provide functions for inferring CNA values from scRNA-seq data and related queries.
- infercna() to infer copy-number alterations from single-cell RNA-seq data
- refCorrect() to convert relative CNA values to absolute values
  + computed in infercna() if reference cells are provided
- cnaPlot() to plot a heatmap of CNA values
- cnaScatterPlot() to visualise malignant and non-malignant cell subsets
- cnaCor() a parameter to identify cells with high CNAs
  + computed in cnaScatterPlot()
- cnaSignal() a second parameter to identify cells with high CNAs
  + computed in cnaScatterPlot()
- findMalignant() to find malignant subsets of cells
- findClones() to identify genetic subclones
- fitBimodal() to fit a bimodal gaussian distribution
  + used in findMalignant()
  + used in findClones()
- filterGenes() to filter genes by their genome features
- splitGenes() to split genes by their genome features
- orderGenes() to order genes by their genomic position
- useGenome() to change the default genome configured with infercna
- addGenome() to configure infercna with a new genome specified by the user
See the Reference tab for the full list of functions and their documentation pages.
To install infercna:
# install.packages("devtools")
devtools::install_github("jlaffy/infercna")

The methodology behind infercna has been tried and tested in several high-impact publications. The idea of inferring CNAs from single-cell RNA-sequencing data was first formulated in the earliest of these papers (last listed).
The minimum input required by infercna is:
- a single-cell expression matrix of genes by cells
- not centered
- normalised for sequencing depth and gene length (e.g. one of TPM, RPKM, CPM, etc).
- optionally in log space, e.g. log2(TPM/10 + 1)
- Note: see also infercna::TPM and infercna::logTPM
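The input requirements above can be checked in base R before running the package. The following is a minimal sketch, not part of infercna: the toy matrix `tpm` and the helpers `to_log_tpm()` and `is_valid_input()` are hypothetical names introduced here for illustration. It applies the log2(TPM/10 + 1) transform quoted above and treats negative values as a sign that the matrix has already been centered.

```r
# Toy genes-by-cells TPM matrix (made-up values, for illustration only).
tpm <- matrix(
  c(  0,  10,  30,
     70, 150,   0,
    310,   0, 630),
  nrow = 3, byrow = TRUE,
  dimnames = list(c("geneA", "geneB", "geneC"),
                  c("cell1", "cell2", "cell3"))
)

# Hypothetical helper: the log2(TPM/10 + 1) transform mentioned above.
to_log_tpm <- function(m) log2(m / 10 + 1)

log_tpm <- to_log_tpm(tpm)

# Hypothetical sanity check: a numeric matrix with gene and cell names
# and no negative values (negative values suggest a centered matrix).
is_valid_input <- function(m) {
  is.matrix(m) && is.numeric(m) &&
    !is.null(rownames(m)) && !is.null(colnames(m)) &&
    all(m >= 0, na.rm = TRUE)
}

is_valid_input(log_tpm)
```

A row-centered copy, such as `log_tpm - rowMeans(log_tpm)`, fails this check because it contains negative values.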
If you would like to compute absolute (rather than relative) CNA values, you should additionally provide:
- a list of length two or more containing reference cell IDs of normal
cells. For example list(macrophages, oligodendrocytes).
- see example reference infercna::refCells
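As an illustration of the reference format described above, here is a minimal base R sketch. The cell IDs, the group contents and the helper `is_valid_reference()` are all hypothetical. The check that every reference ID appears among the matrix column names is an assumption made here, not a documented infercna requirement.

```r
# Hypothetical cell IDs; in practice these would be the column names
# of the expression matrix.
cell_ids <- c("mac_1", "mac_2", "oligo_1", "oligo_2", "tumour_1", "tumour_2")

# Two reference groups of normal cells (illustrative IDs only).
macrophages      <- c("mac_1", "mac_2")
oligodendrocytes <- c("oligo_1", "oligo_2")
ref_cells <- list(macrophages = macrophages,
                  oligodendrocytes = oligodendrocytes)

# Hypothetical check: a list of at least two non-empty character vectors
# whose IDs are all present in the matrix (assumed requirement).
is_valid_reference <- function(ref, ids) {
  is.list(ref) && length(ref) >= 2 &&
    all(vapply(ref, function(x) is.character(x) && length(x) > 0,
               logical(1))) &&
    all(unlist(ref, use.names = FALSE) %in% ids)
}

is_valid_reference(ref_cells, cell_ids)
```

A list with a single group, such as `list(macrophages)`, fails the check, in line with the "length two or more" requirement.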
Finally, if your genome is not available in the current implementation of infercna, you should additionally provide:
- a genome dataframe containing the columns symbol, chromosome_name, start_position and arm.
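A genome table with these four columns might look like the sketch below. The gene names, coordinates and the format of the arm values are invented for illustration, and the format infercna expects for arm is an assumption. The call to addGenome() is left out because its arguments are not described here.

```r
# Hypothetical genome table with the four required columns.
# Gene names, coordinates and arm labels are made up.
genome <- data.frame(
  symbol          = c("geneC", "geneA", "geneD", "geneB"),
  chromosome_name = c("2", "1", "1", "1"),
  start_position  = c(5000, 1000, 90000, 40000),
  arm             = c("2p", "1p", "1q", "1p"),
  stringsAsFactors = FALSE
)

# Confirm that all required columns are present.
required_cols <- c("symbol", "chromosome_name", "start_position", "arm")
stopifnot(all(required_cols %in% colnames(genome)))

# Order genes by chromosome, then by start position.
# Note: chromosome names are sorted as text here, which is fine for this
# toy example but would place "10" before "2" in a real genome; a factor
# with explicit levels avoids that.
genome_ordered <- genome[order(genome$chromosome_name,
                               genome$start_position), ]
genome_ordered$symbol
```

Ordering by genomic position matters because CNA inference relies on genes being arranged along each chromosome.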
infercna ships with two example scRNA-seq datasets from two
patients with glioblastoma, infercna::bt771 and infercna::mgh125,
along with two normal reference groups, infercna::refCells. The
matrices are stored as sparse matrices, and you can use
infercna::useData() to load them as normal matrices. These patients
are taken from a much larger cohort of 28 glioblastoma samples. You can
read the complete study
here and download the
complete dataset via the Single Cell
Portal.
Future implementations will include:
- more default genomes to choose from
- option to correct CNA values (to absolute) when just one reference is available.
- more stuff…