The scNOMe-seq (single-cell Nucleosome Occupancy and Methylome sequencing) analysis pipeline consists of two main modules for detecting nucleosome-depleted regions (NDRs) from single-cell bisulfite sequencing data.
source: https://www.illumina.com/science/sequencing-method-explorer/kits-and-arrays/nome-seq.html
- Software: conda (see installing Miniconda3)
- Dataset: genome assembly and annotation from UCSC
Note: refer to config/config.yaml for details
git clone https://github.com/sherryxuePKU/scNOMe-seq.git
cd scNOMe-seq
conda install -c conda-forge -c bioconda snakemake
conda env create -n scNOMeSeq -f base.yaml
conda env create -n py2 -f py2.yaml
scNOMe_xxh/
├── config/
│ ├── batches.xls # Pseudobulks to run
│ ├── cells.xls # Single cells to run
│ └── config.yaml
├── envs/ # conda environment definitions
├── resources/ # HPC settings
├── workflow/
│ ├── S01_cell_preprocess.smk # Single-cell preprocessing module
│ ├── S02_NDR_analysis.smk # NDR detection module
│ ├── scripts/
│ └── rules/
└── run_pipeline.sh
Step 1: Single-Cell Preprocessing (S01_cell_preprocess.smk)
Setting: Update config/cells.xls to list the cells to process (one cell per row).
Purpose: Process the raw data of each single cell through trimming, alignment, methylation extraction, and quality control
Components:
- Trimming: Remove adapters and low-quality bases using Trim Galore
- Alignment: Map reads to reference genome using Bismark
- Methylation Extraction: Extract CpG and GpC methylation using methylpy
- Statistics: Generate alignment and methylation statistics
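The methylation extraction step separates endogenous CpG methylation from the GpC accessibility signal. A minimal sketch of the trinucleotide convention commonly used for NOMe-seq data is shown below. The function name classify_cytosine is hypothetical and is not part of this pipeline; the pipeline's own context handling lives in its workflow rules and may differ.

```python
def classify_cytosine(seq: str, pos: int) -> str:
    """Classify the cytosine at seq[pos] (forward strand) by NOMe-seq context.

    WCG:   C preceded by A/T and followed by G (endogenous CpG methylation)
    GCH:   C preceded by G and followed by A/C/T (M.CviPI accessibility signal)
    GCG:   ambiguous between the two signals, usually discarded
    other: any remaining context
    """
    seq = seq.upper()
    if seq[pos] != "C":
        raise ValueError("position is not a cytosine")
    # Treat sequence edges as unknown bases.
    prev_base = seq[pos - 1] if pos > 0 else "N"
    next_base = seq[pos + 1] if pos + 1 < len(seq) else "N"
    if prev_base == "G" and next_base == "G":
        return "GCG"
    if prev_base in "AT" and next_base == "G":
        return "WCG"
    if prev_base == "G" and next_base in "ACT":
        return "GCH"
    return "other"


if __name__ == "__main__":
    for trinucleotide in ("ACG", "GCA", "GCG", "CCG"):
        print(trinucleotide, classify_cytosine(trinucleotide, 1))
```

Restricting CpG calls to WCG and GpC calls to GCH avoids sites where endogenous methylation and M.CviPI methylation cannot be told apart.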
# run locally
snakemake -s workflow/S01_cell_preprocess.smk
# run on a SLURM HPC cluster
bash run_pipeline.sh workflow/S01_cell_preprocess.smk
Step 2: Quality Control Assessment
Purpose: Filter cells based on bisulfite conversion efficiency and in vitro methylation efficiency
Quality Metrics:
- C-to-T Conversion Rate: Proportion of unmethylated WCG sites in the spike-in (such as lambda DNA)
- In vitro M.CviPI Methylation Efficiency: Proportion of methylated GCH sites in the spike-in
Setting: Update config/batches.xls to list only the cells that pass quality control.
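The two quality metrics in this step can be computed from spike-in cytosine counts. The sketch below is illustrative only: the SpikeInCounts class, the function names, and both default thresholds (0.98 and 0.90) are assumptions made for this example, not values taken from the pipeline. Choose cutoffs appropriate for your own data.

```python
from dataclasses import dataclass


@dataclass
class SpikeInCounts:
    """Per-cell cytosine counts on the spike-in (hypothetical container)."""
    wcg_methylated: int
    wcg_total: int
    gch_methylated: int
    gch_total: int


def conversion_rate(counts: SpikeInCounts) -> float:
    """Bisulfite conversion rate: fraction of unmethylated WCG calls."""
    if counts.wcg_total == 0:
        raise ValueError("no WCG coverage on the spike-in")
    return 1 - counts.wcg_methylated / counts.wcg_total


def gch_methylation_rate(counts: SpikeInCounts) -> float:
    """In vitro M.CviPI efficiency: fraction of methylated GCH calls."""
    if counts.gch_total == 0:
        raise ValueError("no GCH coverage on the spike-in")
    return counts.gch_methylated / counts.gch_total


def passes_qc(counts: SpikeInCounts,
              min_conversion: float = 0.98,  # assumed threshold
              min_gch: float = 0.90) -> bool:  # assumed threshold
    """Return True if the cell meets both (assumed) cutoffs."""
    return (conversion_rate(counts) >= min_conversion
            and gch_methylation_rate(counts) >= min_gch)


if __name__ == "__main__":
    cell = SpikeInCounts(wcg_methylated=5, wcg_total=1000,
                         gch_methylated=950, gch_total=1000)
    print(conversion_rate(cell), gch_methylation_rate(cell), passes_qc(cell))
```

Cells for which passes_qc returns True would be the ones to list in config/batches.xls.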
Step 3: Pseudobulk NDR Detection (S02_NDR_analysis.smk)
Purpose: Combine high-quality cells into a pseudobulk and detect nucleosome-depleted regions
Components:
- Cell Aggregation: Merge cells passing QC into a pseudobulk
- Methylation Integration: Extract CpG and GpC methylation
- NDR Calling: Identify regions with high GpC methylation (M.CviPI methylates accessible GpC sites, so elevated GpC methylation indicates nucleosome depletion)
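The aggregation and calling components above can be illustrated with a small sketch. Because M.CviPI methylates accessible GpC sites, the example flags sliding windows whose pooled GCH methylation level is high. The function names, the window and step sizes, and the fixed min_level cutoff are all assumptions for illustration; a simple threshold is used here in place of whatever statistical test the pipeline's own scripts apply.

```python
from collections import defaultdict


def merge_cells(cells):
    """Sum per-site counts across cells into one pseudobulk.

    cells: iterable of dicts mapping position -> (methylated, total)
    for GCH sites on a single chromosome.
    """
    merged = defaultdict(lambda: [0, 0])
    for cell in cells:
        for pos, (meth, total) in cell.items():
            merged[pos][0] += meth
            merged[pos][1] += total
    return {pos: tuple(counts) for pos, counts in merged.items()}


def call_ndrs(sites, window=100, step=20, min_sites=3, min_level=0.6):
    """Return merged [start, end) intervals with high GCH methylation.

    All parameter defaults are assumed values, not pipeline settings.
    """
    if not sites:
        return []
    positions = sorted(sites)
    regions = []
    start, last = positions[0], positions[-1]
    while start <= last:
        end = start + window
        in_window = [p for p in positions if start <= p < end]
        meth = sum(sites[p][0] for p in in_window)
        total = sum(sites[p][1] for p in in_window)
        if len(in_window) >= min_sites and total > 0 and meth / total >= min_level:
            if regions and start <= regions[-1][1]:
                regions[-1][1] = end  # extend the previous region
            else:
                regions.append([start, end])
        start += step
    return [tuple(region) for region in regions]


if __name__ == "__main__":
    cell_a = {0: (5, 5), 10: (4, 5), 20: (5, 5), 500: (0, 5)}
    cell_b = {0: (4, 5), 10: (5, 5), 20: (4, 5), 500: (1, 5)}
    pseudobulk = merge_cells([cell_a, cell_b])
    print(call_ndrs(pseudobulk))
```

Pooling counts before calling matters because a single cell covers each GCH site with very few reads, so per-cell window levels are too noisy to threshold directly.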
# run locally
snakemake -s workflow/S02_NDR_analysis.smk
# run on a SLURM HPC cluster
bash run_pipeline.sh workflow/S02_NDR_analysis.smk
NOMe-seq: Kelly TK, Liu Y, Lay FD, Liang G, Berman BP, Jones PA. Genome-wide mapping of nucleosome positioning and DNA methylation within individual DNA molecules. Genome Res. 2012;22(12):2497-2506. doi:10.1101/gr.143008.112
scCOOL-seq: Li L, Guo F, Gao Y, et al. Single-cell multi-omics sequencing of human early embryos. Nat Cell Biol. 2018;20(7):847-858. doi:10.1038/s41556-018-0123-2
