get.coverage.stats.sh
Simon Crameri edited this page Apr 1, 2022
Compute mapping and coverage statistics for a batch of samples in parallel, and write the results to two large tables.
get.coverage.stats.sh -Q <positive integer> -d <directory> -s <file> -t <integer>
get.coverage.stats.R
samtools
bedtools
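The three names listed above appear to be what the script relies on at run time. A minimal pre-flight sketch, assuming all of them (plus GNU parallel, which is listed under References) must be found on `PATH`; `check_deps` is a hypothetical helper and not part of CaptureAl:

```shell
# Hypothetical helper: report every tool that cannot be found on PATH.
# Returns 0 if all tools are available, 1 otherwise.
check_deps() {
  local missing=0 tool
  for tool in "$@"; do
    if ! command -v "$tool" >/dev/null 2>&1; then
      echo "missing dependency: $tool" >&2
      missing=1
    fi
  done
  return "$missing"
}

# Assumed usage (tool names taken from this page):
# check_deps get.coverage.stats.R samtools bedtools parallel || exit 1
```

Checking all tools before reporting, rather than stopping at the first failure, shows the full list of what needs to be installed in one pass.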
# Required
-Q Minimum mapping quality used when running run.bwamem.sh.
# Optional [DEFAULT]
-d [pwd] Path to directory with mapping results (directory with sample subdirectories containing mapped reads).
-s [samples.txt] File with sample names (without header or '>').
-t [2] Number of samples to process in parallel.
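The option table above can be illustrated with a short `getopts` sketch. This is not the script's actual code: the function name `parse_args` and the variable names `dir`, `samples` and `threads` are assumptions; only the flags and their defaults come from this page.

```shell
# Hypothetical sketch of option handling with the documented defaults.
parse_args() {
  OPTIND=1                # reset so the function can be called repeatedly
  Q=""                    # -Q: required, no default
  dir=$(pwd)              # -d: defaults to the current directory
  samples="samples.txt"   # -s: defaults to samples.txt
  threads=2               # -t: defaults to 2 parallel samples
  while getopts "Q:d:s:t:" opt; do
    case "$opt" in
      Q) Q=$OPTARG ;;
      d) dir=$OPTARG ;;
      s) samples=$OPTARG ;;
      t) threads=$OPTARG ;;
      *) return 2 ;;      # unknown option
    esac
  done
  # -Q must be given and must be a positive integer
  case "$Q" in
    ''|*[!0-9]*)
      echo "error: -Q <positive integer> is required" >&2
      return 1 ;;
  esac
  if [ "$Q" -le 0 ]; then
    echo "error: -Q must be greater than zero" >&2
    return 1
  fi
}

# Assumed usage: parse_args "$@" || exit 1
```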
The output file names include the quality threshold ${Q}, so separate results can be generated for each threshold if multiple quality thresholds were used for mapping.
The script writes three files in each sample subdirectory, and collects the results of all samples in two combined tables in the output directory.
# in each sample subdirectory
- ${in}/${name}/${name}.bwa-mem.sorted.Q${Q}.nodup.mapstats.txt Per-sample mapping statistics, extracted from `.flagstats`.
- ${in}/${name}/${name}.bwa-mem.sorted.Q${Q}.nodup.coverage.txt Per-sample coverage statistics, produced by `get.coverage.stats.R`.
- ${in}/${name}/${name}.bwa-mem.sorted.Q${Q}.nodup.covtab.txt Tabular summary with the number of loci above coverage thresholds.
# in the output directory
- ${in}/mapping.stats.Q${Q}.txt File with all per-sample mapping statistics combined.
- ${in}/coverage.stats.Q${Q}.txt File with all per-sample coverage statistics combined.
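After a run, the file names listed above can be used to confirm that every sample produced its outputs. A minimal sketch, assuming the naming scheme is exactly as documented here and that the samples file holds one name per line; `check_outputs` is a hypothetical helper and not part of CaptureAl:

```shell
# Hypothetical helper: verify that all documented output files exist and are non-empty.
# Arguments: <directory (the page's ${in})> <samples file> <Q>
check_outputs() {
  local dir="$1" samples="$2" Q="$3"
  local missing=0 name prefix suffix f
  # Per-sample files: three per sample subdirectory
  while IFS= read -r name || [ -n "$name" ]; do
    [ -n "$name" ] || continue   # skip blank lines
    prefix="${dir}/${name}/${name}.bwa-mem.sorted.Q${Q}.nodup"
    for suffix in mapstats coverage covtab; do
      f="${prefix}.${suffix}.txt"
      [ -s "$f" ] || { echo "missing or empty: $f" >&2; missing=$((missing + 1)); }
    done
  done < "$samples"
  # Combined tables: two per quality threshold
  for f in "${dir}/mapping.stats.Q${Q}.txt" "${dir}/coverage.stats.Q${Q}.txt"; do
    [ -s "$f" ] || { echo "missing or empty: $f" >&2; missing=$((missing + 1)); }
  done
  [ "$missing" -eq 0 ]
}

# Assumed usage: check_outputs "$(pwd)" samples.txt 20 || echo "some outputs are missing"
```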
get.coverage.stats.sh -s samples.txt -Q 20 -t 20
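Because the output names include `Q${Q}`, the example above can be repeated for several thresholds without overwriting earlier results. A dry-run sketch, assuming mapping was previously run with thresholds 10, 20 and 30 (illustrative values only); `coverage_cmds` is a hypothetical helper that prints the command lines rather than executing them:

```shell
# Hypothetical dry-run helper: print one get.coverage.stats.sh call per threshold.
# Arguments: <samples file> <parallel samples> <Q> [<Q> ...]
coverage_cmds() {
  local samples="$1" threads="$2" Q
  shift 2
  for Q in "$@"; do
    printf 'get.coverage.stats.sh -s %s -Q %s -t %s\n' "$samples" "$Q" "$threads"
  done
}

# Illustrative thresholds; this only prints the commands.
coverage_cmds samples.txt 20 10 20 30
```

Each printed command would write its own `mapping.stats.Q${Q}.txt` and `coverage.stats.Q${Q}.txt`, so the results for different thresholds can sit side by side in the same directory.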
Simon Crameri (ETHZ) and Stefan Zoller (GDC)
## References
- GNU parallel
CaptureAl v0.1 Documentation