extract.readpairs.sh
Simon Crameri edited this page Apr 1, 2022 · 2 revisions
Extracts read pairs from .fastq.gz files separately for each target region, keeping a pair if at least one of its two reads mapped to that region. Uses samtools (Li et al. 2009) and UNIX mkfifo.
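The selection rule can be sketched as follows. This is a minimal illustration, not the code of extract.readpairs.sh: the function names are hypothetical, FASTQ records are assumed to span exactly four lines, and the samtools step assumes a coordinate-sorted, indexed BAM file. Because both mates of a pair share one read name, collecting the names of all reads mapped to a region and filtering both FASTQ files with that list keeps every pair in which at least one mate mapped.

```shell
# Sketch only; all function names are hypothetical.

# List the names of reads mapped to one region with MAPQ >= $3.
# $1: indexed BAM file, $2: region name, $3: minimum mapping quality.
mapped_read_names() {
  samtools view -q "$3" "$1" "$2" | cut -f1 | sort -u
}

# Print the FASTQ records whose read name is listed in the names file.
# $1: file with one read name per line, $2: uncompressed FASTQ file or FIFO.
filter_fastq_by_name() {
  awk '
    FILENAME == ARGV[1] { keep[$1] = 1; next }   # first file: read names to keep
    FNR % 4 == 1 {                               # FASTQ header line
      name = substr($1, 2)                       # drop the leading "@"
      sub(/\/[12]$/, "", name)                   # drop a trailing /1 or /2
      hit = (name in keep)
    }
    hit                                          # print all four lines of a kept record
  ' "$1" "$2"
}

# Same filter for a .fastq.gz file, streamed through a named pipe (mkfifo)
# so that no uncompressed copy is written to disk.
# The body runs in a subshell, so its variables stay local.
filter_fastq_gz_by_name() (
  dir=$(mktemp -d)
  mkfifo "$dir/reads.fifo"
  gzip -dc "$2" > "$dir/reads.fifo" &            # writer: decompress into the pipe
  filter_fastq_by_name "$1" "$dir/reads.fifo"    # reader: filter from the pipe
  wait
  rm -rf "$dir"
)
```

With these helpers, one region of one sample could be processed by writing the output of `mapped_read_names` to a names file and then calling `filter_fastq_gz_by_name` once for the forward and once for the reverse read file.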
extract.readpairs.sh -s <sample file> -l <locus file> -d <directory> -m <directory> -Q <integer> -t <integer>
extract-reads-from-fastq.pl
# Required
-s sample file
-l locus file
-d absolute path to folder with quality-filtered reads
-m absolute path to folder with mapping dirs
# Optional [DEFAULT]
-o [seq-extracted] output directory (created if it does not exist)
-Q [10] minimum mapping quality, as used for mapping using run.bwamem.sh
-b [see details] regex-path to BAM file. Use SAMPLE as wildcard.
-t [2] number of threads used
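As an illustration of how the required arguments and documented defaults fit together, here is a hedged sketch of option parsing with POSIX `getopts`. The function name `parse_args` and all variable names are hypothetical and not taken from the script; only the option letters and default values come from the lists above.

```shell
# Sketch only; parse_args and its variable names are hypothetical.
parse_args() {
  # Documented defaults for the optional arguments.
  outdir="seq-extracted"; minq=10; threads=2; bam_pattern=""
  samples=""; loci=""; readdir=""; mapdir=""
  OPTIND=1
  while getopts "s:l:d:m:o:Q:b:t:" opt; do
    case "$opt" in
      s) samples=$OPTARG ;;
      l) loci=$OPTARG ;;
      d) readdir=$OPTARG ;;
      m) mapdir=$OPTARG ;;
      o) outdir=$OPTARG ;;
      Q) minq=$OPTARG ;;
      b) bam_pattern=$OPTARG ;;
      t) threads=$OPTARG ;;
      *) return 1 ;;
    esac
  done
  # -s, -l, -d and -m are required.
  if [ -z "$samples" ] || [ -z "$loci" ] || [ -z "$readdir" ] || [ -z "$mapdir" ]; then
    echo "missing required argument (-s, -l, -d, -m)" >&2
    return 1
  fi
  # Without -b, fall back to the documented default path pattern.
  [ -n "$bam_pattern" ] || bam_pattern="${mapdir}/SAMPLE/SAMPLE.bwa-mem.sorted.Q10.nodup.bam"
}
```

Calling `parse_args "$@"` at the top of a wrapper script would leave the parsed values in the variables named above.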
It is highly recommended to run this script on local scratch storage, because it writes a large number of files.
-b The string SAMPLE can be part of the path and is replaced by the actual sample name using a regex.
DEFAULT: ${mapdir}/SAMPLE/SAMPLE.bwa-mem.sorted.Q10.nodup.bam
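The SAMPLE substitution for -b can be illustrated with a small helper. This is a sketch under assumptions: the function name `bam_path_for_sample` and the sample name `S01` are hypothetical, and sample names are assumed not to contain `|`, `&` or backslashes, which `sed` would interpret.

```shell
# Sketch only; bam_path_for_sample is a hypothetical helper.
# $1: path pattern containing the placeholder SAMPLE, $2: sample name.
bam_path_for_sample() {
  # Replace every occurrence of SAMPLE with the sample name.
  printf '%s\n' "$1" | sed "s|SAMPLE|$2|g"
}

bam_path_for_sample 'NovaSeq-run1_mapped/SAMPLE/SAMPLE.bwa-mem.sorted.Q10.nodup.bam' 'S01'
# prints NovaSeq-run1_mapped/S01/S01.bwa-mem.sorted.Q10.nodup.bam
```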
An output subdirectory is created for each sample, with extracted reads in .fastq files.
extract.readpairs.sh -s samples.txt -l loci.txt -d NovaSeq-run1_trimmed -m NovaSeq-run1_mapped \
-b NovaSeq-run1_mapped/SAMPLE/SAMPLE.bwa-mem.sorted.Q10.nodup.bam -Q 10 -t 20
Simon Crameri (ETHZ) and Stefan Zoller (GDC)
- Li, H., B. Handsaker, A. Wysoker, T. Fennell, J. Ruan, N. Homer, G. Marth, G. Abecasis, R. Durbin, and 1000 Genome Project Data Processing Subgroup. 2009. The Sequence Alignment/Map format and SAMtools. Bioinformatics 25:2078-2079.
CaptureAl v0.1 Documentation