
trim.alignment.ends.parallel.sh

Simon Crameri edited this page Apr 3, 2022 · 2 revisions

Description

Trim alignment ends based on alignment completeness and nucleotide diversity, for a batch of alignments in parallel.

Usage

trim.alignment.ends.parallel.sh -s <file> -d <directory> -c <numeric fraction> -n <numeric fraction> \
                                -t <positive integer> -v

Dependencies

# R package:
ape
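
Before running a batch, it can help to confirm that the R dependency is available. The following is a minimal sketch, not part of the script itself: the function name `check_ape` is hypothetical, and it assumes `Rscript` is the R front end on your `PATH`.

```shell
# Hypothetical helper: report whether the R package 'ape' can be loaded.
check_ape() {
  # Rscript must be available to query the R library
  if ! command -v Rscript >/dev/null 2>&1; then
    echo "Rscript not found on PATH"
    return 2
  fi
  # requireNamespace() returns FALSE if the package is not installed
  if Rscript -e 'if (!requireNamespace("ape", quietly = TRUE)) quit(status = 1)' >/dev/null 2>&1; then
    echo "ape is installed"
  else
    echo "ape is missing"
    return 1
  fi
}

check_ape || echo "install ape before running the trimming script"
```

If the package is missing, it can typically be installed from within R using `install.packages("ape")`.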

Arguments

# Required
-s            File with sample names in FIRST column. Header and tab-separation expected. Any additional columns
              are ignored.
-d            Path to directory with raw alignments. This directory should contain a FASTA file for each target
              region, each with aligned contigs of multiple samples.

# Optional [DEFAULT]
-c    [0.5]   Completeness parameter. Alignments are trimmed at both ends until an alignment site has nucleotides in
              at least the specified fraction of aligned sequences. 
              Both thresholds (-c and -n) need to be reached for trimming to stop.
-n   [0.25]   Maximum nucleotide diversity parameter. Nucleotide diversity is the sum of the number of base
              differences between sequence pairs, divided by the number of comparisons.
              Alignments are trimmed at both ends until an alignment site shows a nucleotide diversity at or below
              the specified value.
              Both thresholds (-c and -n) need to be reached for trimming to stop.
-m    ['-']   Gap character. This character is interpreted as missing data or a gap when using the -c and -n filters.
-v  [false]   Flag. If set, the alignment trimming is visualized as a PDF
              (recommended for few alignments only).
-w     [15]   Width of output PDF file.
-h      [7]   Height of output PDF file.
-t      [4]   Number of parallel threads.
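
To illustrate what the completeness threshold (-c) measures, here is a small sketch that computes per-site completeness for a toy alignment. It is not the script's own implementation (which relies on R and ape). The file `toy.fasta`, its contents, and the function `site_completeness` are hypothetical, and the sketch assumes each sequence occupies a single line.

```shell
# Toy alignment (hypothetical data): 4 sequences, 6 sites, '-' marks a gap.
cat > toy.fasta <<'EOF'
>s1
--ACGT
>s2
-AACGT
>s3
--ACG-
>s4
AAACGT
EOF

# Per-site completeness = fraction of sequences with a non-gap character.
# $1: FASTA file with one line per sequence; $2: gap character (default '-').
site_completeness() {
  LC_ALL=C awk -v gap="${2:--}" '
    /^>/ { next }                      # skip header lines
    {
      n++                              # count sequences
      len = length($0)
      for (i = 1; i <= len; i++)
        if (substr($0, i, 1) != gap) filled[i]++
    }
    END {
      for (i = 1; i <= len; i++)
        printf "%d\t%.2f\n", i, filled[i] / n
    }
  ' "$1"
}

site_completeness toy.fasta
```

For this toy input, site 1 has a completeness of 0.25 and site 2 of 0.50, so with -c 0.5 the completeness criterion alone would first be met at site 2 when trimming from the left. The script additionally requires the -n threshold to be met before trimming stops.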

Details

Value

This script creates an output directory of the form <inputdirectory>.c${c}.d${n}, where ${c} is the completeness parameter and ${n} is the nucleotide diversity parameter.
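
As a sketch of the naming pattern described above, the expected output directory name can be assembled from the input directory and the two parameters. The variable names are hypothetical, and it is an assumption that the script writes the parameter values exactly as they were passed on the command line.

```shell
# Hypothetical variables mirroring the -d, -c and -n arguments
d="mafft.63.2396"
c=0.5
n=0.25

# Strip a trailing slash from the input directory, then append the suffix
outdir="${d%/}.c${c}.d${n}"
echo "$outdir"
```

With these values, the sketch prints `mafft.63.2396.c0.5.d0.25`.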

Examples

# no visualization
trim.alignment.ends.parallel.sh -s mapfile.txt -d mafft.63.2396 -c 0.5 -n 0.25 -t 20

# with visualization
trim.alignment.ends.parallel.sh -s mapfile.txt -d mafft.63.2396 -c 0.5 -n 0.25 -t 20 -v
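
The sample file passed via -s is expected to have a header line and tab-separated columns, with sample names in the first column. A minimal sketch of such a file follows. The sample names and the second column are hypothetical; any additional columns are ignored by the script.

```shell
# Write a tab-separated sample file with a header line (hypothetical content)
printf 'sample\tgroup\n'  > mapfile.txt
printf 'sampleA\tg1\n'   >> mapfile.txt
printf 'sampleB\tg2\n'   >> mapfile.txt

# Preview the sample names: first column, header skipped
cut -f1 mapfile.txt | tail -n +2
```

The preview lists `sampleA` and `sampleB`, one per line.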
