combine.contigs.parallel.sh

Description

Combine non-overlapping contigs of the same sample and locus combination, based on exonerate alignment statistics.

Usage

combine.contigs.parallel.sh -s <file> -d <directory> -e <directory> -a <positive integer> \
                            -c <positive numeric> -t <positive integer>

Dependencies

combine.contigs.R

Arguments

# Required
-s          Path to .txt file with sample names (without header or '>').
-d          Path to directory with assembly results, containing a subdirectory for each sample.
-e          Path to directory with exonerate results, containing a subdirectory for each sample.

# Optional [DEFAULT]
-a  [80]    Minimum target alignment length (bp).
-c   [2]    Minimum normalized alignment score (i.e., raw score divided by target alignment length).
-t   [4]    Number of samples processed in parallel.
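
The two quality filters can be pictured as a simple check applied to each alignment. The sketch below is an illustration only and is not taken from combine.contigs.R: the helper name `passes_filters` is hypothetical, and it assumes that both thresholds are inclusive minimums.

```shell
# Hypothetical helper, not part of CaptureAl.
# Usage: passes_filters RAW_SCORE TARGET_ALN_LENGTH [MIN_LENGTH] [MIN_NORM_SCORE]
# Returns 0 (success) if the alignment passes both filters, non-zero otherwise.
# Assumption: both thresholds are inclusive (>=); defaults mirror -a 80 and -c 2.
passes_filters() {
    # awk handles the floating-point division that plain shell arithmetic cannot.
    awk -v score="$1" -v len="$2" -v min_len="${3:-80}" -v min_norm="${4:-2}" \
        'BEGIN { exit !(len >= min_len && score / len >= min_norm) }'
}

# A raw score of 450 over 150 bp gives a normalized score of 3.
if passes_filters 450 150; then echo "keep"; else echo "drop"; fi
```

With the default thresholds, an alignment of 150 bp with a raw score of 450 is kept, while one of the same length with a raw score of 150 (normalized score 1) would be dropped.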

Details

This is a wrapper script around combine.contigs.R, which it runs on multiple samples in parallel. For each sample and locus combination, combine.contigs.R reads three key files:

- the assembled contigs (consensus_contigs.fasta)
- the best-matching contig (SAMPLE.LOCUS.bestScore.fasta)
- the exonerate alignment statistics (SAMPLE.LOCUS.exonerate) with the alignment sugar, which is required to combine non-overlapping contigs.
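
To make the role of the alignment sugar concrete, here is a minimal sketch that derives the target alignment length and the normalized score from exonerate sugar lines. It relies on the standard exonerate sugar layout (`sugar: query_id query_start query_end query_strand target_id target_start target_end target_strand score`). The helper name `sugar_stats` is hypothetical, and which sequence CaptureAl passes to exonerate as the target is an assumption here, so treat this as an illustration and not as the logic of combine.contigs.R.

```shell
# Hypothetical helper, not part of CaptureAl.
# Reads exonerate output on stdin and prints one tab-separated line per sugar line:
# target id, target alignment length, raw score, normalized score.
sugar_stats() {
    awk '$1 == "sugar:" {
        len = $8 - $7                # target_end - target_start
        if (len < 0) len = -len      # reverse-strand hits list start > end
        if (len > 0) printf "%s\t%d\t%d\t%.2f\n", $6, len, $10, $10 / len
    }'
}

# Made-up sugar line for illustration only.
printf 'sugar: locus1 0 300 + contig1 10 310 + 900\n' | sugar_stats
```

For the made-up line above, the target alignment spans 300 bp and the raw score of 900 yields a normalized score of 3.00.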

The paths to these files should exist if the previous CaptureAl pipeline steps were carried out for all indicated samples, and are therefore set internally:

- cpath   Path to assembled contigs. 
          [DEFAULT: "${d}/SAMPLE.dipspades/extracted_reads_SAMPLE.fastq.LOCUS.ids.spades/dipspades/consensus_contigs.fasta"]
- fpath   Path to best-scoring contig.
          [DEFAULT: "${e}/SAMPLE/SAMPLE.LOCUS.bestScore.fasta"]
- epath   Path to exonerate alignment statistics; expects a file ending in `.exonerate`.
          [DEFAULT: "${e}/SAMPLE/SAMPLE.LOCUS.exonerate"]
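
Before running the script, it can help to confirm that these files exist for a given sample and locus. The following sketch expands the three default path templates listed above; the helper name `expected_paths` and the sample and locus names are hypothetical and serve only as an illustration.

```shell
# Hypothetical helper, not part of CaptureAl.
# Usage: expected_paths ASSEMBLY_DIR EXONERATE_DIR SAMPLE LOCUS
# Prints cpath, fpath and epath (one per line) following the default templates.
expected_paths() {
    d=$1; e=$2; sample=$3; locus=$4
    echo "${d}/${sample}.dipspades/extracted_reads_${sample}.fastq.${locus}.ids.spades/dipspades/consensus_contigs.fasta"
    echo "${e}/${sample}/${sample}.${locus}.bestScore.fasta"
    echo "${e}/${sample}/${sample}.${locus}.exonerate"
}

# Report any expected input file that is missing (sample and locus names are made up).
expected_paths NovaSeq-run1_assembly NovaSeq-run1_exonerate sampleA locus1 |
while IFS= read -r path; do
    [ -f "$path" ] || echo "missing: $path" >&2
done
```

Any path reported as missing suggests that an earlier pipeline step did not complete for that sample and locus.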

combine.contigs.R uses the alignment statistics, the assembled contigs, and the alignment quality filters (-a and -c) to combine non-overlapping contigs that likely represent fragments of the same locus.

Value

A FASTA file with the combined contig, which replaces the existing FASTA file with the single best-matching contig.

The original *.bestScore.fasta files are overwritten in place, and the updated files are used for downstream analyses (alignment).

Examples

combine.contigs.parallel.sh -s samples.txt -d NovaSeq-run1_assembly -e NovaSeq-run1_exonerate \
                            -a 80 -c 2 -t 20
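
In the example above, `-t 20` means that up to 20 samples are processed at the same time. How the wrapper implements this is not documented here; the sketch below shows one common pattern (`xargs -P`) under that assumption. The function name `run_parallel` is hypothetical, and the `echo` stands in for the real per-sample work.

```shell
# Hypothetical sketch, not the actual CaptureAl wrapper.
# Usage: run_parallel SAMPLE_FILE [THREADS]
run_parallel() {
    # -P sets the maximum number of concurrent jobs; -I {} runs one job per input line.
    # The echo is a stand-in for the real per-sample work.
    xargs -P "${2:-4}" -I {} sh -c 'echo "processing sample $1"' _ {} < "$1"
}

# Made-up sample file with one sample name per line (no header, no '>').
samples=$(mktemp)
printf 'sampleA\nsampleB\nsampleC\n' > "$samples"
run_parallel "$samples" 2
rm -f "$samples"
```

Because the jobs run concurrently, the order of the output lines can differ between runs.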

CaptureAl v0.1 Documentation
