FASTA-tools

This package contains Perl programs/scripts that perform frequently needed operations on FASTA format files, such as adjusting the line length to a uniform length, reverse complementing sequences, and identifying entries with identical sequences.

The executable files are located in the bin folder.

Programs:

fasta_reversecomplement

A tool to reverse complement the entries of FASTA format files.
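As an illustration of the operation itself (not the Perl code in this package), reverse complementing can be sketched in Python as below. The function name `reverse_complement` and the handling of IUPAC ambiguity codes are assumptions made for this sketch; how the actual tool treats ambiguity codes and letter case is not stated here.

```python
# Complement table for the standard bases and IUPAC ambiguity codes.
# S, W, N and gap characters are their own complement, so they are left out.
_COMPLEMENT = str.maketrans(
    "ACGTURYKMBDHVacgturykmbdhv",
    "TGCAAYRMKVHDBtgcaayrmkvhdb",
)


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a nucleotide sequence (hypothetical helper)."""
    # Complement every base first, then reverse the whole string.
    return seq.translate(_COMPLEMENT)[::-1]


if __name__ == "__main__":
    print(reverse_complement("ATGC"))
```

Characters that are not in the table are passed through unchanged, which keeps alignment gaps in place.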

fasta_length

A tool to get sequence length information for FASTA files.

fasta_grep

A tool to filter entries from a multi FASTA file based on entry names using regular expression(s). (Replaces fasta_extract and fasta_remove)
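The idea of filtering entries by name can be sketched in Python as follows. This is only an illustration under stated assumptions: `read_fasta`, `grep_entries` and the `invert` parameter are hypothetical names chosen for the sketch and do not describe the command-line options of fasta_grep.

```python
import re
from typing import Iterable, Iterator, Tuple


def read_fasta(lines: Iterable[str]) -> Iterator[Tuple[str, str]]:
    """Yield (name, sequence) pairs from FASTA-formatted lines."""
    name, chunks = None, []
    for line in lines:
        line = line.rstrip("\n")
        if line.startswith(">"):
            if name is not None:
                yield name, "".join(chunks)
            name, chunks = line[1:], []
        elif name is not None:
            chunks.append(line.strip())
    if name is not None:
        yield name, "".join(chunks)


def grep_entries(entries, pattern: str, invert: bool = False):
    """Keep entries whose name matches the pattern (or the others if invert is True)."""
    regex = re.compile(pattern)
    for name, seq in entries:
        if bool(regex.search(name)) != invert:
            yield name, seq


if __name__ == "__main__":
    fasta = ">chr1 first\nACGT\nAC\n>mito\nGG\n".splitlines()
    for name, seq in grep_entries(read_fasta(fasta), r"^chr"):
        print(">" + name)
        print(seq)
```

Keeping matching entries and dropping matching entries are the same operation with the test inverted, which is presumably why one tool can replace both fasta_extract and fasta_remove.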

fasta_sort

A tool to sort entries from a multi FASTA file based on entry names using regular expression(s).

fasta_similarity

A tool to measure sequence similarity of aligned sequences in multi FASTA format files.

fasta_variability

A tool to measure sequence variability of aligned sequences in multi FASTA format files.

fasta_sub

A tool to extract a part of the sequences from FASTA files.

fasta_shift

A tool to shift circular FASTA sequences using a reference FASTA file or a position.
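Shifting by a position amounts to rotating the sequence so that the chosen position becomes the first base. A minimal Python sketch, assuming 1-based positions (the coordinate convention of fasta_shift is not stated here, and `shift_circular` is a hypothetical name):

```python
def shift_circular(seq: str, start: int) -> str:
    """Rotate a circular sequence so that 1-based position `start` comes first."""
    if not seq:
        return seq
    # Convert to a 0-based offset and wrap around the end of the sequence.
    offset = (start - 1) % len(seq)
    return seq[offset:] + seq[:offset]


if __name__ == "__main__":
    print(shift_circular("ACGTTG", 3))
```

Shifting with a reference sequence can be thought of as first locating the reference in the circular sequence and then rotating to that position.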

fasta_find

A tool to report exact sequence matches of the entries of a reference FASTA file in a target FASTA file.

Example:

fasta_find gene.fas chr.fas

fasta_unique

A tool to remove duplicate sequences from FASTA format files and print the groups of identical entries to STDERR.

Example:

fasta_unique input.fas >unique.fas 2>unique.tab
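The grouping step behind duplicate removal can be sketched in Python as below. This is an illustration only: `unique_entries` is a hypothetical name, sequences are compared as exact strings (an assumption), and the layout of the table that fasta_unique writes to STDERR is not reproduced.

```python
def unique_entries(entries):
    """Return (kept, groups) for an iterable of (name, sequence) pairs.

    kept   -- one (name, sequence) pair per distinct sequence (first name seen)
    groups -- lists of names that share the same sequence, in first-seen order
    """
    by_sequence = {}  # sequence -> names carrying that sequence
    for name, seq in entries:
        by_sequence.setdefault(seq, []).append(name)
    kept = [(names[0], seq) for seq, names in by_sequence.items()]
    return kept, list(by_sequence.values())


if __name__ == "__main__":
    kept, groups = unique_entries([("a", "ACGT"), ("b", "GGCC"), ("c", "ACGT")])
    print(kept)
    print(groups)
```

Keeping the groups alongside the reduced file is what makes the operation reversible later.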

fasta_deunique

A tool to restore the names that were replaced during duplicate removal by fasta_unique, using the FASTA file and the names table that fasta_unique produced.

Example:

fasta_deunique -i unique.fas -tab unique.tab >deunique.fas

fasta_pretty

A tool to format FASTA files to uniform column width (60).

Synopsis

fasta_pretty [OPTIONS] [FILE]...

Options

-h | --help

Print the help message; ignore other arguments.

Input

STDIN and/or FASTA files. The extension of the files is irrelevant.

Output

The output is in FASTA format with the sequence wrapped to lines of 60 characters. The program prints to STDOUT, which can be captured in a file using the > or >> operator.

Examples

Format a single file (input.fas) and save it to a file (output.fas).

fasta_pretty input.fas >output.fas
cat input.fas | fasta_pretty >output.fas
cat input.fas | fasta_pretty - >output.fas

Format and concatenate three FASTA files from the current directory (input1.fas, input2.fas and input3.fas) and save the result to a file (output.fas).

fasta_pretty input1.fas input2.fas input3.fas >output.fas
fasta_pretty input*.fas >output.fas
cat input2.fas | fasta_pretty input1.fas - input3.fas >output.fas
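The formatting step itself, wrapping each sequence to a fixed width, can be sketched in Python as follows. The names `wrap_sequence` and `format_entry` are hypothetical and the sketch is not the package's Perl implementation; only the width of 60 is taken from the description above.

```python
def wrap_sequence(seq: str, width: int = 60) -> list:
    """Split a sequence into consecutive chunks of at most `width` characters."""
    return [seq[i:i + width] for i in range(0, len(seq), width)]


def format_entry(name: str, seq: str, width: int = 60) -> str:
    """Return one FASTA entry with the sequence wrapped to `width` columns."""
    return "\n".join([">" + name] + wrap_sequence(seq, width))


if __name__ == "__main__":
    print(format_entry("example", "ACGT" * 40))
```

Only the last line of an entry may be shorter than the chosen width.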

fasta_dealign

A tool to remove gap characters from FASTA files and format the sequences to a uniform column width (60).

Synopsis

fasta_dealign [OPTIONS] [FILE]...

Options

-h | --help

Print the help message; ignore other arguments.

Input

STDIN and/or FASTA files. The extension of the files is irrelevant.

Output

The output is in FASTA format with the sequence wrapped to lines of 60 characters. The program prints to STDOUT, which can be captured in a file using the > or >> operator.

Examples

Format a single file (input.fas) and save it to a file (output.fas).

fasta_dealign input.fas >output.fas
cat input.fas | fasta_dealign >output.fas
cat input.fas | fasta_dealign - >output.fas

Format and concatenate three FASTA files from the current directory (input1.fas, input2.fas and input3.fas) and save the result to a file (output.fas).

fasta_dealign input1.fas input2.fas input3.fas >output.fas
fasta_dealign input*.fas >output.fas
cat input2.fas | fasta_dealign input1.fas - input3.fas >output.fas
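Removing gaps from an aligned sequence can be sketched in Python as below. Which characters fasta_dealign treats as gaps is not stated here, so the sketch assumes that only `-` is a gap; `dealign` and its `gap_chars` parameter are hypothetical names.

```python
def dealign(seq: str, gap_chars: str = "-") -> str:
    """Remove every gap character from an aligned sequence."""
    # str.maketrans with a third argument builds a table that deletes those characters.
    return seq.translate(str.maketrans("", "", gap_chars))


if __name__ == "__main__":
    print(dealign("AC--GT-A"))
```

After the gaps are removed the sequences generally differ in length, so they are wrapped again to the uniform column width.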

fasta_assembly_statistics

A tool to calculate assembly statistics for FASTA files.

It calculates the following statistics:

number of contigs
total size (bp)
N50 (bp)
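L50: smallest number of contigs (taken from longest to shortest) whose summed length is at least half of the total size; the shortest of these contigs has length N50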
mean contig size (bp)
longest contig (bp)
third quartile (bp)
median (bp)
first quartile (bp)
shortest contig (bp)
number of Ns
number of gaps (/N+/): number of N-stretches in the sequences
number of other IUPACs: IUPAC bases are nucleotide ambiguity codes (YRWSKMDVHB)
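A few of the statistics in this list can be sketched in Python as follows. The sketch uses the common definition of N50 and L50 (contigs sorted from longest to shortest, accumulated until half of the total size is reached) and counts Ns case-insensitively; both are assumptions about the tool's behaviour, and `n50_l50` and `count_n_stats` are hypothetical names.

```python
import re


def n50_l50(lengths):
    """Return (N50, L50) for a list of contig lengths."""
    ordered = sorted(lengths, reverse=True)
    half = sum(ordered) / 2
    running = 0
    for count, length in enumerate(ordered, start=1):
        running += length
        if running >= half:
            # `length` is the shortest contig needed to cover half of the assembly.
            return length, count
    return 0, 0  # no contigs at all


def count_n_stats(seq: str):
    """Return (number of Ns, number of N-stretches) for one sequence."""
    upper = seq.upper()
    return upper.count("N"), len(re.findall(r"N+", upper))


if __name__ == "__main__":
    print(n50_l50([80, 70, 50, 40, 30, 20]))
    print(count_n_stats("ACNNNGTNA"))
```

In the first example the total size is 290 bp, so the two longest contigs (80 + 70 = 150 bp) are enough to cover half of it.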

fasta_display_alignment

A tool to display sequence alignments in either pairwise or a multiple alignment fashion.

Usage:
        fasta_display_alignment [-h | --help] [-w=<int> | --width=<int>] [-p | --pairwise] [FASTA file | -]

Description:
        A tool to display sequence alignments in either pairwise or a multiple alignment fashion.

Options:
        -h | --help
                Print the help message; ignore other arguments.
        -w=<int> | --width=<int>
                Set the width of the sequence that will be displayed per line to <int>.
        -p | --pairwise
                Display alignments in pairs, similar to exonerate output.
                The sequence IDs and some alignment statistics are printed at the start.
        -b | --block
                Display alignment in 'block format'. Consensus positions are shown with a '*'.
                At the start the sequence IDs are printed and at the end some alignment statistics.
                Each alignment chunk starts with the displayed range and ends with a line with position info.
                This is the default option.
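The '*' line of the block format can be illustrated with a short Python sketch. It assumes that a consensus position is a column in which all sequences carry the same character; the exact rule used by fasta_display_alignment (for example for gap columns) is not stated here, and `consensus_line` is a hypothetical name.

```python
def consensus_line(aligned):
    """Return a marker line with '*' under columns where all sequences agree."""
    if not aligned:
        return ""
    # zip(*aligned) walks through the alignment column by column.
    return "".join(
        "*" if len(set(column)) == 1 else " "
        for column in zip(*aligned)
    )


if __name__ == "__main__":
    rows = ["ACGT", "ACCT", "AC-T"]
    for row in rows:
        print(row)
    print(consensus_line(rows))
```

The sketch expects sequences of equal length, as is the case for an alignment.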
