aaKomp

Description

Assess draft genome completeness using a fast, alignment-free, k-mer hash-based approach (aaKomp). This tool uses amino acid k-mers and a multi-index Bloom filter (miBf) to estimate the completeness of genome assemblies.
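To make the k-mer idea concrete, here is a minimal sketch of amino acid k-mer lookup. It is an illustration only: a plain awk array stands in for the miBf, the `hit_fraction` helper and the peptide strings are hypothetical, and the fraction it prints is not aaKomp's completeness score. It also skips how amino acid k-mers are obtained from a genome.

```shell
# Illustration only: an awk array plays the role of the miBf, and the
# printed fraction is NOT aaKomp's actual scoring.
# Usage: hit_fraction REFERENCE_PEPTIDE QUERY_PEPTIDE K
hit_fraction() {
  awk -v ref="$1" -v query="$2" -v k="$3" 'BEGIN {
    # Index every overlapping k-mer of the reference peptide.
    for (i = 1; i + k - 1 <= length(ref); i++)
      seen[substr(ref, i, k)] = 1
    # Count how many k-mers of the query are present in that index.
    for (i = 1; i + k - 1 <= length(query); i++) {
      total++
      if (substr(query, i, k) in seen) hits++
    }
    printf "%.3f\n", total ? hits / total : 0
  }'
}

# One substitution (I -> L) breaks 5 of the 8 overlapping 9-mers.
hit_fraction MKTAYIAKQRQISFVK MKTAYIAKQRQLSFVK 9   # prints 0.375
```

A single residue change invalidates every k-mer that overlaps it, which is why a k-mer approach like this needs no alignment step to notice divergence.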

Credits

Concept: Johnathan Wong and Rene L. Warren
Design and Implementation: Johnathan Wong

Installing from Conda (Preferred method)

Under construction

Installing from Source

Clone from GitHub

git clone https://github.com/bcgsc/aakomp.git
cd aakomp
meson setup --prefix /path/to/install build
cd build
ninja install

Dependencies

Installing Dependencies with Conda

We recommend creating a fresh conda environment:

conda create --name aakomp
conda activate aakomp
conda install -c conda-forge -c bioconda --file requirements.txt

Running aaKomp

You can run aaKomp either directly or using the driver script run-aakomp.

Driver Script: `run-aakomp`

The `run-aakomp` driver automates:

- Downloading BUSCO lineages
- Building a miBf with `make_mibf`, if one is missing, from a BUSCO lineage or a provided reference
- Running `aakomp`
- Visualizing the results with `aakomp_plot.R`

Demo Example

Here are two example usages of run-aakomp. In both cases, the --db-dir flag controls where the miBf (multi-index Bloom filter) is stored and looked up.

# Option 1: Run aaKomp using a provided reference file
run-aakomp --db-dir ./ \
  --reference reference.faa \
  --input input.fa \
  -t 4 \
  -o output_ref
# Optionally add --visualise to plot the cumulative distribution function

# Option 2: Run aaKomp using a lineage name (e.g., "eukaryota")
# The lineage's HMMs will be downloaded and consensus sequences will be extracted to generate a reference
run-aakomp --db-dir ./ \
  --lineage eukaryota \
  --input input.fa \
  -t 4 \
  -o output_eukaryota

Note:
If the required miBf already exists in the specified --db-dir, it will be reused. Otherwise, run-aakomp will create one using either the provided --reference FASTA or a reference derived from the downloaded lineage.
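Because an existing miBf is reused, several assemblies can share one `--db-dir`. The sketch below prints one `run-aakomp` command per assembly using only the flags shown above. The `print_batch` helper, the `./db` directory, and the `asm1.fa`/`asm2.fa` file names are hypothetical placeholders, not part of aaKomp.

```shell
# Hypothetical helper (not part of aaKomp): print one run-aakomp command per
# assembly, all pointing at the same --db-dir so the miBf can be shared.
# Usage: print_batch LINEAGE DB_DIR GENOME...
print_batch() {
  lineage=$1
  db_dir=$2
  shift 2
  for genome in "$@"; do
    name=$(basename "$genome")
    # Output prefix: file name without its extension, plus the lineage.
    printf 'run-aakomp --db-dir %s --lineage %s --input %s -t 4 -o %s\n' \
      "$db_dir" "$lineage" "$genome" "${name%.*}_${lineage}"
  done
}

print_batch eukaryota ./db asm1.fa asm2.fa
```

Piping the output to `sh` would run the commands one after another. Under the reuse behaviour described in the note, only the first run should need to build the miBf.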

Command-line Options

run-aakomp options:

| Option | Description |
| --- | --- |
| `--help-aakomp` | Show the help message for the `aakomp` binary and exit |
| `--help-mibf` | Show the help message for the `make_mibf` binary and exit |
| `-i`, `--input` | Input genome file in FASTA format |
| `-o`, `--output` | Output prefix (default: `_`) |
| `-r`, `--reference` | Amino acid reference file (e.g., orthologous protein set) |
| `-t`, `--threads` | Number of threads to use (default: 48) |
| `-v`, `--verbose` | Enable verbose output |
| `--debug` | Enable debug mode for internal troubleshooting |
| `-H`, `--hash` | Number of hash functions used in the miBf (default: 9) |
| `-k`, `--kmer` | Amino acid k-mer size (default: 9) |
| `-l`, `--lower-bound` | Minimum occupancy threshold for valid hits (default: 0.7) |
| `--rescue-kmer` | Number of consecutive k-mers to initiate a new seed (default: 4) |
| `--max-offset` | Maximum offset allowed when extending a seed during chaining (default: 2) |
| `--lineage` | Name of the BUSCO lineage to auto-download and use as reference |
| `--db-dir` | Directory where miBf database files are stored and looked up (default: `./`) |
| `--dry-run` | Print the commands that would be executed, but don't run them |
| `--track-time` | Record and report runtime statistics for each major step |
| `--odb-version` | BUSCO ortholog database version (default: `12`) |
| `--list-lineages` | List all available BUSCO lineages and exit |
| `--visualise` | Visualise the cumulative distribution function |
| `--version` | Print the version of aaKomp |
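The default of 48 threads may exceed the cores available on a workstation, so it can help to set `--threads` explicitly. The sketch below caps the thread count at the number of online cores and spells out the other defaults from the table. The `pick_threads` helper is hypothetical, and `getconf _NPROCESSORS_ONLN` is assumed to be available (it is on Linux and macOS).

```shell
# Hypothetical helper: cap a requested thread count at the number of online
# cores. Falls back to 1 core if getconf cannot report a value.
pick_threads() {
  requested=$1
  cores=$(getconf _NPROCESSORS_ONLN 2>/dev/null || echo 1)
  echo $(( cores < requested ? cores : requested ))
}

threads=$(pick_threads 48)

# Print the command instead of running it; remove `echo` to execute.
# The k-mer, hash, and lower-bound values are the documented defaults.
echo run-aakomp --input input.fa --lineage eukaryota \
  --threads "$threads" --kmer 9 --hash 9 --lower-bound 0.7 --dry-run
```

With `echo` removed, `--dry-run` makes `run-aakomp` print the commands it would execute without running them, which is a low-risk way to check a parameter combination before a long run.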

License

Licensed under the GNU General Public License v3. See LICENSE or http://www.gnu.org/licenses/.

For commercial licensing inquiries, contact:
Patrick Rebstein – [email protected]
