filter_fastqs

A Snakemake pipeline that filters paired-end FASTQ files to retain only reads mapping to specified genomic regions.

Overview

Given aligned BAM files and a BED file defining regions of interest, the pipeline:

1. Sorts and indexes the BAMs
2. Extracts reads overlapping the target regions (via samtools view -L)
3. Filters the original FASTQ files to keep only those read IDs
4. Compresses the output FASTQs with pigz
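
To make the extraction and filtering steps above concrete, here is a minimal sketch using only the Python standard library. It is not the pipeline's actual code: the function names read_ids_from_sam and filter_fastq are hypothetical, and it assumes FASTQ headers carry the read ID as the first whitespace-separated token, optionally followed by a /1 or /2 mate suffix.

```python
from itertools import islice


def read_ids_from_sam(sam_lines):
    """Collect read names (SAM column 1), skipping '@' header lines."""
    ids = set()
    for line in sam_lines:
        if not line.strip() or line.startswith("@"):
            continue
        ids.add(line.rstrip("\n").split("\t", 1)[0])
    return ids


def filter_fastq(fastq_lines, keep_ids):
    """Yield the lines of every 4-line FASTQ record whose read ID is in keep_ids."""
    lines = iter(fastq_lines)
    while True:
        record = list(islice(lines, 4))
        if len(record) < 4:
            break  # end of input (or a truncated trailing record)
        # Assumed header shapes: "@read1/1" or "@read1 1:N:0:ACGT" -> "read1"
        name = record[0][1:].split()[0]
        if name.endswith(("/1", "/2")):
            name = name[:-2]
        if name in keep_ids:
            yield from record
```

In practice both inputs would be open file handles, for example gzip.open(path, "rt") for a .fq.gz file. Holding the IDs in a set keeps each lookup constant-time, which matters when the FASTQ contains many millions of reads.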

Dependencies

Snakemake
samtools
pigz
Python 3 (standard library only)

Configuration

Edit the top of filter_fastqs.smk to set your paths:

aligned_bams_folder = "/path/to/aligned_bams"   # STAR-style: <folder>/<sample>/Aligned.out.bam
fastq_folder        = "/path/to/fastqs"          # paired-end: <sample>_1.fq.gz, <sample>_2.fq.gz
bed_file            = "/path/to/regions.bed"     # regions to retain
output_folder       = "/path/to/output"          # created automatically

Samples are discovered automatically from the FASTQ filenames.
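
As an illustration of how sample discovery from filenames can work, the sketch below derives sample names from the <sample>_1.fq.gz / <sample>_2.fq.gz convention described above. The function discover_samples is hypothetical, and the rule that a sample is kept only when both mates are present is an assumption; the pipeline itself may use a different mechanism, such as Snakemake's glob_wildcards.

```python
import re

# Assumed naming convention: <sample>_1.fq.gz paired with <sample>_2.fq.gz
MATE1 = re.compile(r"^(?P<sample>.+)_1\.fq\.gz$")


def discover_samples(filenames):
    """Return sorted sample names that have both mate files present."""
    names = set(filenames)
    samples = []
    for name in sorted(names):
        match = MATE1.match(name)
        if match and f"{match['sample']}_2.fq.gz" in names:
            samples.append(match["sample"])
    return samples
```

A typical call would pass os.listdir(fastq_folder) as the filenames argument.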

Usage

snakemake -s filter_fastqs.smk --cores <N>

Output

<output_folder>/
  sorted_bams/         # coordinate-sorted BAMs and indexes
  filtered_bams/       # reads in target regions only (SAM format, despite the folder name)
  <sample>_1.fq.gz     # filtered paired-end FASTQs
  <sample>_2.fq.gz

Helper script

scripts/get_regions.py extracts selected gene regions from a GTF file and writes them to a BED file. Before running it, edit gtf_file, output_bed_file, and the feature filter string inside the script. Then run it directly with Python to generate the BED file the pipeline needs.
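
For orientation, a GTF-to-BED conversion of this kind can be sketched as follows. This is not the contents of scripts/get_regions.py: the function gtf_to_bed, its parameters, and the default filter values are hypothetical placeholders. The one detail worth noting is the coordinate shift: GTF intervals are 1-based and inclusive, while BED intervals are 0-based and half-open, so the start is reduced by one and the end is left unchanged.

```python
def gtf_to_bed(gtf_lines, feature="gene", attribute_filter='gene_biotype "rRNA"'):
    """Yield 3-column BED lines for GTF records matching the feature and filter.

    Both default values are illustrative assumptions, not the script's settings.
    """
    for line in gtf_lines:
        if line.startswith("#"):
            continue  # GTF comment or header line
        fields = line.rstrip("\n").split("\t")
        if len(fields) < 9:
            continue  # not a complete GTF record
        chrom, feat, start, end, attrs = (
            fields[0], fields[2], fields[3], fields[4], fields[8]
        )
        if feat != feature or attribute_filter not in attrs:
            continue
        # GTF is 1-based inclusive; BED is 0-based half-open.
        yield f"{chrom}\t{int(start) - 1}\t{end}"
```

Three columns (chromosome, start, end) are enough for a region file passed to samtools view -L.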
