PeakForge is a Python-native toolkit for differential analysis of ATAC-seq, CUT&Tag, and ChIP-seq peak data. It supports both replicate-aware comparisons and data-limited 1 vs 1 analyses, starting from BAM files or pre-called peaks and producing consensus peaks, count matrices, differential results, and standard diagnostic plots.
This directory is a GitHub-ready distribution of the current PeakForge codebase. Large BAM example files are intentionally excluded so the repository stays lightweight and uploadable. Example scripts and the Colab notebook download the public ENCODE MYC ChIP-seq BAMs together with the matched input-control BAMs on demand.
For most users, the fastest way to try PeakForge is the Colab notebook:
https://colab.research.google.com/github/cheneyyu/PeakForge/blob/main/colab/PeakForge_Colab_Quickstart.ipynb
The notebook reproduces the main user-facing workflow in a lightweight environment:
- installs PeakForge with samtools and macs3
- verifies the CLI
- downloads six ENCODE MYC ChIP-seq BAMs plus two matched input-control BAMs
- runs the standard input-aware 1 vs 1 and 2 vs 2 ChIP-seq analyses
- reproduces the overlap-based ROC comparison used in the manuscript
Colab is suitable for demos, tutorials, and small analyses. The setup cell usually takes about 3-8 min. For longer runs or larger BAM collections, use a local environment instead.
For a full local setup, clone the repository and run the installation commands together:
git clone https://github.com/cheneyyu/PeakForge.git
cd PeakForge
curl -LsSf https://astral.sh/uv/install.sh | sh # only if uv is not installed yet
sudo apt-get update -y
sudo apt-get install -y samtools
uv sync --extra macs3
uv run peakforge --help
samtools --version
macs3 --version
PeakForge requires samtools plus a working macs2 or macs3 on PATH, and it prefers macs3 when both are available. uv sync --extra macs3 installs the Python package together with a Python-managed macs3, which is the recommended path for most users.
If you already manage bioinformatics tools through Conda or a system package manager, that is also fine as long as samtools and macs2 or macs3 run successfully from the shell. Optional downstream tools such as HOMER are only needed for motif analysis outside the core pipeline.
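The PATH requirement described above can be checked before starting a long run. The following is a minimal sketch, not PeakForge's own startup check: the function name `find_required_tools` and its injectable `which` parameter are hypothetical, and only the tool names and the macs3-over-macs2 preference come from the text above.

```python
import shutil
from typing import Callable, Dict, Optional


def find_required_tools(
    which: Callable[[str], Optional[str]] = shutil.which,
) -> Dict[str, str]:
    """Locate samtools and a MACS executable on PATH (hypothetical helper)."""
    samtools = which("samtools")
    # Try macs3 first and fall back to macs2, mirroring the stated preference.
    peak_caller = which("macs3") or which("macs2")

    missing = []
    if samtools is None:
        missing.append("samtools")
    if peak_caller is None:
        missing.append("macs3 or macs2")
    if missing:
        raise RuntimeError("Missing required tools on PATH: " + ", ".join(missing))

    return {"samtools": samtools, "peak_caller": peak_caller}
```

Calling `find_required_tools()` with no arguments inspects the real PATH. Passing a custom `which` is only there to make the lookup easy to exercise without the tools installed.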
- chipdiff.py, io_utils.py, motif_ranking.py, peak_shape.py
- pyproject.toml for uv-based environment management
- example/ scripts for:
  - ENCODE 2 vs 2 replicate-aware benchmark
  - ENCODE 1 vs 1 no-replicate benchmark
  - optional 3-fold held-out validation
  - peak-shape demo
uv run peakforge --help
DRY_RUN=0 bash example/download_encode.sh
bash example/run_pipeline.sh
Outputs will be written to:
example/results/2v2/
DRY_RUN=0 bash example/download_encode.sh
bash example/run_example_1v1.sh
Outputs will be written to:
example/results_1v1/
Download the extra third replicate pair as well:
DRY_RUN=0 INCLUDE_THIRD_REPLICATES=1 bash example/download_encode.sh
bash example/run_lopo_validation.sh
Outputs will be written to:
example/results_3v3/
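The three example runs above follow the same pattern: a download step with environment overrides, a run script, and a known output directory. A minimal sketch of driving them from Python is shown here. The `EXAMPLES` table and `plan_example` helper are hypothetical; the script paths, environment variables, and output directories are copied from the commands above.

```python
import os
from typing import Dict, List, Mapping, Optional, Tuple

# name -> (download environment overrides, run script, output directory),
# all taken from the example commands in this README.
EXAMPLES = {
    "2v2": ({"DRY_RUN": "0"}, "example/run_pipeline.sh", "example/results/2v2/"),
    "1v1": ({"DRY_RUN": "0"}, "example/run_example_1v1.sh", "example/results_1v1/"),
    "3v3": (
        {"DRY_RUN": "0", "INCLUDE_THIRD_REPLICATES": "1"},
        "example/run_lopo_validation.sh",
        "example/results_3v3/",
    ),
}

Step = Tuple[List[str], Dict[str, str]]


def plan_example(
    name: str, base_env: Optional[Mapping[str, str]] = None
) -> Tuple[List[Step], str]:
    """Return the (argv, env) steps for one example plus its output directory."""
    overrides, run_script, output_dir = EXAMPLES[name]
    base = dict(os.environ if base_env is None else base_env)
    # Only the download step receives the overrides, as in the shell commands.
    download_env = {**base, **overrides}
    steps = [
        (["bash", "example/download_encode.sh"], download_env),
        (["bash", run_script], dict(base)),
    ]
    return steps, output_dir
```

Each step can then be executed from the repository root with `subprocess.run(argv, env=env, check=True)`, which stops at the first failing script.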
uv run peakforge tsvmode samples.tsv --output-dir results
uv run peakforge runmode \
--condition-a K562 \
--a-bams path/to/K562_rep1.bam path/to/K562_rep2.bam \
--a-controls path/to/K562_input.bam \
--condition-b HepG2 \
--b-bams path/to/HepG2_rep1.bam path/to/HepG2_rep2.bam \
--b-controls path/to/HepG2_input.bam \
--output-dir results
For ATAC-seq or CUT&Tag, omit the control arguments.
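When many comparisons are launched from a script, the runmode command line can be assembled programmatically. This is a minimal sketch under the assumption that the flags behave exactly as in the command above. The helper `build_runmode_argv` is hypothetical and not part of PeakForge, and it only builds the argument list without running anything. Leaving the control lists empty drops the control flags, which matches the ATAC-seq and CUT&Tag usage.

```python
from typing import List, Optional, Sequence


def build_runmode_argv(
    condition_a: str,
    a_bams: Sequence[str],
    condition_b: str,
    b_bams: Sequence[str],
    output_dir: str,
    a_controls: Optional[Sequence[str]] = None,
    b_controls: Optional[Sequence[str]] = None,
) -> List[str]:
    """Assemble a `uv run peakforge runmode` argument list (hypothetical helper)."""
    argv = ["uv", "run", "peakforge", "runmode"]
    argv += ["--condition-a", condition_a, "--a-bams", *a_bams]
    if a_controls:  # omitted for ATAC-seq or CUT&Tag
        argv += ["--a-controls", *a_controls]
    argv += ["--condition-b", condition_b, "--b-bams", *b_bams]
    if b_controls:  # omitted for ATAC-seq or CUT&Tag
        argv += ["--b-controls", *b_controls]
    argv += ["--output-dir", output_dir]
    return argv
```

The resulting list can be passed to `subprocess.run(argv, check=True)`. Using a list rather than a shell string avoids quoting problems with BAM paths that contain spaces.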
uv run peakforge makesheet \
--condition-a K562 \
--a-bams path/to/K562_rep1.bam path/to/K562_rep2.bam \
--a-controls path/to/K562_input.bam \
--condition-b HepG2 \
--b-bams path/to/HepG2_rep1.bam path/to/HepG2_rep2.bam \
--b-controls path/to/HepG2_input.bam \
--output samples.tsv
uv run peakforge peakshape --help
.
├── colab/
├── chipdiff.py
├── io_utils.py
├── motif_ranking.py
├── peak_shape.py
├── pyproject.toml
└── example/
    ├── README.md
    ├── download_encode.sh
    ├── run_pipeline.sh
    ├── run_example_1v1.sh
    ├── run_example2.sh
    ├── run_lopo_validation.sh
    └── peak_shape/
- The example BAM files are not committed here because they exceed normal GitHub-friendly repository size.
- example/download_encode.sh fetches the public ENCODE MYC ChIP-seq BAMs together with the matched K562 and HepG2 input-control BAMs required by the standard ChIP-seq examples.
- 1 vs 1 mode is supported and useful for ranking and exploratory follow-up, but replicate-supported analysis remains the primary validation setting.
PeakForge is distributed under the license provided in LICENSE.