
Adding CRAM support#179

Merged
s-andrews merged 12 commits into s-andrews:cram from muffato:cram
Mar 19, 2026

@muffato muffato commented Mar 6, 2026

Fixes #54
Supersedes #176

The legwork was done by @s-andrews and @Pranav-Garg. All I've done here is an additional merge from master to make the PR clearer, plus a few fixes to the JAR/classpath and the declared file extensions (and a couple of Java language fixes).

This fixes the bzip2 error I had flagged in #54 (comment). I've also tested all 162 CRAM files from the samtools test suite. The only failures are caused by:

  • purposefully malformed CRAM files
  • CRAM 3.1 and recent CRAM features (requires a newer version of htsjdk)

As far as I can see, supporting recent CRAM specs is as simple as dropping in a more recent htsjdk, but the last version compatible with Java 11 is v3.0.5, from 3 years ago. Upgrading to the very latest htsjdk would mean bumping the minimum Java version to 17. I don't know if that's something we can do, so I haven't committed that change.

Copilot AI review requested due to automatic review settings March 6, 2026 23:42

Copilot AI left a comment

Pull request overview

This PR adds support for CRAM files as an input format to FastQC, addressing a long-standing feature request (#54). It routes .cram files through the existing BAMFile handler (which uses htsjdk), adds a --reference option to provide a FASTA reference file needed for CRAM decoding, and updates the classpath to include the commons-compress dependency required by the newer htsjdk for CRAM support. It also includes minor Java deprecation fixes in bundled Apache Commons Math source files.
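
To illustrate the `--reference` / `-r` option described above, here is a minimal sketch of picking a reference path out of an argument list. It is not FastQC's code: the real option handling lives in the `fastqc` Perl launcher and `FastQCConfig.java`, and the class and method names below are hypothetical.

```java
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.Optional;

// Hypothetical sketch: nothing here is copied from FastQC's launcher or config.
class ReferenceOptionSketch {

    // Return the path that follows --reference or -r, if either is present.
    static Optional<Path> referenceFrom(String[] args) {
        // Stop one short of the end so args[i + 1] is always valid.
        for (int i = 0; i < args.length - 1; i++) {
            if (args[i].equals("--reference") || args[i].equals("-r")) {
                return Optional.of(Paths.get(args[i + 1]));
            }
        }
        return Optional.empty();
    }

    public static void main(String[] args) {
        String[] example = {"--reference", "genome.fa", "sample.cram"};
        System.out.println(referenceFrom(example).map(Path::toString).orElse("(none)"));
    }
}
```

Returning an `Optional` keeps the "no reference supplied" case explicit, so a caller can decide whether to fail early or let the decoder report the problem.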

Changes:

  • CRAM file format detection, routing, and output filename handling added across SequenceFactory.java, SequenceFileFilter.java, FastQCConfig.java, OfflineRunner.java, and the fastqc Perl launcher script
  • New --reference / -r option added to support providing a FASTA reference file for CRAM decoding in BAMFile.java, FastQCConfig.java, and the fastqc script
  • Classpath updated in run_fastqc.bat, fastqc, and .classpath to include commons-compress-1.26.0.jar; and Java deprecation warnings fixed in bundled MathUtils.java and ResizableDoubleArray.java
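
The extension-based routing and output naming listed above can be sketched roughly as follows. This is an illustration under assumptions, not the code in `SequenceFactory.java` or `OfflineRunner.java`: the enum, the method names, the set of extensions stripped, and the `_fastqc.html` suffix are all chosen for the example.

```java
import java.util.Locale;

// Hypothetical sketch: names and structure are assumptions, not FastQC's code.
class CramRoutingSketch {

    enum Handler { BAM_FILE, FASTQ_FILE }

    // Choose a handler from the file extension; .cram shares the BAM handler.
    static Handler handlerFor(String fileName) {
        String lower = fileName.toLowerCase(Locale.ROOT);
        if (lower.endsWith(".bam") || lower.endsWith(".cram")) {
            return Handler.BAM_FILE;
        }
        return Handler.FASTQ_FILE;
    }

    // Strip a known sequence-file extension, then append the report suffix
    // (the suffix is an assumption made for this example).
    static String reportName(String fileName) {
        String base = fileName.replaceAll("(?i)\\.(cram|bam|fastq|fq)$", "");
        return base + "_fastqc.html";
    }

    public static void main(String[] args) {
        System.out.println(handlerFor("sample.cram"));
        System.out.println(reportName("sample.cram"));
    }
}
```

Keeping the extension check case-insensitive means `sample.CRAM` is routed the same way as `sample.cram`.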

Reviewed changes

Copilot reviewed 10 out of 11 changed files in this pull request and generated 3 comments.

| File | Description |
| --- | --- |
| uk/ac/babraham/FastQC/Sequence/SequenceFactory.java | Routes .cram extension and cram/cram_mapped format strings to BAMFile |
| uk/ac/babraham/FastQC/Sequence/BAMFile.java | Passes FASTA reference to SamReaderFactory when decoding CRAM files |
| uk/ac/babraham/FastQC/FileFilters/SequenceFileFilter.java | Adds .cram to the GUI file chooser filter |
| uk/ac/babraham/FastQC/FastQCConfig.java | Adds reference field and validates cram/cram_mapped format strings |
| uk/ac/babraham/FastQC/Analysis/OfflineRunner.java | Strips .cram extension when building the output HTML filename |
| fastqc | Adds --reference CLI option, updates classpath and help text for CRAM |
| run_fastqc.bat | Adds commons-compress-1.26.0.jar to the Windows classpath |
| .classpath | Adds Eclipse classpath entry for commons-compress-1.26.0.jar |
| org/apache/commons/math3/util/ResizableDoubleArray.java | Replaces deprecated new Float().hashCode() with Float.hashCode() |
| org/apache/commons/math3/util/MathUtils.java | Replaces deprecated new Double().hashCode() with Double.hashCode() |
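
For context on the two deprecation fixes, the boxed-primitive constructors such as `new Double(double)` have been deprecated since Java 9, while the static `Double.hashCode(double)` and `Float.hashCode(float)` helpers have existed since Java 8. A small sketch of the replacement pattern follows; the class and method names are hypothetical and this is not the bundled Commons Math code.

```java
// Hypothetical sketch of the replacement pattern, not the bundled library code.
class HashCodeSketch {

    // Static helpers compute the hash without allocating a wrapper object.
    static int hashOf(double value) {
        return Double.hashCode(value);
    }

    static int hashOf(float value) {
        return Float.hashCode(value);
    }

    public static void main(String[] args) {
        // The static helper agrees with the hash of the boxed value.
        System.out.println(hashOf(1.5) == Double.valueOf(1.5).hashCode());
        System.out.println(hashOf(2.5f) == Float.valueOf(2.5f).hashCode());
    }
}
```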

Comment thread uk/ac/babraham/FastQC/Analysis/OfflineRunner.java
Comment thread uk/ac/babraham/FastQC/FileFilters/SequenceFileFilter.java
Comment thread fastqc
@s-andrews s-andrews changed the base branch from master to cram March 19, 2026 11:21
@s-andrews s-andrews merged commit baf282e into s-andrews:cram Mar 19, 2026

Development

Successfully merging this pull request may close these issues.

CRAM as input format?