add replicates, perform QC on individual replicates before merge #138

Merged
ljwharbers merged 7 commits into dev from qc_fixes on Mar 12, 2026

@robert-a-forsyth
Collaborator

PR checklist

  • This comment contains a description of changes (with reason).
  • If you've fixed a bug or added code that should be tested, add tests!
  • If you've added a new tool, have you followed the pipeline conventions in the contribution docs?
  • Make sure your code lints (nf-core pipelines lint).
  • Ensure the test suite passes (nextflow run . -profile test,docker --outdir <OUTDIR>).
  • Check for unexpected warnings in debug mode (nextflow run . -profile debug,test,docker --outdir <OUTDIR>).
  • Usage Documentation in docs/usage.md is updated.
  • Output Documentation in docs/output.md is updated.
  • CHANGELOG.md is updated.
  • README.md is updated (including new tool citations and authors/contributors).

Copilot AI review requested due to automatic review settings March 11, 2026 14:43
@github-actions
Contributor

github-actions bot commented Mar 11, 2026

nf-core pipelines lint overall result: Passed ✅ ⚠️

Posted for pipeline commit 18706bf

✅ 178 tests passed
❔ 21 tests were ignored
❗ 28 tests had warnings

❗ Test warnings:

  • pipeline_todos - TODO string in README.md: Include a figure that guides the user through the major workflow steps. Many nf-core
  • pipeline_todos - TODO string in lint_log.txt: Named file extensions MUST be emitted for ALL output channels
  • pipeline_todos - TODO string in lint_log.txt: List additional required output channels/values here
  • pipeline_todos - TODO string in lint_log.txt: Named file extensions MUST be emitted for ALL output channels
  • pipeline_todos - TODO string in lint_log.txt: List additional required output channels/values here
  • pipeline_todos - TODO string in lint_log.txt: Named file extensions MUST be emitted for ALL output channels
  • pipeline_todos - TODO string in lint_log.txt: List additional required output channels/values here
  • pipeline_todos - TODO string in lint_log.txt: Named file extensions MUST be emitted for ALL output channels
  • pipeline_todos - TODO string in lint_log.txt: List additional required output channels/values here
  • pipeline_todos - TODO string in lint_log.txt: Named file extensions MUST be emitted for ALL output channels
  • pipeline_todos - TODO string in lint_log.txt: List additional required output channels/values here
  • pipeline_todos - TODO string in lint_log.txt: Named file extensions MUST be emitted for ALL output channels
  • pipeline_todos - TODO string in lint_log.txt: List additional required output channels/values here
  • pipeline_todos - TODO string in lint_log.txt: Named file extensions MUST be emitted for ALL output channels
  • pipeline_todos - TODO string in lint_log.txt: List additional required output channels/values here
  • pipeline_todos - TODO string in lint_log.txt: Named file extensions MUST be emitted for ALL output channels
  • pipeline_todos - TODO string in lint_log.txt: List additional required output channels/values here
  • pipeline_todos - TODO string in nextflow.config: Specify your pipeline's command line flags
  • pipeline_todos - TODO string in nextflow.config: Update the field with the details of the contributors to your pipeline. New with Nextflow version 24.10.0
  • pipeline_todos - TODO string in methods_description_template.yml: #Update the HTML below to your preferred methods description, e.g. add publication citation for this pipeline
  • pipeline_todos - TODO string in nextflow.config: Specify any additional parameters here
  • pipeline_todos - TODO string in base.config: Check the defaults for all processes
  • pipeline_todos - TODO string in base.config: Customise requirements for specific processes.
  • pipeline_todos - TODO string in meta.yml: #Add a description of the module and list keywords
  • local_component_structure - tumor_only_happhase.nf in subworkflows/local should be moved to a SUBWORKFLOW_NAME/main.nf structure
  • local_component_structure - prepare_annotation.nf in subworkflows/local should be moved to a SUBWORKFLOW_NAME/main.nf structure
  • local_component_structure - tumor_normal_happhase.nf in subworkflows/local should be moved to a SUBWORKFLOW_NAME/main.nf structure
  • local_component_structure - prepare_reference_files.nf in subworkflows/local should be moved to a SUBWORKFLOW_NAME/main.nf structure

❔ Tests ignored:

  • files_exist - File is ignored: CODE_OF_CONDUCT.md
  • files_exist - File is ignored: assets/nf-core-lrsomatic_logo_light.png
  • files_exist - File is ignored: docs/images/nf-core-lrsomatic_logo_light.png
  • files_exist - File is ignored: docs/images/nf-core-lrsomatic_logo_dark.png
  • files_exist - File is ignored: .github/ISSUE_TEMPLATE/config.yml
  • files_exist - File is ignored: .github/workflows/awstest.yml
  • files_exist - File is ignored: .github/workflows/awsfulltest.yml
  • nextflow_config - Config variable ignored: manifest.name
  • nextflow_config - Config variable ignored: manifest.homePage
  • files_unchanged - File ignored due to lint config: CODE_OF_CONDUCT.md
  • files_unchanged - File ignored due to lint config: .github/CONTRIBUTING.md
  • files_unchanged - File ignored due to lint config: .github/ISSUE_TEMPLATE/bug_report.yml
  • files_unchanged - File does not exist: .github/ISSUE_TEMPLATE/config.yml
  • files_unchanged - File ignored due to lint config: .github/PULL_REQUEST_TEMPLATE.md
  • files_unchanged - File ignored due to lint config: assets/email_template.txt
  • files_unchanged - File ignored due to lint config: assets/nf-core-lrsomatic_logo_light.png
  • files_unchanged - File ignored due to lint config: docs/images/nf-core-lrsomatic_logo_light.png
  • files_unchanged - File ignored due to lint config: docs/images/nf-core-lrsomatic_logo_dark.png
  • files_unchanged - File ignored due to lint config: docs/README.md
  • actions_awstest - 'awstest.yml' workflow not found: /home/runner/work/lrsomatic/lrsomatic/.github/workflows/awstest.yml
  • schema_params - schema_params

✅ Tests passed:

Run details

  • nf-core/tools version 3.5.2
  • Run at 2026-03-12 15:41:31

Contributor

Copilot AI left a comment

Pull request overview

This PR introduces replicate awareness to the lrsomatic Nextflow pipeline, aiming to run QC on individual replicates before merging them for downstream analysis.

Changes:

  • Extend samplesheet/schema parsing to carry tumor_replicate / normal_replicate and propagate a unified meta.replicate.
  • Run CRAMINO QC earlier (intended per-replicate) and publish CRAMINO_PRE outputs into replicate-specific directories.
  • Update nf-test snapshot outputs to reflect replicate-specific QC directories and an added multi-replicate sample case.

Reviewed changes

Copilot reviewed 5 out of 5 changed files in this pull request and generated 4 comments.

| File | Description |
| --- | --- |
| workflows/lrsomatic.nf | Adds replicate to meta, moves CRAMINO_PRE earlier, attempts to strip replicate before merge/concat. |
| subworkflows/local/utils_nfcore_lrsomatic_pipeline/main.nf | Parses new replicate columns and populates meta.replicate. |
| conf/modules.config | Updates CRAMINO_PRE publishDir to include replicate in the output path. |
| assets/schema_input.json | Adds tumor_replicate and normal_replicate fields with defaults. |
| tests/default.nf.test.snap | Updates expected outputs for replicate-specific QC directories and new sample outputs. |
Comments suppressed due to low confidence (1)

workflows/lrsomatic.nf:188

  • After stripping replicate from meta, the channel is not regrouped, so replicate rows will continue through as separate items (and SAMTOOLS_CAT will never see bam.size() > 1). This can lead to duplicate downstream processing and incorrect tumor/normal pairing in later groupTuple(size: 2) logic. Regroup by the replicate-free meta (e.g., groupTuple() then flatten) before branching/concatenation so replicates are merged into a single BAM list/file per sample/type.
    ch_samplesheet
        .map{ meta, bam ->
            def new_meta = meta.subMap('id',
                            'paired_data',
                            'type',
                            'platform',
                            'sex',
                            'fiber',
                            'clair3_model',
                            'clairS_model',
                            'clairSTO_model',
                            'kinetics')
            return[new_meta, bam]
        }
        .set{ch_samplesheet_no_rep}


    // ch_samplesheet -> meta: [id, paired_data, platform, sex, type, fiber, basecall_model]
    //                   bam:  list of unaligned bams

    ch_split = ch_samplesheet_no_rep
        .branch { meta, bam ->
            single: bam.size() == 1
            multiple: bam.size() > 1
        }
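
The regrouping described in the comment above could look roughly like the following. This is a minimal sketch, not the pipeline's actual code: the toy channel contents, the file names, and the reduced meta key list (`id`, `type`) are assumptions made for illustration, and it assumes each samplesheet row carries its BAMs as a list.

```nextflow
workflow {
    // Toy input (hypothetical): two replicates of one tumor sample plus one normal row.
    ch_samplesheet = Channel.of(
        [ [id: 'sample1', type: 'tumor',  replicate: 1], ['s1_rep1.bam'] ],
        [ [id: 'sample1', type: 'tumor',  replicate: 2], ['s1_rep2.bam'] ],
        [ [id: 'sample1', type: 'normal', replicate: 1], ['s1_normal.bam'] ]
    )

    ch_samplesheet
        // Drop 'replicate' so rows from the same sample/type share an identical key.
        .map { meta, bam -> [ meta.subMap('id', 'type'), bam ] }
        // Collect rows with the same key into one item: [ meta, [ [bam], [bam] ] ].
        .groupTuple()
        // Flatten the nested lists into a single BAM list per sample/type.
        .map { meta, bams -> [ meta, bams.flatten() ] }
        .branch { meta, bam ->
            single:   bam.size() == 1
            multiple: bam.size() > 1
        }
        .set { ch_split }

    // Only samples with more than one BAM reach the 'multiple' branch.
    ch_split.multiple.view { meta, bam -> "to merge: ${meta.id} ${meta.type} ${bam}" }
}
```

Without a `size` argument, `groupTuple()` waits for the input channel to complete before emitting, which is usually acceptable for a samplesheet channel but worth keeping in mind if the grouping is moved further downstream.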

downloaded_clair3_models = PREPARE_REFERENCE_FILES.out.downloaded_clair3_models

ch_samplesheet.view()

Copilot AI Mar 11, 2026

ch_samplesheet.view() looks like a debugging statement and will spam logs / potentially expose input paths in normal runs. Consider removing it or gating behind a dedicated debug/trace parameter.

Suggested change
- ch_samplesheet.view()
+ if (params.trace_samplesheet) {
+     ch_samplesheet.view()
+ }

.mix ( ch_split.single )
.set { ch_cat_ubams }

ch_cat_ubams.view()

Copilot AI Mar 11, 2026

ch_cat_ubams.view() looks like a debugging statement and will spam logs / potentially expose input paths in normal runs. Consider removing it or gating behind a dedicated debug/trace parameter.

Suggested change
- ch_cat_ubams.view()
+ if (params.debug) {
+     ch_cat_ubams.view()
+ }
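
As an alternative to gating `.view()` behind a pipeline parameter, Nextflow's built-in `dump` operator prints a channel only when the run is started with the `-dump-channels` option. A minimal sketch, assuming a toy channel in place of the real samplesheet; the tag name `samplesheet` is hypothetical:

```nextflow
workflow {
    // Toy stand-in for the real samplesheet channel (hypothetical contents).
    ch_samplesheet = Channel.of( [ [id: 'sample1'], ['s1.bam'] ] )

    // Silent in normal runs; prints items only when dumping is enabled
    // for this tag on the command line.
    ch_samplesheet.dump(tag: 'samplesheet')
}
```

With this in place, something like `nextflow run . -profile test,docker --outdir <OUTDIR> -dump-channels samplesheet` would print the channel, while ordinary runs stay quiet and no extra entry is needed in the parameter schema.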

include { MINIMAP2_INDEX } from '../modules/nf-core/minimap2/index/main'
include { MINIMAP2_ALIGN } from '../modules/nf-core/minimap2/align/main'
include { CRAMINO as CRAMINO_PRE } from '../modules/local/cramino/main'
include { CRAMINO as CRAMINO_PRE_REPLICATES } from '../modules/local/cramino/main'

Copilot AI Mar 11, 2026

CRAMINO_PRE_REPLICATES is imported but never used. Please remove the unused include or switch the per-replicate QC invocation to use this alias (and keep CRAMINO_PRE for any non-replicate use) to avoid confusion and potential lint failures.

Suggested change
- include { CRAMINO as CRAMINO_PRE_REPLICATES } from '../modules/local/cramino/main'

.map { meta, bam_tumor, bam_normal ->
[ meta, bam_tumor.flatten(), bam_normal.flatten() ]
}
.view()

Copilot AI Mar 11, 2026

The .view() call in PIPELINE_INITIALISATION appears to be left-over debugging and will print samplesheet entries (including file paths) to stdout for every run. Consider removing it or gating it behind a debug/trace parameter.

Suggested change
- .view()

@ljwharbers merged commit fea2b09 into dev on Mar 12, 2026
6 checks passed
@ljwharbers deleted the qc_fixes branch on March 12, 2026 15:39