Sendmail issue when running on AWS batch with S3 as workDir #9424

@SZhengP

Description of the bug

In the utils_nfcore_pipeline subworkflow, the completionEmail function sends an email to users when the pipeline run completes. However, it does not work when the pipeline runs on AWS Batch with S3 as the workDir.

Specifically, at https://github.com/nf-core/modules/blob/master/subworkflows/nf-core/utils_nfcore_pipeline/main.nf#L277 , mqc_report is a path.

When running locally or on HPC, this is a local path that the head job can access.

When running on AWS Batch with S3 as the workDir (say the workDir is "s3://BUCKET/scratch"), the path looks like "/BUCKET/scratch/4dmcVVKo90LpUC/12/3d1d669c37458ddf6cd8c53ebf0520/multiqc_report.html". Note that the "s3:/" prefix is gone, since this is the virtual local path used on the worker EC2 instances. There the bucket is mounted as a local path and the file is accessible, but the head job cannot access the file through that local path.

I added some debug logging:

    def mqc_report = getSingleReport(multiqc_report)

    log.info("MultiQC report: ${mqc_report}")
    log.info("Type of MultiQC report: ${mqc_report.getClass()}")
    log.info("MultiQC report absolute path: ${mqc_report.toAbsolutePath()}")
    log.info("MultiQC report URI: ${mqc_report.toUriString()}")

This printed:

MultiQC report: /BUCKET/scratch/4dmcVVKo90LpUC/12/3d1d669c37458ddf6cd8c53ebf0520/multiqc_report.html
Type of MultiQC report: class nextflow.cloud.aws.nio.S3Path
MultiQC report absolute path: /BUCKET/scratch/4dmcVVKo90LpUC/12/3d1d669c37458ddf6cd8c53ebf0520/multiqc_report.html
MultiQC report URI: s3://BUCKET/scratch/4dmcVVKo90LpUC/12/3d1d669c37458ddf6cd8c53ebf0520/multiqc_report.html

I also checked that "s3://BUCKET/scratch/4dmcVVKo90LpUC/12/3d1d669c37458ddf6cd8c53ebf0520/multiqc_report.html" is the actual file and that it exists.

Then I did:

            mqc_report_file = new File(mqc_report.toUriString())
            log.info("MultiQC report file path: ${mqc_report_file.getAbsolutePath()}")
            log.info("MultiQC report file size: ${mqc_report_file.length()} bytes")

or

            mqc_report_file = new File(mqc_report)
            log.info("MultiQC report file path: ${mqc_report_file.getAbsolutePath()}")
            log.info("MultiQC report file size: ${mqc_report_file.length()} bytes")

Both gave the same output:

MultiQC report file path: /s3:/BUCKET/scratch/4dmcVVKo90LpUC/12/3d1d669c37458ddf6cd8c53ebf0520/multiqc_report.html
MultiQC report file size: 0 bytes

This file path does not look right: it starts with "/s3:/". I did not modify it.
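A likely explanation for the "/s3:/" prefix, sketched in plain Java because java.io.File is a JDK class and behaves the same when called from Groovy: File treats its argument as a local pathname, not a URI. On Linux or macOS the doubled slash is collapsed during normalization, and because the result does not start with "/", the path is treated as relative and resolved against the current directory. The bucket and file names below are shortened placeholders, not the real ones.

```java
import java.io.File;

class S3FileSchemeDemo {
    // Returns the pathname java.io.File keeps after normalizing its input.
    static String normalizedPath(String uriString) {
        return new File(uriString).getPath();
    }

    public static void main(String[] args) {
        // Placeholder URI, shortened from the one in the log output above.
        String uri = "s3://BUCKET/scratch/multiqc_report.html";
        File f = new File(uri);

        // On Unix-like systems "//" collapses to "/", giving "s3:/BUCKET/...".
        System.out.println("path:     " + normalizedPath(uri));
        // No leading "/", so File considers the path relative.
        System.out.println("absolute: " + f.isAbsolute());
        // Relative paths are resolved against the working directory;
        // with "/" as the working directory this becomes "/s3:/BUCKET/...".
        System.out.println("resolved: " + f.getAbsolutePath());
        // length() returns 0 when the file does not exist locally.
        System.out.println("length:   " + f.length());
    }
}
```

If this reading is right, the 0-byte size reported above simply means no such local file exists, not that the report on S3 is empty.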

The reason I call new File first is that def mqcFileObj = new File("$mqcFile") does not work in "sendmail_template.txt": there, File is the native Java File class, which only works for local paths, not S3.

In any case, the sendmail function fails silently, because in "sendmail_template.txt" mqcFile is not empty but the mqcFileObj created by def mqcFileObj = new File("$mqcFile") is empty.

Would like to hear others' thoughts about this.

The workaround that made it work on AWS Batch with S3 as the workDir is below.
I had to copy the report into a new temporary local file.

    if (mqc_report != null) {
        try {
            // Check if we're using S3 workDir and need to download file
            if (workflow.workDir.scheme == 's3' && mqc_report.toUriString().startsWith('s3://')) {
                def tempFile = new File("${workflow.launchDir}/multiqc_report_${workflow.sessionId}.html")
                tempFile.withWriter { w -> w << mqc_report.text }
                mqc_report_file = tempFile.absolutePath
            } else {
                // Local path - just pass the path string (template will create File object)
                mqc_report_file = mqc_report.toString()
            }
        } catch (Exception e) {
            log.warn("Error accessing MultiQC report: ${e.message}")
        }
    }
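The same idea can be sketched without going through the report's text, by streaming bytes from any java.nio.file.Path into a local temporary file. This is only an illustration in plain Java: it assumes the S3 path is served by an NIO filesystem provider (as the class name nextflow.cloud.aws.nio.S3Path suggests), and stageLocally is a hypothetical helper name, not part of nf-core or Nextflow. A local temp file stands in for the remote report so the example runs on its own.

```java
import java.io.IOException;
import java.io.InputStream;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardCopyOption;

class StageReportDemo {
    // Hypothetical helper: copy a report from any NIO path to a local temp file.
    static Path stageLocally(Path source) throws IOException {
        Path local = Files.createTempFile("multiqc_report_", ".html");
        // Read through the source path's own provider, write to local disk.
        try (InputStream in = Files.newInputStream(source)) {
            // createTempFile already created the target, so replace it.
            Files.copy(in, local, StandardCopyOption.REPLACE_EXISTING);
        }
        return local;
    }

    public static void main(String[] args) throws IOException {
        // Stand-in for the remote report; a real run would pass the S3 path.
        Path source = Files.createTempFile("remote_stand_in_", ".html");
        Files.write(source, "<html>report</html>".getBytes(StandardCharsets.UTF_8));

        Path local = stageLocally(source);
        // The local copy can now be opened with java.io.File in the template.
        System.out.println(local.toFile().length() + " bytes staged at " + local);
    }
}
```

Copying bytes avoids decoding and re-encoding the HTML, and a temp file avoids leaving a copy in the launch directory, though whether that fits the email template's expectations would need checking.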

System information

Nextflow version 25.04
