Description of the bug
In the utils_nfcore_pipeline subworkflow, the completionEmail function is defined to send an email to users when a pipeline run completes. However, it does not work when the pipeline runs on AWS Batch with S3 as the workDir.
Specifically, in line https://github.com/nf-core/modules/blob/master/subworkflows/nf-core/utils_nfcore_pipeline/main.nf#L277 , mqc_report is a path.
When running locally or on HPC, this is a local path that the head job can access.
When running on AWS Batch with S3 as the workDir, say "s3://BUCKET/scratch", the path looks like "/BUCKET/scratch/4dmcVVKo90LpUC/12/3d1d669c37458ddf6cd8c53ebf0520/multiqc_report.html". Note that the "s3:/" prefix is gone. As far as I can tell, this is the virtual local path used by the worker EC2 instances, where the bucket is mounted and accessible as a local path. The head job, however, cannot access the file through that local path.
I added some debug logging:
def mqc_report = getSingleReport(multiqc_report)
log.info("MultiQC report: ${mqc_report}")
log.info("Type of MultiQC report: ${mqc_report.getClass()}")
log.info("MultiQC report absolute path: ${mqc_report.toAbsolutePath()}")
log.info("MultiQC report URI: ${mqc_report.toUriString()}")
This printed:
MultiQC report: /BUCKET/scratch/4dmcVVKo90LpUC/12/3d1d669c37458ddf6cd8c53ebf0520/multiqc_report.html
Type of MultiQC report: class nextflow.cloud.aws.nio.S3Path
MultiQC report absolute path: /BUCKET/scratch/4dmcVVKo90LpUC/12/3d1d669c37458ddf6cd8c53ebf0520/multiqc_report.html
MultiQC report URI: s3://BUCKET/scratch/4dmcVVKo90LpUC/12/3d1d669c37458ddf6cd8c53ebf0520/multiqc_report.html
I also checked that "s3://BUCKET/scratch/4dmcVVKo90LpUC/12/3d1d669c37458ddf6cd8c53ebf0520/multiqc_report.html" is the actual file and that it exists.
Then I did:
mqc_report_file = new File(mqc_report.toUriString())
log.info("MultiQC report file path: ${mqc_report_file.getAbsolutePath()}")
log.info("MultiQC report file size: ${mqc_report_file.length()} bytes")
or
mqc_report_file = new File(mqc_report)
log.info("MultiQC report file path: ${mqc_report_file.getAbsolutePath()}")
log.info("MultiQC report file size: ${mqc_report_file.length()} bytes")
Both variants gave the same output:
MultiQC report file path: /s3:/BUCKET/scratch/4dmcVVKo90LpUC/12/3d1d669c37458ddf6cd8c53ebf0520/multiqc_report.html
MultiQC report file size: 0 bytes
This file path does not look right: it starts with "/s3:/". I did not modify it.
I call new File here first because def mqcFileObj = new File("$mqcFile") in "sendmail_template.txt" does not work: in the template, File is the native Java File class, which only handles local paths, not S3.
As a result, sending the email fails silently: in "sendmail_template.txt", mqcFile is not empty, but the mqcFileObj created by def mqcFileObj = new File("$mqcFile") is empty.
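The "/s3:/" path and the 0-byte size are consistent with how java.io.File treats an S3 URI string. The snippet below is a minimal sketch, assuming a Unix-like head node. It is a standalone Groovy script, not code from the subworkflow, and the shortened bucket path is a hypothetical stand-in.

```groovy
// Hypothetical, shortened S3 URI used only for illustration
def uri = 's3://BUCKET/scratch/multiqc_report.html'

// java.io.File knows nothing about URI schemes: it treats the string as a
// relative path name and, on Unix-like systems, collapses the double slash.
def f = new File(uri)
println "path:     ${f.path}"
println "absolute: ${f.absolutePath}"   // working directory + '/' + path
println "exists:   ${f.exists()}"
println "length:   ${f.length()}"       // 0 is returned for a file that does not exist

assert f.path == 's3:/BUCKET/scratch/multiqc_report.html'
assert !f.isAbsolute()
assert f.length() == 0L
```

Because getAbsolutePath() resolves a relative path against the current working directory, a result of "/s3:/BUCKET/..." would be what you get if the head job runs with "/" as its working directory. No exception is thrown at any point, which matches the silent failure.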
Would like to hear others' thoughts about this.
The workaround that made it work for me on AWS Batch with S3 as the workDir is shown below.
I had to copy the report into a new local temp file so that the template can open it.
if (mqc_report != null) {
    try {
        // Check if we're using S3 workDir and need to download file
        if (workflow.workDir.scheme == 's3' && mqc_report.toUriString().startsWith('s3://')) {
            def tempFile = new File("${workflow.launchDir}/multiqc_report_${workflow.sessionId}.html")
            tempFile.withWriter { w -> w << mqc_report.text }
            mqc_report_file = tempFile.absolutePath
        } else {
            // Local path - just pass the path string (template will create File object)
            mqc_report_file = mqc_report.toString()
        }
    } catch (Exception e) {
        log.warn("Error accessing MultiQC report: ${e.message}")
    }
}
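A possible variant of the same idea is to copy the report byte for byte with java.nio.file.Files.copy instead of reading it through mqc_report.text, which avoids holding the whole report in memory as a string. This is only a sketch: stageLocally is a hypothetical helper name, the temp file location differs from the launchDir used above, and the demo uses a local file as a stand-in for the S3 path. It assumes the S3 path can be read through the standard NIO interface, which the mqc_report.text call above suggests but which I have not verified on AWS Batch.

```groovy
import java.nio.file.Files
import java.nio.file.Path
import java.nio.file.StandardCopyOption

// Hypothetical helper: copy a report from any NIO Path (local or remote)
// into a local temp file and return a path that java.io.File can open.
String stageLocally(Path report) {
    Path local = Files.createTempFile('multiqc_report_', '.html')
    // Files.copy reads through the source path's own file system provider
    Files.copy(report, local, StandardCopyOption.REPLACE_EXISTING)
    return local.toAbsolutePath().toString()
}

// Demo with a local stand-in for the remote report
Path source = Files.createTempFile('source_', '.html')
Files.write(source, '<html>report</html>'.getBytes('UTF-8'))

String staged = stageLocally(source)
assert new File(staged).exists()
assert new File(staged).length() == Files.size(source)
```

The returned string could then be assigned to mqc_report_file in place of tempFile.absolutePath, so the template's new File("$mqcFile") sees an ordinary local file.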
Command used and terminal output
Relevant files
No response
System information
Nextflow version 25.04