
WARP Workflow "out of memory" issue on Terra #1677

@yxhan

I’d like to learn about your experience with the [WholeGenomeGermlineSingleSample v3.3.4] workflow (https://github.com/broadinstitute/warp/WholeGenomeGermlineSingleSample) on Terra. I consistently encounter “out of memory” errors at the MarkDuplicates stage.

From my understanding, the memory_multiplier parameter controls the memory allocated to this step. I have experimented with several values:

34, 68, 70, 80 → all returned out-of-memory errors.

100, 250, 300 → returned the error “Invalid value for field ‘resource.properties.machineType’”, which I believe indicates that GCP rejected the request due to excessive resource allocation.
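For reference, here is a rough sketch of the arithmetic behind these values. It rests on two assumptions that are not confirmed here and should be checked against the task's WDL and the GCP machine-type documentation: that the task requests a base of 7.5 GiB scaled by memory_multiplier, and that the machine family being provisioned tops out at 624 GiB. The names `BASE_MEMORY_GIB`, `ASSUMED_MAX_GIB`, and both functions are hypothetical.

```python
# Assumption: memory requested when memory_multiplier = 1 (check the task's WDL).
BASE_MEMORY_GIB = 7.5
# Assumption: largest memory the provisioned machine family can offer (check GCP docs).
ASSUMED_MAX_GIB = 624

# Multipliers tried so far, taken from the two lists above.
TRIED_MULTIPLIERS = [34, 68, 70, 80, 100, 250, 300]


def requested_memory_gib(multiplier, base_gib=BASE_MEMORY_GIB):
    """Memory the task would request under the assumed linear scaling."""
    return base_gib * multiplier


def summarize(multipliers, max_gib=ASSUMED_MAX_GIB):
    """Return (multiplier, requested GiB, fits under the assumed ceiling) per value."""
    rows = []
    for m in multipliers:
        mem = requested_memory_gib(m)
        rows.append((m, mem, mem <= max_gib))
    return rows


for multiplier, mem_gib, fits in summarize(TRIED_MULTIPLIERS):
    status = "within assumed ceiling" if fits else "exceeds assumed ceiling"
    print(f"memory_multiplier={multiplier}: {mem_gib:g} GiB ({status})")
```

If both assumptions hold, the values up to 80 stay below the ceiling and the values from 100 upward exceed it, which would be consistent with the split between the two error types described above.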

Since I am working with large uBAM files (400 samples, about 30 TB in total), I am unsure how best to configure these parameters to complete the workflow successfully. I have attached my current inputs.json file below for your reference.

Please advise on how to properly set the parameters (particularly memory and disk sizing) so that the workflow can run successfully on large inputs. I’d also be happy to provide any additional details that would help in troubleshooting.

I greatly appreciate any insight you can share.

inputs.json:

{
"WholeGenomeGermlineSingleSample.CollectRawWgsMetrics.read_length": "${151}",
"WholeGenomeGermlineSingleSample.UnmappedBamToAlignedBam.ApplyBQSR.gatk_docker": "${}",
"WholeGenomeGermlineSingleSample.BamToGvcf.make_bamout": "${false}",
"WholeGenomeGermlineSingleSample.fingerprint_genotypes_file": "gs://dsde-data-na12878-public/NA12878.hg38.reference.fingerprint.vcf",
"WholeGenomeGermlineSingleSample.CollectRawWgsMetrics.memory_multiplier": "${4}",
"WholeGenomeGermlineSingleSample.references": "${{
  "contamination_sites_ud": "gs://gcp-public-data--broad-references/hg38/v0/Homo_sapiens_assembly38.contam.UD",
  "contamination_sites_bed": "gs://gcp-public-data--broad-references/hg38/v0/Homo_sapiens_assembly38.contam.bed",
  "contamination_sites_mu": "gs://gcp-public-data--broad-references/hg38/v0/Homo_sapiens_assembly38.contam.mu",
  "calling_interval_list": "gs://gcp-public-data--broad-references/hg38/v0/wgs_calling_regions.hg38.interval_list",
  "reference_fasta": {
    "ref_dict": "gs://gcp-public-data--broad-references/hg38/v0/Homo_sapiens_assembly38.dict",
    "ref_fasta": "gs://gcp-public-data--broad-references/hg38/v0/Homo_sapiens_assembly38.fasta",
    "ref_fasta_index": "gs://gcp-public-data--broad-references/hg38/v0/Homo_sapiens_assembly38.fasta.fai",
    "ref_alt": "gs://gcp-public-data--broad-references/hg38/v0/Homo_sapiens_assembly38.fasta.64.alt",
    "ref_sa": "gs://gcp-public-data--broad-references/hg38/v0/Homo_sapiens_assembly38.fasta.64.sa",
    "ref_amb": "gs://gcp-public-data--broad-references/hg38/v0/Homo_sapiens_assembly38.fasta.64.amb",
    "ref_bwt": "gs://gcp-public-data--broad-references/hg38/v0/Homo_sapiens_assembly38.fasta.64.bwt",
    "ref_ann": "gs://gcp-public-data--broad-references/hg38/v0/Homo_sapiens_assembly38.fasta.64.ann",
    "ref_pac": "gs://gcp-public-data--broad-references/hg38/v0/Homo_sapiens_assembly38.fasta.64.pac"
  },
  "known_indels_sites_vcfs": [
    "gs://gcp-public-data--broad-references/hg38/v0/Mills_and_1000G_gold_standard.indels.hg38.vcf.gz",
    "gs://gcp-public-data--broad-references/hg38/v0/Homo_sapiens_assembly38.known_indels.vcf.gz"
  ],
  "known_indels_sites_indices": [
    "gs://gcp-public-data--broad-references/hg38/v0/Mills_and_1000G_gold_standard.indels.hg38.vcf.gz.tbi",
    "gs://gcp-public-data--broad-references/hg38/v0/Homo_sapiens_assembly38.known_indels.vcf.gz.tbi"
  ],
  "dbsnp_vcf": "gs://gcp-public-data--broad-references/hg38/v0/Homo_sapiens_assembly38.dbsnp138.vcf",
  "dbsnp_vcf_index": "gs://gcp-public-data--broad-references/hg38/v0/Homo_sapiens_assembly38.dbsnp138.vcf.idx",
  "evaluation_interval_list": "gs://gcp-public-data--broad-references/hg38/v0/wgs_evaluation_regions.hg38.interval_list",
  "haplotype_database_file": "gs://gcp-public-data--broad-references/hg38/v0/Homo_sapiens_assembly38.haplotype_database.txt"
}}",
"WholeGenomeGermlineSingleSample.sample_and_unmapped_bams": "${{ "sample_name": this.sample_name_id, "base_file_name": this.base_file_name, "flowcell_unmapped_bams": this.flowcell_unmapped_bams, "final_gvcf_base_name": this.final_gvcf_base_name, "unmapped_bam_suffix": ".bam" }}",
"WholeGenomeGermlineSingleSample.UnmappedBamToAlignedBam.SortSampleBam.memory_multiplier": "${34}",
"WholeGenomeGermlineSingleSample.UnmappedBamToAlignedBam.GatherBamFiles.additional_disk": "${1000}",
"WholeGenomeGermlineSingleSample.UnmappedBamToAlignedBam.MarkDuplicates.read_name_regex": "${null}",
"WholeGenomeGermlineSingleSample.cloud_provider": "gcp",
"WholeGenomeGermlineSingleSample.BamToGvcf.SortBamout.additional_disk": "${1000}",
"WholeGenomeGermlineSingleSample.CollectRawWgsMetrics.additional_disk": "${1000}",
"WholeGenomeGermlineSingleSample.UnmappedBamToAlignedBam.ApplyBQSR.memory_multiplier": "${8}",
"WholeGenomeGermlineSingleSample.UnmappedBamToAlignedBam.BaseRecalibrator.gatk_docker": "${}",
"WholeGenomeGermlineSingleSample.BamToGvcf.HaplotypeCallerGATK4.memory_multiplier": "${8}",
"WholeGenomeGermlineSingleSample.UnmappedBamToAlignedBam.ApplyBQSR.additional_disk": "${1000}",
"WholeGenomeGermlineSingleSample.BamToGvcf.make_gvcf": "${true}",
"WholeGenomeGermlineSingleSample.wgs_coverage_interval_list": "gs://gcp-public-data--broad-references/hg38/v0/wgs_coverage_regions.hg38.interval_list",
"WholeGenomeGermlineSingleSample.BamToGvcf.SortBamout.memory_multiplier": "${20}",
"WholeGenomeGermlineSingleSample.UnmappedBamToAlignedBam.MarkDuplicates.additional_disk": "${1500}",
"WholeGenomeGermlineSingleSample.papi_settings": "${{"preemptible_tries":3,"agg_preemptible_tries":3}}",
"WholeGenomeGermlineSingleSample.BamToCram.ValidateCram.memory_multiplier": "${4}",
"WholeGenomeGermlineSingleSample.scatter_settings": "${{"haplotype_scatter_count":50,"break_bands_at_multiples_of":1000000}}",
"WholeGenomeGermlineSingleSample.UnmappedBamToAlignedBam.MarkDuplicates.memory_multiplier": "${80}",
"WholeGenomeGermlineSingleSample.UnmappedBamToAlignedBam.GatherBamFiles.memory_multiplier": "${4}",
"WholeGenomeGermlineSingleSample.UnmappedBamToAlignedBam.GatherBqsrReports.gatk_docker": "${}",
"WholeGenomeGermlineSingleSample.AggregatedBamQC.CheckFingerprintTask.memory_size": "${1000}"
}
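
To make the memory and disk overrides in this file easier to review, the following is a minimal sketch that pulls them out per task. It uses a regular expression rather than a JSON parser because the text as pasted above wraps values in `${...}` with unescaped inner quotes and may not parse as strict JSON. The function name `resource_overrides` and the short `sample` string are hypothetical; the keys and values in `sample` are copied from the inputs above.

```python
import re

# Matches keys ending in a resource setting whose value is a "${<integer>}" literal.
# Both straight and typographic quotes are accepted, since pasted text may contain either.
QUOTE = '[“”"]'
RESOURCE_PATTERN = re.compile(
    QUOTE
    + r"WholeGenomeGermlineSingleSample\.([\w.]+)\."
    + r"(memory_multiplier|memory_size|additional_disk)"
    + QUOTE + r"\s*:\s*" + QUOTE + r"\$\{(\d+)\}" + QUOTE
)


def resource_overrides(inputs_text):
    """Return {task path: {setting name: integer value}} found in the inputs text."""
    overrides = {}
    for task, setting, value in RESOURCE_PATTERN.findall(inputs_text):
        overrides.setdefault(task, {})[setting] = int(value)
    return overrides


# A short excerpt of the inputs above, used only to illustrate the call.
sample = (
    '{"WholeGenomeGermlineSingleSample.UnmappedBamToAlignedBam.MarkDuplicates.additional_disk":"${1500}",'
    '"WholeGenomeGermlineSingleSample.UnmappedBamToAlignedBam.MarkDuplicates.memory_multiplier":"${80}",'
    '"WholeGenomeGermlineSingleSample.cloud_provider":"gcp"}'
)

for task, settings in sorted(resource_overrides(sample).items()):
    print(task, settings)
```

Non-numeric entries such as cloud_provider are skipped, so the result lists only the tasks whose memory or disk sizing has been changed from the defaults.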
