Provided cores differ to what's passed in the CLI #97

Hi,

First, thank you for providing this profile. It's extremely useful.

I'm having trouble launching multiple jobs at the same time. For example, I want to launch 5 jobs at the same time, each with 64 cores.

If I run `snakemake --cores 64`, I find that the jobs get launched sequentially rather than in parallel. I understand that this is because I requested a "maximum" of 64 cores, and thus if a job takes that up, I can only run one at a time.

Now, I wrote a function that is passed to the rule's `threads` directive, which multiplies `workflow.cores` by, say, `0.2`. So I can pass `snakemake --cores 320` and each rule will be allocated 64 cores. However, I am finding that this scaling is somehow applied twice ("squared"). What happens is:
- the Snakemake STDOUT (what is shown on the screen) shows the correct number of threads:
```
rule map_reads:
    input: output/mapping/H/catalogue.mmi, output/qc/merged/H_S003_R1.fq.gz, output/qc/merged/H_S003_R2.fq.gz
    output: output/mapping/bam/H/H_S003.map.bam
    log: output/logs/mapping/map_reads/H-H_S003.log
    jobid: 247
    benchmark: output/benchmarks/mapping/map_reads/H-H_S003.txt
    reason: Missing output files: output/mapping/bam/H/H_S003.map.bam
    wildcards: binning_group=H, sample=H_S003
    threads: 64
    resources: tmpdir=/tmp, mem_mb=149952


    minimap2 -t 64 > output/mapping/bam/H/H_S003.map.bam

Submitted job 247 with external jobid '35894421'.
```

That looks fine. I want this rule to be launched with 64 cores, and when I do this, 5 instances of the rule get launched at the same time.
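For illustration, a `threads` function of the kind described above might look like the following sketch. The name `fraction_of_cores`, the use of `round`, and the floor of one thread are assumptions made for this example; it is not the exact code from the workflow.

```python
# Hypothetical helper: the real function in the workflow is not shown here.
def fraction_of_cores(provided_cores, fraction=0.2):
    """Return a rounded share of the provided cores, never less than 1."""
    return max(1, round(provided_cores * fraction))


# In a Snakefile this could be wired up roughly as (assumed usage):
#   threads: lambda wildcards: fraction_of_cores(workflow.cores)

# With `snakemake --cores 320`, each rule is expected to get 64 threads.
print(fraction_of_cores(320))
```

With `--cores 320` this yields 64 threads per rule, which matches the value shown in the Snakemake STDOUT above.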

When I open the job's SLURM log, however, I find that this value of 64 is passed as the "Provided cores" to the job, and thus is multiplied again by `0.2`.

Contents of `slurm-35894421.out`:
```
Building DAG of jobs...
Using shell: /usr/bin/bash
Provided cores: 64
Rules claiming more threads will be scaled down.
Select jobs to execute...

rule map_reads:
    input: output/mapping/H/catalogue.mmi, output/qc/merged/H_S003_R1.fq.gz, output/qc/merged/H_S003_R2.fq.gz
    output: output/mapping/bam/H/H_S003.map.bam
    log: output/logs/mapping/map_reads/H-H_S003.log
    jobid: 247
    benchmark: output/benchmarks/mapping/map_reads/H-H_S003.txt
    reason: Missing output files: output/mapping/bam/H/H_S003.map.bam
    wildcards: binning_group=H, sample=H_S003
    threads: 13
    resources: tmpdir=/tmp, mem_mb=149952


    minimap2 -t 13 > output/mapping/bam/H/H_S003.map.bam
```

Even worse, my job allocates 64 cores but only uses 13 (`64 * 0.2`, rounded). It's really weird to me that the Snakemake output shows the "correct" value while the SLURM log shows the "real" value that was used. Why do they differ?
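The arithmetic behind the two numbers can be reproduced with a small simulation. This is only a sketch of what appears to happen, assuming the Snakemake process inside the SLURM job sees the submitted thread count (64) as its "Provided cores" and then evaluates the same `threads` function again; `threads_for` and `simulate` are hypothetical names.

```python
def threads_for(provided_cores, fraction=0.2):
    """Hypothetical threads function: a rounded share of the provided cores."""
    return max(1, round(provided_cores * fraction))


def simulate(cli_cores, fraction=0.2):
    """Return (threads at submission time, threads inside the cluster job)."""
    # Main Snakemake process: workflow.cores is the value given to --cores.
    submitted = threads_for(cli_cores, fraction)
    # Assumption: inside the job, "Provided cores" equals the submitted threads,
    # so the function is applied a second time (capped at the provided cores).
    in_job = min(threads_for(submitted, fraction), submitted)
    return submitted, in_job


print(simulate(320))
```

Under these assumptions, `simulate(320)` gives 64 threads at submission and 13 inside the job, the same pair of values seen in the two logs.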

I am trying to understand what I am doing wrong. When I set a breakpoint in the function that computes the number of threads, `workflow.cores` is always the value I pass on the command line (320), never the value shown in the SLURM log.

I tried adding a `nodes: 5` or a `jobs: 5` key to the profile's `config.yaml`, but it didn't help. Is there anything I can modify in the profile so that I can launch as many jobs in parallel as possible?

Please let me know what other information I can provide. Thank you very much.

Best,
V
