generated from openproblems-bio/task_template
Open
Labels
help wanted
Description
metrics/cms is a computationally demanding metric that requires parallelization.
As of now (PR #79) I am using the BiocParallel library, which lets me specify the following (line 41 of script.R):
cms <- cms(
  ...
  BPPARAM = MulticoreParam(workers = 8)
)
However, 8 is an arbitrary number of workers that I hardcoded.
I believe there could be a better strategy to dynamically set this value when running the full pipeline on the cloud.
I was thinking of something like
cores_avail <- parallel::detectCores() - 1        # leave one core free
cores_to_use <- max(1, min(trsh, cores_avail))    # trsh: maximum number of cores to use (value to be decided); never drop below 1
cms <- cms(
  ...
  BPPARAM = MulticoreParam(workers = cores_to_use)
)
However, I do not know what happens when several Nextflow workflows run at the same time and each one tries to use all available cores (except one), so we might need a maximum threshold trsh of cores to use (?).
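To make the idea concrete, here is a minimal sketch of how the worker count could be derived. The helper name pick_workers and its arguments cap, reserve and allocated are hypothetical and not part of the current script: cap plays the role of trsh, and allocated stands for a CPU count that the workflow engine might pass down to the task, which is an assumption about how the pipeline is configured.

```r
# Hypothetical helper: choose a worker count for BiocParallel.
# `cap` plays the role of `trsh`; `allocated` is an optional CPU count
# handed down by the workflow engine (an assumption, not existing code).
pick_workers <- function(cap = 8L,
                         reserve = 1L,
                         allocated = NA_integer_,
                         detected = parallel::detectCores()) {
  # If the task was told how many CPUs it owns, respect that (up to the cap).
  if (!is.na(allocated)) {
    return(max(1L, min(as.integer(allocated), as.integer(cap))))
  }
  # detectCores() can return NA on some platforms; fall back to one worker.
  if (is.na(detected)) {
    return(1L)
  }
  # Leave `reserve` cores free, apply the cap, and never go below one worker.
  usable <- as.integer(detected) - as.integer(reserve)
  max(1L, min(usable, as.integer(cap)))
}

pick_workers(cap = 8L, detected = 4L)    # 3
pick_workers(cap = 8L, detected = 32L)   # 8
pick_workers(cap = 8L, detected = 1L)    # 1
pick_workers(cap = 8L, allocated = 2L)   # 2
```

The result could then be passed as BPPARAM = MulticoreParam(workers = pick_workers(...)). Note that parallel::detectCores() reports the cores of the whole machine, not the share given to one task, which is why an explicit allocation, if one is available, is preferred in this sketch.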
Any input is appreciated!