-
Notifications
You must be signed in to change notification settings - Fork 942
5787 new module whatshap/stats #9490
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: master
Are you sure you want to change the base?
Conversation
…tput in a txt file
…modules into 5787-new-module-whatshap-stats
…modules into 5787-new-module-whatshap-stats
|
This looks overall fine. My concern is that whatshap stats has a bunch of arguments that are not clearly documented on readthedocs (https://github.com/whatshap/whatshap/blob/main/whatshap/cli/stats.py): And it looks like the current implementation uses |
…modules into 5787-new-module-whatshap-stats
Thanks for your review @nschan , I now updates my PR for the user to be able to choose which output they want to make it more flexible. |
|
Thanks, that looks very good to me. I have one additional question, probably because I have no experience with it: independent of any |
I thought about always capturing the output in the |
|
No, that is not what I meant, I only wanted to clarify if whatshap stats will always output the same format to stdout independent of the format specified. |
nschan
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think it might be a bit confusing that now the default behaviour is not producing any output, as all include_* values are defaulting to false. Since this is documented, I am okay with it, but maybe it would be better to have some default output..
fellen31
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Some suggestions! Nice job!
| tuple val(meta), path("${prefix}_whap_stats.tsv"), emit: tsv, optional: true | ||
| tuple val(meta), path("${prefix}_whap_stats.gtf"), emit: gtf, optional: true | ||
| tuple val(meta), path("${prefix}_whap_stats_block.txt"), emit: block, optional: true | ||
| tuple val(meta), path("${prefix}_whap_stats.txt"), emit: txt, optional: true |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I suggest giving the user of the module the opportunity to choose to include e.g. _whap_stats via the prefix, if they want to, rather than hardcoding it. Although we would need some distinction between the block (which is also tab-separated) and txt output (calling the txt log for example, which I'd argue that it is, especially when running with the --debug flag), if it's kept.
| tuple val(meta), path("${prefix}_whap_stats.tsv"), emit: tsv, optional: true | |
| tuple val(meta), path("${prefix}_whap_stats.gtf"), emit: gtf, optional: true | |
| tuple val(meta), path("${prefix}_whap_stats_block.txt"), emit: block, optional: true | |
| tuple val(meta), path("${prefix}_whap_stats.txt"), emit: txt, optional: true | |
| tuple val(meta), path("${prefix}.tsv"), emit: tsv, optional: true | |
| tuple val(meta), path("${prefix}.gtf"), emit: gtf, optional: true | |
| tuple val(meta), path("${prefix}.txt"), emit: block, optional: true | |
| tuple val(meta), path("${prefix}.log"), emit: log, optional: true |
| $output_gtf \\ | ||
| $output_block \\ | ||
| $vcf \\ | ||
| $output_txt \\ |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
| $output_txt \\ | |
| $output_txt |
| def output_tsv = include_tsv_output ? "--tsv ${prefix}_whap_stats.tsv" : '' | ||
| def output_gtf = include_gtf_output ? "--gtf ${prefix}_whap_stats.gtf" : '' | ||
| def output_block = inlude_block_output ? "--block-list ${prefix}_whap_stats_block.txt" : '' | ||
| def output_txt = include_txt_output ? "> ${prefix}_whap_stats.txt" : '' |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think having it like this is OK, but I would also be down to scrap the stdout output altogether. I think it's more of a log rather than an output really. No one in their right mind would choose to process that over the TSV 😅 (used by e.g. MultiQC). The printed output would still available in the stdout files.
If kept, since the tool will aways write this to stdout, regardless of whether any of the other output formats are selected, I think it would be okay to always output this log as default as well. The user doesn't have to use it if they don't want to.
| def output_tsv = include_tsv_output ? "--tsv ${prefix}_whap_stats.tsv" : '' | ||
| def output_gtf = include_gtf_output ? "--gtf ${prefix}_whap_stats.gtf" : '' | ||
| def output_block = inlude_block_output ? "--block-list ${prefix}_whap_stats_block.txt" : '' | ||
| def output_txt = include_txt_output ? "> ${prefix}_whap_stats.txt" : '' | ||
| """ | ||
| touch ${prefix}_whap_stats.tsv | ||
| touch ${prefix}_whap_stats.gtf | ||
| touch ${prefix}_whap_stats_block.txt | ||
| touch ${prefix}_whap_stats.txt |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Something like this would make the stub section mimic the script section correctly.
| def output_tsv = include_tsv_output ? "--tsv ${prefix}_whap_stats.tsv" : '' | |
| def output_gtf = include_gtf_output ? "--gtf ${prefix}_whap_stats.gtf" : '' | |
| def output_block = inlude_block_output ? "--block-list ${prefix}_whap_stats_block.txt" : '' | |
| def output_txt = include_txt_output ? "> ${prefix}_whap_stats.txt" : '' | |
| """ | |
| touch ${prefix}_whap_stats.tsv | |
| touch ${prefix}_whap_stats.gtf | |
| touch ${prefix}_whap_stats_block.txt | |
| touch ${prefix}_whap_stats.txt | |
| def tsv_touch_cmd = include_tsv_output ? "--tsv ${prefix}_whap_stats.tsv" : '' | |
| def gtf_touch_cmd = include_gtf_output ? "touch ${prefix}.tsv" : '' | |
| def block_touch_cmd = inlude_block_output ? "touch ${prefix}.txt" : '' | |
| def log_touch_cmd = include_txt_output ? "touch ${prefix}.log" : '' | |
| """ | |
| echo $args | |
| $tsv_touch_cmd | |
| $gtf_touch_cmd | |
| $block_touch_cmd | |
| $txt_touch_cmd |
Add module whatshap/stats.
PR checklist
Closes #5787
versions.ymlfile.labelnf-core modules test <MODULE> --profile dockernf-core modules test <MODULE> --profile singularitynf-core modules test <MODULE> --profile condanf-core subworkflows test <SUBWORKFLOW> --profile dockernf-core subworkflows test <SUBWORKFLOW> --profile singularitynf-core subworkflows test <SUBWORKFLOW> --profile conda