@dnil dnil commented Dec 16, 2022

This PR adds | fixes:

  • adds an option to the case load to specify which individual to load from multi-individual VCFs
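
To illustrate what selecting one individual from a multi-individual VCF involves, here is a minimal sketch in plain Python. It is not the code from this PR: the helper names `sample_index` and `genotype_fields`, and the toy header and record, are made up for illustration. It only relies on the VCF layout, where the sample columns follow the nine fixed columns.

```python
def sample_index(header_line: str, individual: str) -> int:
    """Return the 0-based position of `individual` among the sample columns."""
    columns = header_line.rstrip("\n").split("\t")
    samples = columns[9:]  # the first nine columns are fixed by the VCF format
    if individual not in samples:
        raise ValueError(f"{individual!r} not among samples {samples}")
    return samples.index(individual)


def genotype_fields(record_line: str, index: int) -> dict:
    """Map FORMAT keys to the values of the sample at position `index`."""
    columns = record_line.rstrip("\n").split("\t")
    keys = columns[8].split(":")
    values = columns[9 + index].split(":")
    return dict(zip(keys, values))


# Toy data (hypothetical), two individuals in one VCF
header = "#CHROM\tPOS\tID\tREF\tALT\tQUAL\tFILTER\tINFO\tFORMAT\tNORMAL\tTUMOR"
record = "1\t12345\t.\tA\tT\t.\tPASS\t.\tGT:AD\t0/0:30,0\t0/1:20,10"

idx = sample_index(header, "TUMOR")
fields = genotype_fields(record, idx)
print(idx, fields)
```

Looking the individual up by name rather than by position keeps the selection independent of the column order in the file.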

How to prepare for test:

  • ssh to ...
  • Install on stage:
    bash servers/resources/SERVER.scilifelab.se/update-[THIS_TOOL]-stage.sh [THIS-BRANCH-NAME]

How to test:

Expected outcome:

  • [ ]

Review:

  • Code approved by
  • Tests executed by
  • "Merge and deploy" approved by

This version is a:

  • MAJOR - when you make incompatible API changes
  • MINOR - when you add functionality in a backwards compatible manner
  • PATCH - when you make backwards compatible bug fixes or documentation/instructions

dnil commented Dec 16, 2022

Hmm, not quite. The errors are gone, but it appears only the SVs loaded OK. It could be the quality threshold, combined with the GT call often being 0/0 for somatic variants, but I will test some more...

[daniel.nilsson@hasta:~] [S_base] 5s $ loqusdb-somatic load --case-id grandmarmot --variant-file /home/proj/stage/housekeeper-bundles/grandmarmot/2022-12-05/SNV.somatic.grandmarmot.tnscope.vcf.gz --sv-variants /home/proj/stage/housekeeper-bundles/grandmarmot/2022-12-05/SV.somatic.grandmarmot.svdb.vcf.gz --check-profile /home/proj/stage/housekeeper-bundles/grandmarmot/2022-12-05/SNV.germline.tumor.dnascope.vcf.gz --gq-treshold 10 --hard-threshold 0.95 --soft-threshold 0.9 --select_individual TUMOR
2022-12-16 15:43:56 hasta.scilifelab.se loqusdb.commands.cli[165904] INFO Running loqusdb version 2.6.9
2022-12-16 15:43:56 hasta.scilifelab.se mongo_adapter.client[165904] INFO Connecting to uri:mongodb://loqusdb-stage:******@cg-mongo-stage.scilifelab.se:27030
2022-12-16 15:43:56 hasta.scilifelab.se mongo_adapter.client[165904] INFO Connection established
2022-12-16 15:43:56 hasta.scilifelab.se mongo_adapter.adapter[165904] INFO Use database loqusdb-somatic-stage
2022-12-16 15:43:56 hasta.scilifelab.se loqusdb.utils.vcf[165904] INFO Check if vcf is on correct format...
[W::hts_idx_load3] The index file is older than the data file: /home/proj/stage/housekeeper-bundles/grandmarmot/2022-12-05/SNV.somatic.grandmarmot.tnscope.vcf.gz.tbi
2022-12-16 15:43:59 hasta.scilifelab.se loqusdb.utils.vcf[165904] INFO Vcf file /home/proj/stage/housekeeper-bundles/grandmarmot/2022-12-05/SNV.somatic.grandmarmot.tnscope.vcf.gz looks fine
2022-12-16 15:43:59 hasta.scilifelab.se loqusdb.utils.vcf[165904] INFO Nr of variants in vcf: 223798
2022-12-16 15:43:59 hasta.scilifelab.se loqusdb.utils.vcf[165904] INFO Type of variants in vcf: snv
2022-12-16 15:43:59 hasta.scilifelab.se loqusdb.utils.vcf[165904] INFO Check if vcf is on correct format...
[W::vcf_parse] FILTER 'GERMLINE' is not defined in the header
2022-12-16 15:44:00 hasta.scilifelab.se loqusdb.utils.vcf[165904] INFO Vcf file /home/proj/stage/housekeeper-bundles/grandmarmot/2022-12-05/SV.somatic.grandmarmot.svdb.vcf.gz looks fine
2022-12-16 15:44:00 hasta.scilifelab.se loqusdb.utils.vcf[165904] INFO Nr of variants in vcf: 17580
2022-12-16 15:44:00 hasta.scilifelab.se loqusdb.utils.vcf[165904] INFO Type of variants in vcf: sv
[W::hts_idx_load3] The index file is older than the data file: /home/proj/stage/housekeeper-bundles/grandmarmot/2022-12-05/SNV.somatic.grandmarmot.tnscope.vcf.gz.tbi
[W::hts_idx_load3] The index file is older than the data file: /home/proj/stage/housekeeper-bundles/grandmarmot/2022-12-05/SNV.somatic.grandmarmot.tnscope.vcf.gz.tbi
Inserting variants  [####################################]  100%          2022-12-16 15:44:10 hasta.scilifelab.se loqusdb.utils.load[165904] INFO Inserted 0 variants of type snv
Inserting variants  [#-----------------------------------]    3%  00:00:44[W::vcf_parse] FILTER 'GERMLINE' is not defined in the header
Inserting variants  [####################################]  100%          2022-12-16 15:45:02 hasta.scilifelab.se loqusdb.utils.load[165904] INFO Inserted 17580 variants of type sv
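
One way to test the suspicion above is to tally the genotype calls and count how many records lack a GQ value for the selected individual. The following is a rough sketch under stated assumptions: `tally_genotypes` is a hypothetical helper, the records are made-up toy lines rather than data from this case, and the records are assumed to be uncompressed, tab-separated VCF lines.

```python
from collections import Counter


def tally_genotypes(record_lines, sample_column):
    """Count GT values and records without GQ for one sample column.

    `sample_column` is the 0-based column index in the VCF line
    (9 is the first sample, 10 the second, and so on).
    """
    gt_counts = Counter()
    missing_gq = 0
    for line in record_lines:
        columns = line.rstrip("\n").split("\t")
        fields = dict(zip(columns[8].split(":"), columns[sample_column].split(":")))
        gt_counts[fields.get("GT", "./.")] += 1
        if fields.get("GQ", ".") == ".":
            missing_gq += 1
    return gt_counts, missing_gq


# Toy records (hypothetical): sample columns are NORMAL, TUMOR
records = [
    "1\t100\t.\tA\tT\t.\tPASS\t.\tGT:AD\t0/0:30,0\t0/0:25,5",
    "1\t200\t.\tG\tC\t.\tPASS\t.\tGT:AD\t0/0:28,0\t0/1:15,12",
    "2\t300\t.\tC\tA\t.\tPASS\t.\tGT:AD:GQ\t0/0:31,0:40\t0/0:27,3:5",
]

gt_counts, missing_gq = tally_genotypes(records, sample_column=10)
print(dict(gt_counts), missing_gq)
```

If most records turn out to have no GQ, or a GQ below the threshold, a non-zero GQ cutoff would explain an SNV insert count of 0.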

dnil commented Dec 16, 2022

Yeah, it was the genotype quality:

[daniel.nilsson@hasta:~] [S_base] 5s $ loqusdb-somatic load --case-id grandmarmot --variant-file /home/proj/stage/housekeeper-bundles/grandmarmot/2022-12-05/SNV.somatic.grandmarmot.tnscope.vcf.gz --sv-variants /home/proj/stage/housekeeper-bundles/grandmarmot/2022-12-05/SV.somatic.grandmarmot.svdb.vcf.gz --check-profile /home/proj/stage/housekeepermarmot/2022-12-05/SNV.germline.tumor.dnascope.vcf.gz --gq-treshold 0 --hard-threshold 0.95 --soft-threshold 0.9 --select_individual TUMOR

2022-12-16 15:56:43 hasta.scilifelab.se loqusdb.commands.cli[175675] INFO Running loqusdb version 2.6.9
2022-12-16 15:56:43 hasta.scilifelab.se mongo_adapter.client[175675] INFO Connecting to uri:mongodb://loqusdb-stage:******@cg-mongo-stage.scilifelab.se:27030
2022-12-16 15:56:43 hasta.scilifelab.se mongo_adapter.client[175675] INFO Connection established
2022-12-16 15:56:43 hasta.scilifelab.se mongo_adapter.adapter[175675] INFO Use database loqusdb-somatic-stage
2022-12-16 15:56:43 hasta.scilifelab.se loqusdb.utils.vcf[175675] INFO Check if vcf is on correct format...
[W::hts_idx_load3] The index file is older than the data file: /home/proj/stage/housekeeper-bundles/grandmarmot/2022-12-05/SNV.somatic.grandmarmot.tnscope.vcf.gz.tbi
2022-12-16 15:56:47 hasta.scilifelab.se loqusdb.utils.vcf[175675] INFO Vcf file /home/proj/stage/housekeeper-bundles/grandmarmot/2022-12-05/SNV.somatic.grandmarmot.tnscope.vcf.gz looks fine
2022-12-16 15:56:47 hasta.scilifelab.se loqusdb.utils.vcf[175675] INFO Nr of variants in vcf: 223798
2022-12-16 15:56:47 hasta.scilifelab.se loqusdb.utils.vcf[175675] INFO Type of variants in vcf: snv
2022-12-16 15:56:47 hasta.scilifelab.se loqusdb.utils.vcf[175675] INFO Check if vcf is on correct format...
[W::vcf_parse] FILTER 'GERMLINE' is not defined in the header
2022-12-16 15:56:47 hasta.scilifelab.se loqusdb.utils.vcf[175675] INFO Vcf file /home/proj/stage/housekeeper-bundles/grandmarmot/2022-12-05/SV.somatic.grandmarmot.svdb.vcf.gz looks fine
2022-12-16 15:56:47 hasta.scilifelab.se loqusdb.utils.vcf[175675] INFO Nr of variants in vcf: 17580
2022-12-16 15:56:47 hasta.scilifelab.se loqusdb.utils.vcf[175675] INFO Type of variants in vcf: sv
[W::hts_idx_load3] The index file is older than the data file: /home/proj/stage/housekeeper-bundles/grandmarmot/2022-12-05/SNV.somatic.grandmarmot.tnscope.vcf.gz.tbi
[W::hts_idx_load3] The index file is older than the data file: /home/proj/stage/housekeeper-bundles/grandmarmot/2022-12-05/SNV.somatic.grandmarmot.tnscope.vcf.gz.tbi
Inserting variants  [####################################]  100%          2022-12-16 15:57:21 hasta.scilifelab.se loqusdb.utils.load[175675] INFO Inserted 223798 variants of type snv
Inserting variants  [#-----------------------------------]    3%  00:00:43[W::vcf_parse] FILTER 'GERMLINE' is not defined in the header
Inserting variants  [####################################]  100%          2022-12-16 15:58:13 hasta.scilifelab.se loqusdb.utils.load[175675] INFO Inserted 17580 variants of type sv
2022-12-16 15:58:13 hasta.scilifelab.se loqusdb.commands.load[175675] INFO Nr variants inserted: 241378
2022-12-16 15:58:13 hasta.scilifelab.se loqusdb.commands.load[175675] INFO Time to insert variants: 0:01:30.178186
2022-12-16 15:58:13 hasta.scilifelab.se loqusdb.plugins.mongo.adapter[175675] INFO All indexes exists
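
The two runs differ only in the GQ threshold (10 versus 0), and the SNV insert count went from 0 to 223798. That is consistent with a per-genotype quality filter rejecting every SNV record. A minimal sketch of such a filter follows. It is not loqusdb's implementation: the function `passes_gq`, the treatment of a missing GQ as failing, and the rule that a threshold of 0 disables the check are all assumptions made for illustration.

```python
from typing import Optional


def passes_gq(gq: Optional[float], gq_threshold: float) -> bool:
    """Decide whether a genotype call passes a GQ cutoff.

    Assumptions (not confirmed against loqusdb):
    - a threshold of 0 disables the filter entirely
    - a missing GQ (None) fails any non-zero threshold
    """
    if not gq_threshold:
        return True
    if gq is None:
        return False
    return gq >= gq_threshold


# Toy GQ values (hypothetical): two calls without GQ, one low, one high
gqs = [None, None, 5.0, 40.0]

kept_at_10 = sum(passes_gq(gq, 10) for gq in gqs)
kept_at_0 = sum(passes_gq(gq, 0) for gq in gqs)
print(kept_at_10, kept_at_0)
```

Under these assumptions, a caller that does not emit GQ for the selected individual loses every record at any non-zero threshold, while a threshold of 0 lets all of them through.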

@henrikstranneheim henrikstranneheim left a comment

Nice 👍
