Multi-species protein group module#960

Draft
mlocardpaulet wants to merge 61 commits into main from pg_tests

@mlocardpaulet

Contributor

Hey, this is very much a test. Not sure about my coding but I wanted to try.
It is not done, for sure.

Also, I made changes to the quantscore to make it flexible (not only for precursors). I am not sure it is the best idea.

@mlocardpaulet mlocardpaulet self-assigned this Feb 19, 2026
@mlocardpaulet mlocardpaulet marked this pull request as draft February 19, 2026 17:59
@mlocardpaulet

mlocardpaulet commented Feb 19, 2026

Contributor Author

I would very much like comments, suggestions, etc.

  • I know that the datapoints are not perfect (for example, there are still mentions of precursors where it should say protein groups).
  • We have to discuss metrics (what to plot on the y-axis).
  • We want to add specific calculations: for example, the number of unique accessions per group, and a parsimony check (the number of accessions duplicated across groups).
  • I only added DIA-NN outputs (pg_matrix).
  • I think that the contaminants are not filtered out; for this we should use the Protein.Group identifiers.
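
To make the two proposed calculations concrete, here is a minimal sketch of how they could be computed from protein group identifiers. It assumes each identifier is a semicolon-separated list of accessions (as in the Protein.Group column); the function names and the example accessions are hypothetical and not part of the ProteoBench code base.

```python
from collections import Counter


def split_accessions(protein_group, sep=";"):
    """Return the unique accessions of one protein group, keeping their order."""
    seen = []
    for acc in protein_group.split(sep):
        acc = acc.strip()
        if acc and acc not in seen:
            seen.append(acc)
    return seen


def accession_granularity(protein_groups, sep=";"):
    """Compute two illustrative metrics (hypothetical helper).

    per_group: number of unique accessions in each protein group.
    shared:    accessions that appear in more than one group, with the
               number of groups they appear in (the parsimony check).
    """
    per_group = {}
    membership = Counter()
    for pg in protein_groups:
        accs = split_accessions(pg, sep)
        per_group[pg] = len(accs)
        membership.update(accs)  # each accession counted once per group
    shared = {acc: n for acc, n in membership.items() if n > 1}
    return per_group, shared


# Made-up identifiers, for illustration only
example_groups = ["P1;P2", "P3", "P2;P4"]
per_group, shared = accession_granularity(example_groups)
```

An empty `shared` dictionary would mean that no accession is reported in more than one group.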

@mlocardpaulet

mlocardpaulet commented Feb 20, 2026

Contributor Author
  • module information is now used for some functions (in particular the type of feature considered: prec, pg, peptidoform). This is used for some webinterface functions (for example to know that the AlphaDIA input is 2 files only for precursors, not for protein groups module). The title of the main plot also adapts to this information. Not sure this is brilliant. Happy to change.

  • I added a "use_github" param to the QuantModule class (bool, optional = whether to clone the GitHub repository. Defaults to True) to test the module without having the repositories set up. Happy to remove it if this is not a good idea.

  • in-depth plots do not work

  • Full path in AlphaDIA input is not removed yet (I manually changed the headers in the file I use for testing)

  • need to implement tests

  • need to write documentation

  • need to define parameters to parse that would be of interest for protein groups

  • need to add other tools (Spectronaut)

  • need to calculate specific metrics that could inform on accession granularity (number of unique accessions per protein group? number of accessions found in several groups?)

  • Check that batch resubmission is set up for this module

  • check that y_axis_title works (y axis title depends on the module)

  • check that use_github works and does not create issues. -> I am sure that it is not consistent; there are errors. So we need to decide whether to keep it or fix it.
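
For the AlphaDIA item in the list above (full paths left in the column headers), a minimal sketch of the clean-up could look like the following. The helper name, the list of raw-file suffixes and the example headers are assumptions for illustration; this is not how the existing ProteoBench parsers do it.

```python
from pathlib import PureWindowsPath

# Assumed set of raw-file suffixes to drop; adjust to the real inputs.
RAW_SUFFIXES = {".raw", ".d", ".mzml", ".wiff"}


def strip_path(header):
    """Reduce a column header that contains a full path to the run name.

    Headers without a path separator are returned unchanged, so that
    identifier columns such as "Protein.Group" are not truncated.
    """
    if "/" not in header and "\\" not in header:
        return header
    # PureWindowsPath accepts both "/" and "\" as separators.
    path = PureWindowsPath(header)
    if path.suffix.lower() in RAW_SUFFIXES:
        return path.stem
    return path.name


# Made-up headers, for illustration only
headers = ["pg", "/data/runs/sample_A.raw", "D:\\runs\\sample_B.raw"]
cleaned = [strip_path(h) for h in headers]
```

Applying this when the file is read would avoid having to edit the headers by hand in the test file.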

@enryH

enryH commented Feb 21, 2026

Member

https://github.com/orgs/Proteobench/discussions/253

(apparently, mentions in discussions are not automatically cross-referenced, whereas mentions in issues are)

@mlocardpaulet

Contributor Author

Commit "add fallback when no datapoint" (for dev): I had issues plotting the in-depth plot without GitHub access. I added a fallback option that allows plotting of the most recently uploaded point. -> Can be removed if this is a bad idea.

@mlocardpaulet

Contributor Author

Kind of works. I wanted to have a y_axis_title that could be set in the module's toml and would define the y-axis title of the main plots. But it does not work, and I am not sure that this is a good idea.

# Conflicts:
#	proteobench/datapoint/quant_datapoint.py
#	proteobench/io/parsing/parse_settings.py
#	proteobench/modules/constants.py
#	proteobench/modules/quant/benchmarking.py
#	proteobench/modules/quant/quant_base_module.py
#	proteobench/modules/quant/quant_lfq_ion_DIA_AIF.py
#	proteobench/modules/quant/quant_lfq_ion_DIA_Astral.py
#	proteobench/modules/quant/quant_lfq_ion_DIA_ZenoTOF.py
#	proteobench/modules/quant/quant_lfq_ion_DIA_diaPASEF.py
#	proteobench/modules/quant/quant_lfq_ion_DIA_singlecell.py
#	proteobench/modules/quant/quant_lfq_peptidoform_DDA.py
#	proteobench/modules/quant/quant_lfq_peptidoform_DIA.py
#	webinterface/pages/base_pages/quant.py
#	webinterface/pages/base_pages/utils/filter.py
#	webinterface/pages/base_pages/utils/metricplot.py
@mlocardpaulet

Contributor Author

During the in-person meeting, we decided that, for a clean comparison and to know which expected ratio to match to each protein group, I will only consider the protein accessions that have no shared peptide in theory (according to the fasta file). We will only report the accuracy error for these, with the number of accessions on the y-axis.
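
A minimal sketch of how "no shared peptide in theory" could be determined from the fasta sequences is given below. The digestion rules are assumptions that the thread does not specify: trypsin-like cleavage after K or R (not before P), no missed cleavages, a minimum peptide length of 7, and no special handling of I/L. The function names and sequences are hypothetical.

```python
import re
from collections import defaultdict


def tryptic_peptides(sequence, min_length=7):
    """In-silico digest: cleave after K/R unless followed by P (assumed rules)."""
    pieces = re.split(r"(?<=[KR])(?!P)", sequence)
    return {p for p in pieces if len(p) >= min_length}


def accessions_without_shared_peptides(proteins, min_length=7):
    """Return accessions whose theoretical peptides all map to them alone.

    proteins: dict mapping accession -> amino-acid sequence (from the fasta).
    """
    owners = defaultdict(set)  # peptide -> accessions that contain it
    peptides_by_acc = {}
    for acc, seq in proteins.items():
        peps = tryptic_peptides(seq, min_length)
        peptides_by_acc[acc] = peps
        for pep in peps:
            owners[pep].add(acc)
    return sorted(
        acc
        for acc, peps in peptides_by_acc.items()
        if peps and all(len(owners[pep]) == 1 for pep in peps)
    )


# Made-up sequences: A and B share the peptide AAAAAAAK, C shares nothing
proteins = {
    "A": "AAAAAAAKCCCCCCCR",
    "B": "AAAAAAAKDDDDDDDR",
    "C": "EEEEEEEKFFFFFFFR",
}
unique_accessions = accessions_without_shared_peptides(proteins)
```

Accessions that yield no peptide above the length cutoff are excluded here; whether that is the desired behaviour would need to be decided.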

@mlocardpaulet

Contributor Author

TOFIX: default_cutoff_min_prec and default_cutoff_min_feature


2 participants