Draft
61 commits
44d014c
make quantscore flexible for precursor or protein groups or other
mlocardpaulet Feb 19, 2026
4d32279
modify PyGithub dependencies for pull request
mlocardpaulet Feb 19, 2026
d374466
generalise precursor-to-feature in all modules
mlocardpaulet Feb 19, 2026
cfdaa94
add pg module in MODULE_SETTINGS_DIRS
mlocardpaulet Feb 19, 2026
f46ec0b
add pg module in MODULE_TO_CLASS
mlocardpaulet Feb 19, 2026
22148e0
add diann input parser for pg module
mlocardpaulet Feb 19, 2026
5a1944f
add tomls for parsing
mlocardpaulet Feb 19, 2026
f55ffd6
add pg module in parse_settings_files.toml
mlocardpaulet Feb 19, 2026
0c47002
add module in webinterface
mlocardpaulet Feb 19, 2026
16e34f9
add jupyter notebook for testing
mlocardpaulet Feb 19, 2026
faeeb4a
add optional use_github=False
mlocardpaulet Feb 20, 2026
c4f3a37
align all modules to use "features" and not "precursors"
mlocardpaulet Feb 20, 2026
257bf2f
add module-specific parameter for y-axis name
mlocardpaulet Feb 20, 2026
70c5d64
amend precedent commit
mlocardpaulet Feb 20, 2026
f037ef2
Add markdown docs for web interface
mlocardpaulet Feb 20, 2026
890bec3
set up mapper to get species from pg column in diann
mlocardpaulet Feb 20, 2026
7504af9
add AlphaDIA
mlocardpaulet Feb 20, 2026
e7fe92c
black reformating
mlocardpaulet Feb 20, 2026
caca41b
Merge branch 'main' into pg_tests
mlocardpaulet Feb 20, 2026
371e858
add fallback when no datapoint - for dev
mlocardpaulet Feb 23, 2026
2e402e5
change int to str for first element of create_replicate_mapping
mlocardpaulet Feb 23, 2026
34e6b1b
Revert type annotation changes
mlocardpaulet Feb 23, 2026
69435ba
add docs in in-dev folder
mlocardpaulet Feb 23, 2026
c5d5d0c
make spectronaut compatible
mlocardpaulet Feb 24, 2026
7b9b97e
cleanup quantscores.py
mlocardpaulet Feb 24, 2026
75ac8eb
fix args order
mlocardpaulet Feb 24, 2026
98cb9f3
update notebook
mlocardpaulet Feb 24, 2026
27da96b
add protein-group specific metrics in data_point
mlocardpaulet Feb 24, 2026
98829d1
add custom format
mlocardpaulet Feb 24, 2026
066c315
add custom in parse_settings_files.toml
mlocardpaulet Feb 24, 2026
dc25350
Merge branch 'main' into pg_tests
mlocardpaulet Feb 24, 2026
6aefb6c
Merge branch 'pg_tests' of https://github.com/Proteobench/ProteoBench…
mlocardpaulet Feb 24, 2026
cca69a8
allow backward compatibility of datapoint when nr_prec instead of nr_…
mlocardpaulet Feb 25, 2026
c22e6bc
debug and clean up
mlocardpaulet Feb 25, 2026
defbdc4
fix feature name in pg module add level info in toml with yaxis title
mlocardpaulet Feb 25, 2026
f5ec59e
pass ion level in module tomls
mlocardpaulet Feb 25, 2026
de39787
fix y_axis_title in toml. Not used yet
mlocardpaulet Feb 25, 2026
e66a3be
PG module works others broken when submit new data
mlocardpaulet Feb 25, 2026
2e8f1cf
black formating
mlocardpaulet Feb 25, 2026
085292b
fix issue for DDA modules
mlocardpaulet Feb 25, 2026
54d51b2
make double upload of alphaDIA module specific
mlocardpaulet Feb 25, 2026
29c474e
Remove some debug print statements.
mlocardpaulet Feb 25, 2026
5e768eb
black formating
mlocardpaulet Feb 25, 2026
9ad12ce
add test material - does not work yet
mlocardpaulet Feb 25, 2026
0bece9b
Update test_module_quant_proteingroup_DIA_Astral.py
mlocardpaulet Feb 25, 2026
fc7ed1b
Merge branch 'main' into pg_tests
mlocardpaulet Feb 27, 2026
e2f65e7
Merge branch 'main' into pg_tests
mlocardpaulet Mar 3, 2026
f0a0d64
Merge remote-tracking branch 'origin/main' into pg_tests
mlocardpaulet Mar 4, 2026
c708224
Merge origin/main into pg_tests
mlocardpaulet Mar 26, 2026
82c1df7
fix parenthesis loss
mlocardpaulet Mar 27, 2026
f4b8fbf
update script following merge
mlocardpaulet Mar 27, 2026
e520a34
fix jupyter notebook
mlocardpaulet Mar 27, 2026
22423cb
few fixes
mlocardpaulet Mar 27, 2026
2b4adfd
Merge branch 'main' into pg_tests
mlocardpaulet Mar 30, 2026
b7bafd1
Merge branch 'main' into pg_tests
mlocardpaulet Mar 31, 2026
de8b482
update jupyter notebook
mlocardpaulet Apr 14, 2026
1105fcb
Merge remote-tracking branch 'origin/main' into pg_tests
mlocardpaulet Apr 14, 2026
88e2502
Merge branch 'main' into pg_tests
mlocardpaulet Apr 14, 2026
86f4752
add accession mapper for uniqueness
mlocardpaulet Apr 14, 2026
5c6247e
update mapping table
mlocardpaulet Apr 14, 2026
93e770a
Merge branch 'main' into pg_tests
mlocardpaulet May 26, 2026
26,666 changes: 26,666 additions & 0 deletions jupyter_notebooks/dev_tests/test_astral_pg_module_notebook.ipynb

Large diffs are not rendered by default.

34 changes: 18 additions & 16 deletions proteobench/datapoint/quant_datapoint.py
@@ -61,9 +61,10 @@ def filter_df_numquant_epsilon(
return None


def filter_df_numquant_nr_prec(row: pd.Series, min_quant: int = 3) -> int | None:
def filter_df_numquant_nr_feature(row: pd.Series, min_quant: int = 3) -> int | None:
"""
Extract the 'nr_prec' value from a row (assumed to be a dictionary).
Extract the 'nr_feature' value from a row (assumed to be a dictionary).
Falls back to 'nr_prec' for backward compatibility with legacy data.

Parameters
----------
@@ -75,12 +76,13 @@ def filter_df_numquant_nr_prec(row: pd.Series, min_quant: int = 3) -> int | None
Returns
-------
int, None
The 'nr_prec' value if found, otherwise None.
The 'nr_feature' or 'nr_prec' value if found, otherwise None.
"""
if isinstance(list(row.keys())[0], str):
min_quant = str(min_quant)
if isinstance(row, dict) and min_quant in row and isinstance(row[min_quant], dict):
return row[min_quant].get("nr_prec")
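# Try nr_feature first (new standard), then nr_prec (legacy). An explicit
# None check keeps a valid count of 0 from falling through to the legacy key.
nr_feature = row[min_quant].get("nr_feature")
return nr_feature if nr_feature is not None else row[min_quant].get("nr_prec")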
return None
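
The fallback from 'nr_feature' to the legacy 'nr_prec' key can be sketched in isolation. This is a minimal illustration, not the project's code: `get_feature_count` is a hypothetical helper name, and it assumes each per-cutoff entry is a plain mapping. It uses an explicit `None` check because a plain `a or b` would treat a legitimate count of 0 as missing.

```python
from typing import Any, Mapping, Optional


def get_feature_count(entry: Mapping[str, Any]) -> Optional[int]:
    """Return 'nr_feature' if present, else the legacy 'nr_prec' value."""
    # Hypothetical helper: prefer the new key, fall back to the legacy one.
    value = entry.get("nr_feature")
    if value is None:
        value = entry.get("nr_prec")
    return value


# New-style entry, legacy entry, and an entry with a zero count.
print(get_feature_count({"nr_feature": 1200}))
print(get_feature_count({"nr_prec": 950}))
print(get_feature_count({"nr_feature": 0, "nr_prec": 950}))
```

With this shape, datapoints stored before the rename remain readable without rewriting the stored JSON.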


@@ -284,7 +286,7 @@ class QuantDatapointHYE(DatapointBase):
mean_abs_epsilon_precision_global (float): Mean absolute precision epsilon (deviation from empirical center).
median_abs_epsilon_precision_eq_species (float): Median absolute precision epsilon for equivalently weighted species.
mean_abs_epsilon_precision_eq_species (float): Mean absolute precision epsilon for equivalently weighted species.
nr_prec (int): Number of precursors identified.
nr_feature (int): Number of features identified.
comments (str): Any additional comments.
proteobench_version (str): Version of the Proteobench tool used.
"""
@@ -315,7 +317,7 @@ class QuantDatapointHYE(DatapointBase):
mean_abs_epsilon_precision_global: float = 0
median_abs_epsilon_precision_eq_species: float = 0
mean_abs_epsilon_precision_eq_species: float = 0
nr_prec: int = 0
nr_feature: int = 0
comments: str = ""
proteobench_version: str = ""

@@ -348,7 +350,7 @@ def generate_datapoint(
The format of the input data (e.g., file format).
user_input : dict
User-defined input values for the benchmark.
default_cutoff_min_prec : int, optional
default_cutoff_min_feature : int, optional
The default minimum feature cutoff value. Defaults to 3.
max_nr_observed : int, optional
Maximum nr_observed value to calculate metrics for. If None, defaults to 6.
@@ -408,31 +410,31 @@ def generate_datapoint(
)
)
result_datapoint.results = results
result_datapoint.median_abs_epsilon_global = result_datapoint.results[default_cutoff_min_prec][
result_datapoint.median_abs_epsilon_global = result_datapoint.results[default_cutoff_min_feature][
"median_abs_epsilon_global"
]
result_datapoint.mean_abs_epsilon_global = result_datapoint.results[default_cutoff_min_prec][
result_datapoint.mean_abs_epsilon_global = result_datapoint.results[default_cutoff_min_feature][
"mean_abs_epsilon_global"
]
result_datapoint.median_abs_epsilon_eq_species = result_datapoint.results[default_cutoff_min_prec][
result_datapoint.median_abs_epsilon_eq_species = result_datapoint.results[default_cutoff_min_feature][
"median_abs_epsilon_eq_species"
]
result_datapoint.mean_abs_epsilon_eq_species = result_datapoint.results[default_cutoff_min_prec][
result_datapoint.mean_abs_epsilon_eq_species = result_datapoint.results[default_cutoff_min_feature][
"mean_abs_epsilon_eq_species"
]
result_datapoint.median_abs_epsilon_precision_global = result_datapoint.results[default_cutoff_min_prec][
result_datapoint.median_abs_epsilon_precision_global = result_datapoint.results[default_cutoff_min_feature][
"median_abs_epsilon_precision_global"
]
result_datapoint.mean_abs_epsilon_precision_global = result_datapoint.results[default_cutoff_min_prec][
result_datapoint.mean_abs_epsilon_precision_global = result_datapoint.results[default_cutoff_min_feature][
"mean_abs_epsilon_precision_global"
]
result_datapoint.median_abs_epsilon_precision_eq_species = result_datapoint.results[default_cutoff_min_prec][
result_datapoint.median_abs_epsilon_precision_eq_species = result_datapoint.results[default_cutoff_min_feature][
"median_abs_epsilon_precision_eq_species"
]
result_datapoint.mean_abs_epsilon_precision_eq_species = result_datapoint.results[default_cutoff_min_prec][
result_datapoint.mean_abs_epsilon_precision_eq_species = result_datapoint.results[default_cutoff_min_feature][
"mean_abs_epsilon_precision_eq_species"
]
result_datapoint.nr_prec = result_datapoint.results[default_cutoff_min_prec]["nr_prec"]
result_datapoint.nr_feature = result_datapoint.results[default_cutoff_min_feature]["nr_feature"]

results_series = pd.Series(dataclasses.asdict(result_datapoint))
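
The block of assignments in this hunk copies each metric computed at the default cutoff onto a top-level field of the datapoint. As a rough sketch of that pattern only, the same copy can be written as a loop. `DatapointSketch`, `SUMMARY_KEYS`, and `fill_summary` are hypothetical names, and the dataclass carries just three of the real fields.

```python
from dataclasses import dataclass, field


@dataclass
class DatapointSketch:
    """Hypothetical stand-in for QuantDatapointHYE with a few of its fields."""

    results: dict = field(default_factory=dict)
    median_abs_epsilon_global: float = 0
    mean_abs_epsilon_global: float = 0
    nr_feature: int = 0


# Metric names copied from results[cutoff] onto the top-level fields.
SUMMARY_KEYS = ("median_abs_epsilon_global", "mean_abs_epsilon_global", "nr_feature")


def fill_summary(datapoint: DatapointSketch, cutoff: int) -> DatapointSketch:
    """Copy the metrics computed at `cutoff` onto the datapoint's own fields."""
    at_cutoff = datapoint.results[cutoff]
    for key in SUMMARY_KEYS:
        setattr(datapoint, key, at_cutoff[key])
    return datapoint


dp = DatapointSketch(
    results={3: {"median_abs_epsilon_global": 0.2, "mean_abs_epsilon_global": 0.3, "nr_feature": 1200}}
)
print(fill_summary(dp, cutoff=3).nr_feature)
```

A loop like this keeps a rename such as `nr_prec` to `nr_feature` down to a single entry in the key list, although the explicit assignments in the diff are easier to search for.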

48 changes: 45 additions & 3 deletions proteobench/github/gh.py
@@ -9,7 +9,15 @@

import pandas as pd
from git import Repo, exc
from github import Github

# Make GitHub functionality optional
try:
from github import Github

GITHUB_AVAILABLE = True
except ImportError:
Github = None
GITHUB_AVAILABLE = False

logger = logging.getLogger(__name__)
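
For comparison, availability of an optional dependency can also be probed without importing it. The sketch below is an alternative to the try/except import in this hunk, not what the PR does: it uses the standard-library `importlib.util.find_spec`, and `is_module_available` is a hypothetical helper name.

```python
import importlib.util


def is_module_available(name: str) -> bool:
    """Return True if a top-level module can be found, without importing it."""
    return importlib.util.find_spec(name) is not None


# PyGithub installs the top-level package "github".
GITHUB_AVAILABLE = is_module_available("github")
print(GITHUB_AVAILABLE)
```

The try/except form used in the diff has the advantage of binding `Github` (or `None`) in the same step, so later code needs no second import.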

@@ -38,6 +46,11 @@ class GithubProteobotRepo:
A class to interact with GitHub repositories related to Proteobot and Proteobench,
allowing cloning, committing, and creating pull requests.

Note
----
Pull request functionality requires PyGithub to be installed.
Repository cloning and local Git operations work without PyGithub.

Parameters
----------
token : str | None, optional
@@ -91,6 +104,18 @@ def __init__(
self.branch = branch
self.repo = None

@staticmethod
def is_github_available() -> bool:
"""
Check if PyGithub is available for GitHub API operations.

Returns
-------
bool
True if PyGithub is available, False otherwise.
"""
return GITHUB_AVAILABLE

def get_remote_url_anon(self) -> str:
"""
Return the remote URL of the repository to be cloned anonymously (public access).
@@ -245,9 +270,9 @@ def read_results_json_repo(self) -> pd.DataFrame:
data.append(pd.read_json(f, typ="series"))
if not data:
try:
self.read_results_json_repo_single_file()
return self.read_results_json_repo_single_file()
except FileNotFoundError:
data = []
raise FileNotFoundError("No JSON data files found in repository and no results.json fallback available")

return pd.DataFrame(data)
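
The control flow this hunk moves to (use per-submission files if any exist, otherwise return the single-file fallback, otherwise raise) can be sketched with the standard library alone. The directory layout and the `load_results` name are assumptions for illustration. The real method returns a pandas DataFrame, and only the `results.json` file name is taken from the error message in the diff.

```python
import json
from pathlib import Path


def load_results(repo_dir: Path) -> list:
    """Load per-submission JSON files, else fall back to a single results.json."""
    # Assumed layout: one JSON file per submission, next to an optional results.json.
    paths = [p for p in sorted(repo_dir.glob("*.json")) if p.name != "results.json"]
    records = [json.loads(p.read_text()) for p in paths]
    if records:
        return records
    fallback = repo_dir / "results.json"
    if not fallback.is_file():
        raise FileNotFoundError(f"No JSON data files and no results.json in {repo_dir}")
    # Returning the fallback's content matters here: the old code called the
    # single-file reader but discarded its result.
    return json.loads(fallback.read_text())
```

Raising instead of silently returning an empty result makes a missing data directory visible to the caller at the point of failure.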

@@ -371,7 +396,24 @@ def create_pull_request(self, commit_name: str, commit_message: str, submission_
-------
int
The pull request number assigned by GitHub.

Raises
------
ImportError
If PyGithub is not installed.
ValueError
If no GitHub token is provided.
"""
if not GITHUB_AVAILABLE:
raise ImportError(
"PyGithub is not installed. Please install it with: pip install PyGithub or conda install pygithub"
)

if not self.token:
raise ValueError(
"GitHub token is required for creating pull requests. Please provide a valid GitHub token."
)

g = Github(self.token)
repo = g.get_repo(self.proteobot_repo_name)
base = repo.get_branch("master")
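
The two guards added to `create_pull_request` fail fast, each with its own exception type, before any network call is made. A minimal sketch of that ordering follows, with the checks pulled into a hypothetical standalone function (`check_pull_request_preconditions` is not part of the PR) so it can run without PyGithub.

```python
from typing import Optional


def check_pull_request_preconditions(github_available: bool, token: Optional[str]) -> None:
    """Raise early, with a specific error, if a pull request cannot be created."""
    # A missing library is reported first: without it a token is of no use.
    if not github_available:
        raise ImportError("PyGithub is not installed.")
    # An empty string counts as a missing token, as in the diff's `not self.token`.
    if not token:
        raise ValueError("A GitHub token is required to create pull requests.")


try:
    check_pull_request_preconditions(github_available=True, token=None)
except ValueError as err:
    print(f"Cannot create pull request: {err}")
```

Callers that only clone and read the repository never reach these checks, which matches the note added to the class docstring.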
142 changes: 142 additions & 0 deletions proteobench/io/params/json/Quant/quant_lfq_DIA_proteingroup.json
@@ -0,0 +1,142 @@
{
"software_name": {
"type": "text_input",
"label": "Software name",
"placeholder": "None"
},
"software_version": {
"type": "text_input",
"label": "Software tool version",
"placeholder": "1.0"
},
"search_engine": {
"type": "text_input",
"label": "Search engine name",
"placeholder": "None"
},
"search_engine_version": {
"type": "text_input",
"label": "Search engine version",
"placeholder": "None"
},
"ident_fdr_psm": {
"type": "text_input",
"label": "FDR PSM",
"placeholder": "None"
},
"ident_fdr_peptide": {
"type": "text_input",
"label": "FDR peptide",
"placeholder": "None"
},
"ident_fdr_protein": {
"type": "text_input",
"label": "FDR protein",
"placeholder": "None"
},
"enable_match_between_runs": {
"type": "checkbox",
"label": "Quantified with MBR",
"value": false
},
"precursor_mass_tolerance": {
"type": "text_input",
"label": "Precursor mass tolerance (including unit ppm, PPM or Da)",
"placeholder": "None"
},
"fragment_mass_tolerance": {
"type": "text_input",
"label": "Fragment mass tolerance (including unit ppm, PPM or Da)",
"placeholder": "None"
},
"enzyme": {
"type": "text_input",
"label": "Proteolytic Enzyme",
"placeholder": "None"
},
"allowed_miscleavages": {
"type": "text_input",
"label": "Maximum allowed number of missed cleavages",
"placeholder": "None"
},
"min_peptide_length": {
"type": "text_input",
"label": "Minimum peptide length",
"placeholder": "None"
},
"max_peptide_length": {
"type": "text_input",
"label": "Maximum peptide length",
"placeholder": "None"
},
"fixed_mods": {
"type": "text_input",
"label": "Specify the fixed mods that were set",
"placeholder": "None"
},
"variable_mods": {
"type": "text_input",
"label": "Specify the variable mods that were set (separated by a comma)",
"placeholder": "None"
},
"max_mods": {
"type": "text_input",
"label": "Maximum number of modifications",
"placeholder": "None"
},
"min_precursor_charge": {
"type": "text_input",
"label": "Minimum precursor charge allowed",
"placeholder": "None"
},
"max_precursor_charge": {
"type": "text_input",
"label": "Maximum precursor charge allowed",
"placeholder": "None"
},
"min_precursor_mz": {
"type": "text_input",
"label": "Minimum precursor m/z",
"placeholder": "None"
},
"max_precursor_mz": {
"type": "text_input",
"label": "Maximum precursor m/z",
"placeholder": "None"
},
"min_fragment_mz": {
"type": "text_input",
"label": "Minimum fragment m/z",
"placeholder": "None"
},
"max_fragment_mz": {
"type": "text_input",
"label": "Maximum fragment m/z",
"placeholder": "None"
},
"quantification_method": {
"type": "text_input",
"label": "Quantification method",
"placeholder": "None"
},
"protein_inference": {
"type": "text_input",
"label": "Protein inference method",
"placeholder": "None"
},
"abundance_normalization_ions": {
"type": "text_input",
"label": "Abundance normalization method",
"placeholder": "None"
},
"predictors_library": {
"type": "text_input",
"label": "Utilized spectral library",
"placeholder": "None"
},
"scan_window": {
"type": "text_input",
"label": "Window scanning size",
"placeholder": "None"
}
}
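
Every entry in this new parameter file follows one of two shapes: a `text_input` with a `placeholder`, or a `checkbox` with a `value`. A small consistency check along those lines might look like the sketch below. `validate_form_spec` and `REQUIRED_BY_TYPE` are hypothetical names, and the rule set is inferred from this file alone rather than from the web interface code.

```python
import json

# Key each widget type is expected to carry, inferred from the file above.
REQUIRED_BY_TYPE = {"text_input": "placeholder", "checkbox": "value"}


def validate_form_spec(spec: dict) -> list:
    """Return a list of human-readable problems found in a parameter-form spec."""
    problems = []
    for name, field in spec.items():
        widget = field.get("type")
        if "label" not in field:
            problems.append(f"{name}: missing 'label'")
        if widget not in REQUIRED_BY_TYPE:
            problems.append(f"{name}: unknown type {widget!r}")
        elif REQUIRED_BY_TYPE[widget] not in field:
            problems.append(f"{name}: missing {REQUIRED_BY_TYPE[widget]!r}")
    return problems


spec = json.loads(
    """
{
  "software_name": {"type": "text_input", "label": "Software name", "placeholder": "None"},
  "enable_match_between_runs": {"type": "checkbox", "label": "Quantified with MBR", "value": false}
}
"""
)
print(validate_form_spec(spec))
```

A check of this kind could run in the test suite so that a new module's parameter file is validated before the web form tries to render it.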
@@ -13,4 +13,5 @@

[general]
"min_count_multispec" = 1
"level" = "ion"
"level" = "precursor ion"
"y_axis_title" = "Total number of precursor ions quantified in the selected number of raw files"
@@ -13,4 +13,5 @@

[general]
"min_count_multispec" = 1
"level" = "ion"
"level" = "precursor ion"
"y_axis_title" = "Total number of precursor ions quantified in the selected number of raw files"
@@ -14,3 +14,4 @@
[general]
"min_count_multispec" = 1
"level" = "peptidoform"
"y_axis_title" = "Total number of peptidoforms quantified in the selected number of raw files"
@@ -13,4 +13,5 @@

[general]
"min_count_multispec" = 1
"level" = "ion"
"level" = "precursor ion"
"y_axis_title" = "Total number of precursor ions quantified in the selected number of raw files"
@@ -13,4 +13,5 @@

[general]
"min_count_multispec" = 1
"level" = "ion"
"level" = "precursor ion"
"y_axis_title" = "Total number of precursor ions quantified in the selected number of raw files"
@@ -13,4 +13,5 @@

[general]
"min_count_multispec" = 1
"level" = "ion"
"level" = "precursor ion"
"y_axis_title" = "Total number of precursor ions quantified in the selected number of raw files"
@@ -13,4 +13,5 @@

[general]
"min_count_multispec" = 1
"level" = "ion"
"level" = "precursor ion"
"y_axis_title" = "Total number of precursor ions quantified in the selected number of raw files"
@@ -9,4 +9,5 @@

[general]
"min_count_multispec" = 1
"level" = "ion"
"level" = "precursor ion"
"y_axis_title" = "Total number of precursor ions quantified in the selected number of raw files"