-
Notifications
You must be signed in to change notification settings - Fork 70
Rework python docs #454
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Rework python docs #454
Conversation
…roiding and param_handling
very nice! some issues with notebook builds though |
WalkthroughThis pull request updates various documentation files to enhance clarity and consistency. It adds new Sphinx configuration options that enable strict reference checking and allow specific warnings to be ignored. Multiple user guide documents have been revised with improved phrasing, refined code examples, updated method signatures, and minor grammatical corrections, while new content—including glossary entries and normalization process instructions—has been introduced. Deprecated references have also been removed where applicable. Changes
Sequence Diagram(s)sequenceDiagram
participant U as User
participant C as conf.py
participant S as Sphinx Build
U->>C: Initiate Sphinx build
C-->>S: Load config (nitpicky=True, nitpick_ignore=[])
S->>S: Validate document references
alt Invalid Reference Found
S->>U: Emit warning
else
S->>U: Build documentation smoothly
end
sequenceDiagram
participant U as User
participant F as FeatureFinderAlgorithmPicked
participant V as TOPPView
U->>F: Instantiate FeatureFinder
F->>U: Process data and return out_features
U->>V: Load detected_features.featureXML
V->>U: Display feature details (m/z, RT, etc.)
Poem
🪧 TipsChatThere are 3 ways to chat with CodeRabbit:
Note: Be mindful of the bot's finite context window. It's strongly recommended to break down tasks such as reading entire modules into smaller chunks. For a focused discussion, use review comments to chat about specific files and their changes, instead of using the PR comments. CodeRabbit Commands (Invoked using PR comments)
Other keywords and placeholders
CodeRabbit Configuration File (
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Actionable comments posted: 0
🧹 Nitpick comments (5)
docs/source/user_guide/ms_data.rst (1)
657-671
: Alternative Filtering Using PeakFileOptions
This new code block illustrates an alternative, more efficient method by using thePeakFileOptions
class to load only MS level 2 spectra. The example is clear and well annotated.
Suggestion: Change “chose to load” to “choose to load” for grammatical correctness.- # Alternatively, we can chose to load only spectra ... + # Alternatively, we can choose to load only spectra ...docs/source/user_guide/spectrum_merging.rst (1)
160-161
: Comparison of Merged Spectra with Different Scan Counts
The commentary (line 160) comparing the results of merging 10 scans versus 5 scans is very informative. It clearly notes the improved signal quality with 10 scans while cautioning against over-merging. A minor language tweak (e.g. “used previously” instead of “used before”) might further polish the text.docs/source/user_guide/feature_detection.rst (1)
250-264
: Enhanced 3D Plotting Function for Detected Features
The newly introduced functionplotDetectedFeatures3D
is a valuable enhancement. Usingitertools.cycle
to manage colors improves visual clarity in plots. Consider adding inline comments within the loop (e.g., explain the extraction ofretention_times
andintensities
) to further assist novice users.docs/source/user_guide/charge_isotope_deconvolution.rst (2)
93-128
: Demonstration of Updated Deisotoper Method Signature
The heavy peptide example effectively shows the new extended call tooms.Deisotoper.deisotopeAndSingleCharge(...)
with all its parameters. This example is comprehensive; however, adding brief inline comments next to key parameters (explaining their purpose) would enhance clarity for users who are less familiar with the API.
156-208
: Comprehensive Full Spectral De-Isotoping Example
This block thoroughly demonstrates the full spectral de-isotoping procedure on a BSA mass spectrum. The step-by-step process—including data loading, applying the deisotoping function, and plotting the results—is well structured. Due to the many parameters passed intodeisotopeAndSingleCharge
, consider adding concise inline comments to clarify the role of each boolean flag and numerical argument. This would be particularly helpful for users trying to adapt the example to their data.
📜 Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (15)
docs/source/conf.py
(1 hunks)docs/source/user_guide/adduct_detection.rst
(1 hunks)docs/source/user_guide/algorithms.rst
(2 hunks)docs/source/user_guide/centroiding.rst
(2 hunks)docs/source/user_guide/charge_isotope_deconvolution.rst
(6 hunks)docs/source/user_guide/chemistry.rst
(1 hunks)docs/source/user_guide/feature_detection.rst
(7 hunks)docs/source/user_guide/glossary.rst
(3 hunks)docs/source/user_guide/map_alignment.rst
(1 hunks)docs/source/user_guide/ms_data.rst
(2 hunks)docs/source/user_guide/parameter_handling.rst
(7 hunks)docs/source/user_guide/quantitative_data.rst
(2 hunks)docs/source/user_guide/smoothing.rst
(1 hunks)docs/source/user_guide/spectrum_merging.rst
(5 hunks)docs/source/user_guide/spectrum_normalization.rst
(2 hunks)
✅ Files skipped from review due to trivial changes (1)
- docs/source/user_guide/smoothing.rst
⏰ Context from checks skipped due to timeout of 90000ms (2)
- GitHub Check: build-test
- GitHub Check: build-test
🔇 Additional comments (39)
docs/source/user_guide/chemistry.rst (1)
8-8
: Good addition of cross-reference to related content.This cross-reference provides helpful navigation for users looking to learn more about isotope patterns, which is a relevant topic given the discussion of chemical concepts in this section.
docs/source/user_guide/adduct_detection.rst (2)
6-6
: Better scientific accuracy in describing charge states.The modification from "much higher" to "potentially higher" provides a more nuanced and accurate description of charge states in proteomics compared to metabolomics, avoiding overgeneralization.
9-9
: Improved precision in retention time description.Changing from "similar" to "almost identical" retention times more accurately represents the behavior of different adducts in liquid chromatography, providing better technical precision.
docs/source/user_guide/map_alignment.rst (2)
4-4
: Improved formatting consistency with terminology.The reformatting of the parentheses and consistent use of
:term:
directive for both "peak maps" and "feature maps" improves readability and ensures proper cross-referencing to the glossary.
11-17
:❓ Verification inconclusive
Verify removal of MapAlignmentAlgorithmSpectrumAlignment is intentional.
The removal of
:py:class:~.MapAlignmentAlgorithmSpectrumAlignment
from the list of available algorithms should be verified to ensure it was intentionally removed because it's deprecated or no longer available.
🏁 Script executed:
#!/bin/bash # Check if the MapAlignmentAlgorithmSpectrumAlignment class still exists in the codebase # or if it has been removed/renamed # Search for the class in Python files echo "Searching for MapAlignmentAlgorithmSpectrumAlignment in Python files:" rg -l "MapAlignmentAlgorithmSpectrumAlignment" --glob "*.py" # Also check for any mentions in documentation files echo "Searching for MapAlignmentAlgorithmSpectrumAlignment in documentation files:" rg -l "MapAlignmentAlgorithmSpectrumAlignment" --glob "*.rst" # Check for any deprecation warnings or comments related to this class echo "Checking for deprecation comments:" rg "MapAlignmentAlgorithmSpectrumAlignment.*deprecat" -iLength of output: 572
Action Required: Verify Intentional Removal of MapAlignmentAlgorithmSpectrumAlignment
The automated searches did not return any references or deprecation warnings for
MapAlignmentAlgorithmSpectrumAlignment
in Python or documentation files. However, the absence of output could be due to low signal quality from the search. Please manually verify—using version control history or additional checks—that its removal from the list of available map alignment algorithms was intentional (e.g., because it is deprecated or replaced).docs/source/user_guide/spectrum_normalization.rst (2)
34-35
: Good clarification on normalization process.Adding an explicit mention of the Normalizer class provides better context for the following code example and makes the documentation more comprehensive.
53-62
: Excellent addition of practical method discovery.This new section demonstrates how to programmatically discover available normalization methods, which is a valuable usability enhancement for developers who might want to explore different normalization options.
docs/source/conf.py (1)
49-52
:✅ Verification successful
Good addition of stricter reference checking.
Enabling the
nitpicky
option is a positive step for improving documentation quality by catching invalid references. This will help ensure that class names, cross-references, and other references in the documentation are valid and up-to-date.I'd recommend considering potential references to ignore in the
nitpick_ignore
list if there are any intentional references that would trigger warnings but should be kept.
🌐 Web query:
What types of references should be added to nitpick_ignore in Sphinx documentation?
💡 Result:
To effectively suppress non-critical warnings in Sphinx's nitpicky mode, these reference types should be added to
nitpick_ignore
inconf.py
:
Common Candidates for Ignoring
Built-in language types
- Python primitives:
('py:func', 'int')
,('py:class', 'str')
- JavaScript types:
('js:func', 'string')
,('js:func', 'boolean')
- Environment variables:
('envvar', 'LD_LIBRARY_PATH')
[1][4]Third-party library references
Example:('py:class', 'thirdparty_lib.UnusedClass')
Use when documenting code relying on external libraries not covered in your docs [9].Deprecated/legacy features
Example:('c:func', 'ioctl')
for outdated C API functions [5].MyST-specific warnings
Ignore cross-reference issues in Markdown-based docs:nitpick_ignore = [('myst', 'reference')]This suppresses
myst.xref_missing
warnings [3].Common unresolvable terms
Example:('std:term', 'interpreter')
if glossary entries are missing [7].
Advanced Configuration
For broader patterns, use
nitpick_ignore_regex
with regular expressions:nitpick_ignore_regex = [ (r'py:.*', r'foo.*bar\.B.*'), # Ignore Python entities matching regex (r'js:.*', r'^ArrayBuffer$') # Ignore specific JS types ]This suppresses warnings for references like
('py:const', 'foo_package.bar.BAZ_VALUE')
[1][2].
Best Practices
- Enable
nitpicky = True
during development to catch broken links [1][4].- Audit warnings periodically – fix resolvable issues rather than ignoring them [7].
- Combine with
sphinx.ext.intersphinx
to validate cross-project references [6].Example minimal configuration:
nitpicky = True nitpick_ignore = [ ('py:func', 'open'), # Built-in function ('py:class', 'pathlib.Path'), # Assumed known type ('envvar', 'DISPLAY') # Common Unix env variable ]Citations:
- 1: https://www.sphinx-doc.org/en/master/usage/configuration.html
- 2: https://github.com/sphinx-doc/sphinx/blob/master/doc/usage/configuration.rst
- 3: https://myst-parser.readthedocs.io/en/latest/syntax/cross-referencing.html
- 4: https://documatt.com/blog/21/sphinx-conf-py-tips/
- 5: https://discuss.python.org/t/confused-by-warnings-from-doc-library-fcntl-rst/46367
- 6: https://docs.readthedocs.com/platform/stable/guides/cross-referencing-with-sphinx.html
- 7: https://discuss.python.org/t/broken-references-in-sphinx-docs/19463
- 8: https://sphinx-rtd-trial.readthedocs.io/en/latest/config.html
- 9: sphinx-doc/sphinx#6895
- 10: https://www.sphinx-doc.org/en/master/usage/referencing.html
Stricter Reference Checking Enabled – Optional Enhancements for nitpick_ignore
The configuration is a strong step toward catching invalid references. As a further enhancement, consider pre-populating the
nitpick_ignore
list with common, non-critical references to suppress intentional warnings. For example:
Built-in language types:
e.g.,('py:func', 'open')
,('py:class', 'str')
Third-party library references:
e.g.,('py:class', 'thirdparty_lib.UnusedClass')
Deprecated or legacy features:
e.g.,('c:func', 'ioctl')
Additionally, if you encounter multiple similar warnings, you might explore using
nitpick_ignore_regex
for broader pattern matching.The affected code remains as follows:
## warn about invalid references (e.g. invalid class names) nitpicky = True nitpick_ignore = []Overall, your changes enhance documentation quality. Consider these optional adjustments if you need to suppress warnings for intentional references.
docs/source/user_guide/algorithms.rst (3)
10-15
: Improved Experiment Population & Algorithm Execution Example
The added blank line and comments (lines 10–15) clarify the steps for populating the experiment and running the algorithm. This extra spacing and explicit instruction enhances readability.
18-21
: Enhanced Parameter Handling Explanation
The revised wording now clearly explains that the algorithm’s parameters can be retrieved, manipulated, and then re‐applied usinggetParameters()
andsetParameters()
. The inline code formatting improves the clarity.
31-35
: Reiterated Experiment Population for Consistency
Repeating the “populate experiment” step (lines 31–35) reinforces the procedure, making the example self-contained. This change is clear and helps maintain consistency throughout the documentation.docs/source/user_guide/ms_data.rst (2)
654-655
: Clarifying Output of MS Level Filtering
The added comment (# 'filtered' now only contains spectra with MS level >= 2) immediately after the loop adds useful context for the reader about the filtering result.
716-733
: Filtering by m/z Range with PeakFileOptions
The added comment at the end of the manual filtering example (line 716) and the subsequent alternative example (lines 717–733) clearly explain how to restrict peaks based on m/z values while simultaneously filtering MS level. This addition improves the tutorial’s comprehensiveness.docs/source/user_guide/centroiding.rst (3)
36-37
: Ensuring Plot Display with plt.show()
The insertion ofplt.show()
(lines 36–37) ensures that the profile data plot is rendered. This change is essential for interactive sessions and improves the user experience.
40-47
: Expanded Explanation of m/z Measurement Variability
The new descriptive text (lines 40–47) offers a more detailed explanation of the spread in m/z measurements and the origin of isotopic patterns. This improved narrative helps users understand the limitations and behavior of mass spectrometry data.
65-66
: Display of Centroided Data via plt.show()
Addingplt.show()
after the stem plot (lines 65–66) guarantees that the centroided spectrum is visualized. This small yet vital change improves the code’s runnability.docs/source/user_guide/glossary.rst (5)
38-39
: Enhanced Electrospray Ionization (ESI) Entry
The additions clarify that ESI typically produces peptides with a charge of 2 or higher and that it is commonly coupled with Orbitrap and FTICR instruments. This extra detail provides valuable context for users.
86-89
: New MALDI Entry with Clear Instrument Context
The new MALDI entry now explains that MALDI uses a laser energy-absorbing matrix, contrasts it with ESI, and notes the typical charge state and instrument association (TOF). This comprehensive description is useful for new users.
96-100
: Definition of Mass Spectra
Introducing the term “mass spectra” with a clear definition—as either a visual or numerical representation including m/z and intensity pairs—makes the glossary more complete and informative.
145-150
: Clear Introduction of Orbitrap
The new Orbitrap glossary entry provides a concise technical description and highlights its ultra-high resolution capability. This definition aligns well with contemporary MS literature.
151-154
: Addition of Peak Maps Definition
The glossary now includes “peak maps” (and its singular form “peak map”), defining them as collections of mass spectra and/or chromatograms sorted by retention time. This addition enhances the glossary’s scope and is consistent with its overall style.docs/source/user_guide/spectrum_merging.rst (4)
5-6
: Clarification of Merging vs. Averaging Concepts
The revised sentence added at line 5 explains that spectra merging (for instance, merging MS1 spectra within a retention time window) differs from spectra averaging (which preserves the number of spectra). This clear distinction will help users choose the appropriate method for their data.
100-101
: Detail on Adjustable Block Size Parameter
The insertion at line 100 highlighting that the block size can be adjusted via theblock_method:rt_block_size
parameter is a valuable detail. It encourages users to experiment with different block sizes and consult further documentation if needed.
198-199
: Explanation of Precursor Tolerance Adjustments for MS2 Merging
The added lines (198–199) provide a detailed rationale for adjusting the retention time and m/z tolerances during precursor-based merging of MS2 spectra. This explanation helps users understand why certain spectra may not merge by default and how to adjust parameters accordingly.
296-299
: Improved Description of Averaging Methods
The updated explanation in the averaging section now clearly distinguishes between the “gaussian” and “tophat” methods. It notes that the Gaussian method applies weighted averaging (with weights that decrease sharply from the center), while the Tophat method uses equal weights. This refinement enhances user understanding of the trade-offs involved.docs/source/user_guide/quantitative_data.rst (3)
4-4
: Section Header Update for “Features”
The heading has been updated to “Features” (capitalized), which matches stylistic guidelines and improves clarity.
45-47
: Improved Description of Visualization in TOPPView
The rephrasing now explicitly explains that opening the feature map in TOPPView displays two features (with their positions and hover details). This added clarity makes the expected behavior clearer to the user.
51-53
: Clear and Concise Note on Feature Count
Rewording the note to “In the above example, only two features are present. In a typical :term:LC-MS/MS
experiment, you can expect thousands of features.” improves readability and sets accurate expectations.docs/source/user_guide/parameter_handling.rst (5)
5-10
: Enhanced Explanation of Parameter Handling
The introduction now emphasizes that parameter handling is achieved via inheritance fromDefaultParamHandler
, and it clearly explains access to defaults, current parameters, and setting new ones. The description using the :py:class:~.Param
object is both informative and accurate.
11-15
: Detailed Method Descriptions for Parameter Access
The explanation regarding the usage ofgetDefaults
,getParameters
, and especially the emphasis on using :py:meth:~.Param.exists
to avoid runtime errors has been well elaborated.
38-40
: Alternate Access Methods Illustrated Clearly
The new text “The parameters can also be accessed as” followed by a concise pycon example helps users understand multiple access patterns. This section is clear and useful for interactive sessions.
54-64
: Comprehensive Example for Copying and Merging Param Objects
The code snippet demonstrating how to copy and merge :py:class:~.Param
objects (including a deep copy pattern and subsequent merging) is clear and educational. Consider adding a brief inline comment next tonew_p = p # new deep copy of p
if further clarification is desired, but overall the example meets its objective.
105-106
: Access and Removal Operations are Well Documented
The sections on listing parameters with descriptions and using methods like.remove()
and.removeAll()
are detailed and cover common usage scenarios. The examples align well with the user guide’s goals.docs/source/user_guide/feature_detection.rst (4)
8-9
: Consistent Cross-Referencing
Updating the reference from the previous section to<charge_isotope_deconvolution.html>
is consistent with related documentation. This helps maintain a coherent narrative across the guides.
11-18
: Updated FeatureFinder Descriptions
The updated list of available feature finders (with names such as :py:class:~.FeatureFinderAlgorithmPicked
and :py:class:~.FeatureFinderMultiplexAlgorithm
) is now more accurate and informative. The enumeration clearly distinguishes proteomics from metabolomics tools.
28-37
: Practical Code Example for Proteomics Feature Detection
The sample code that downloads input data, loads an MS experiment, and runs the feature finder (using :py:class:~.FeatureFinderAlgorithmPicked
) is practical and detailed. Storing the resulting feature map and updating unique IDs makes the example realistic and useful for users.
241-243
: Clarifying Output File Note
Updating the note to reference the output file asdetected_features.featureXML
is clear and consistent with the revised workflow. The reference to the section on MS data formats further aids understanding.docs/source/user_guide/charge_isotope_deconvolution.rst (2)
38-44
: New Isotope Distribution Example Using AASequence and EmpiricalFormula
The new code example that computes the isotope distribution using :py:class:~.AASequence
and :py:class:~.EmpiricalFormula
is clear and educational. The use ofoms.Deisotoper.deisotopeAndSingleChargeDefault(s, 10, True)
correctly reflects the updated API. Please verify that this function signature (especially the additional Boolean parameter) remains in sync with the current API documentation.
202-207
: Effective Visualization of Deisotoped Peaks
The plotting sections comparing the unpicked and deisotoped data are clear and enhance understanding of the processing outcome. The use ofplt.bar
for visual representation is appropriate.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Actionable comments posted: 0
🧹 Nitpick comments (4)
docs/source/user_guide/charge_isotope_deconvolution.rst (4)
6-10
: Enhanced Description of Heavy Isotopes
The additional text clarifying the natural abundance of heavy isotopes (e.g. carbon‑13 and hydrogen‑2) improves the narrative. However, there are minor phrasing issues – for example, consider changing "lesser extend" to "lesser extent" and "very low abundant" to "very low in abundance" for better readability.
12-17
: Clarification on Isotopic Pattern
The expanded explanation of the isotopic pattern, detailing the monoisotopic peak and successive isotopic peaks, is helpful. Please double-check the spelling in line 15 where "dimishingly" should be "diminishingly."
110-127
: Detailed Deisotoping Function Call
The expanded call tooms.Deisotoper.deisotopeAndSingleCharge
now exposes the full set of parameters. Given the long parameter list, consider adding inline comments or refactoring (e.g., using keyword arguments) to improve readability and maintainability.
202-204
: Visualization of Unpicked Peak Data
This snippet that creates a bar plot for the unpicked peak data is a valuable addition. To enhance clarity for end users, consider labeling the axes or adding a brief description of what the data represents.
📜 Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (1)
docs/source/user_guide/charge_isotope_deconvolution.rst
(7 hunks)
⏰ Context from checks skipped due to timeout of 90000ms (2)
- GitHub Check: build-test
- GitHub Check: build-test
🔇 Additional comments (15)
docs/source/user_guide/charge_isotope_deconvolution.rst (15)
18-23
: Definition of the Monoisotopic Peak
This section provides a clear and concise definition of the monoisotopic peak and its relevance in peptides and proteins. A slight rewording for improved flow might be beneficial, but the explanation is overall effective.
26-26
: Enhanced Explanation of Charge States and Adducts
The modifications in these lines enrich the discussion on charge states and adduct formation, covering both singly and doubly charged analytes as well as the impact of incomplete tryptic digestion. It might be beneficial to briefly mention how these changes affect subsequent data analysis.Also applies to: 28-28, 31-33
38-40
: Introduction to Isotope Distribution Code Example
The introductory text for the upcoming code example is precise and sets clear expectations on demonstrating the usage of theAASequence
,EmpiricalFormula
, andDeisotoper
classes. Ensure that these classes are documented elsewhere for full clarity.
47-47
: Output Statement for Monoisotopic Weight
The print statement in line 47 effectively displays the calculated monoisotopic weight and serves well in illustrating the code example.
49-49
: Inline Comment on Isotopic Distribution
The inline comment clarifying the adjustment for two additional hydrogens is clear and informative.
56-57
: Adjusting m/z Values in the Loop
The loop that divides the m/z by the charge to adjust the values is correctly implemented. It would be prudent to verify its behavior across Python versions (with respect to integer vs. float division), although for documentation purposes this is acceptable.
65-65
: Printing Monoisotopic Peaks
The loop that prints the mono peaks is straightforward and effectively serves its illustrative purpose.
73-80
: Expected Output for the Single Peak Example
The output code block here successfully demonstrates the expected results of the computation, providing a tangible reference for users.
85-87
: Heuristic Parameter Explanation
The description regarding the heuristic parameters (use_decreasing_model
andstart_intensity_check
) adds valuable context and clarity to the algorithm’s behavior.
94-96
: Heavy Peptide Example Update
The updated heavy peptide example, including the adjusted print statement and the use ofcharge = 4
, is clear. This change effectively demonstrates handling larger peptides.
130-130
: Printing Mono Peaks in Heavy Peptide Example
The print statement in line 130 continues to clearly illustrate the outcome of the heavy peptide deconvolution example.
133-144
: Expected Output for the Heavy Peptide Example
The output block presents expected results efficiently, aiding users in verifying the correctness of the algorithm.
160-160
: Visualization Import for BSA Example
The addition of theimport matplotlib.pyplot as plt
statement enhances the documentation by introducing visualization aspects.
184-184
: Updated Boolean Parameter in Deisotoping Call (BSA Example)
The inclusion of the additionalTrue
parameter at the end of the function call reflects the updated method signature. Please verify that this parameter consistently controls the intended behavior as described in the documentation.
206-208
: Visualization of Picked Peak Data
The example for visualizing the picked peaks is clear and effective, rounding out the visualization examples nicely.
Summary by CodeRabbit