ENH: add TPR position and velocity read support #4873

Open
wants to merge 12 commits into
base: develop

tylerjereddy
Member

@tylerjereddy tylerjereddy commented Dec 29, 2024

Fixes gh-464.

  • Add support for reading positions and velocities from GMX .tpr files over the full range of tpx versions supported by regular .tpr topology parsing (58 through 137). We've had an issue open to add this capability for almost a decade now.

Notes for reviewers:

  1. Suggestions for removing the use of `global` to track TPR file precision are appreciated.
  2. Performance/duplication considerations: as Richard noted in the cognate issue, given the way our topology and coordinate handling code is organized, it may make sense to keep the separation here, but there should at least be an opportunity to reduce code duplication (if not to completely avoid re-reading the topology to seek to the coordinate positions). That said, our .tpr reading performance is atrocious anyway, and my attempts to fix it in WIP, ENH: faster TPR topology building for large systems #4098 have stalled for more than a year, so I suggest we defer performance (and probably even duplication) considerations until after the capability and tests are cemented in.
  3. Please help me identify TPR testing scenarios that are missing, and ideally suggest the TPR file(s) I should use to test them.
  4. I'll probably deal with CHANGELOG and docs changes when it looks like we're close to ready to merge, but I still appreciate suggestions on, e.g., locations where doc changes will be needed.

PR Checklist

  • Tests?
  • Docs?
  • CHANGELOG updated?
  • Issue raised/referenced?

Developers certificate of origin


📚 Documentation preview 📚: https://mdanalysis--4873.org.readthedocs.build/en/4873/

@pep8speaks

pep8speaks commented Dec 29, 2024

Hello @tylerjereddy! Thanks for updating this PR. We checked the lines you've touched for PEP 8 issues, and found:

Line 59:80: E501 line too long (89 > 79 characters)
Line 61:80: E501 line too long (93 > 79 characters)
Line 75:80: E501 line too long (81 > 79 characters)

Line 15:80: E501 line too long (119 > 79 characters)
Line 16:23: E261 at least two spaces before inline comment
Line 22:5: E124 closing bracket does not match visual indentation
Line 24:23: E261 at least two spaces before inline comment
Line 30:5: E124 closing bracket does not match visual indentation
Line 32:16: E261 at least two spaces before inline comment
Line 33:19: E241 multiple spaces after ','
Line 33:33: E241 multiple spaces after ','
Line 34:20: E241 multiple spaces after ','
Line 34:34: E241 multiple spaces after ','
Line 38:5: E124 closing bracket does not match visual indentation
Line 40:21: E261 at least two spaces before inline comment
Line 41:19: E241 multiple spaces after ','
Line 41:33: E241 multiple spaces after ','
Line 42:19: E241 multiple spaces after ','
Line 42:33: E241 multiple spaces after ','
Line 44:20: E241 multiple spaces after ','
Line 46:5: E124 closing bracket does not match visual indentation
Line 47:14: E261 at least two spaces before inline comment
Line 48:19: E241 multiple spaces after ','
Line 48:33: E241 multiple spaces after ','
Line 49:20: E241 multiple spaces after ','
Line 49:34: E241 multiple spaces after ','
Line 53:5: E124 closing bracket does not match visual indentation
Line 54:14: E261 at least two spaces before inline comment
Line 55:19: E241 multiple spaces after ','
Line 55:33: E241 multiple spaces after ','
Line 56:20: E241 multiple spaces after ','
Line 56:34: E241 multiple spaces after ','
Line 60:5: E124 closing bracket does not match visual indentation

Comment last updated at 2024-12-31 17:39:01 UTC

@@ -0,0 +1,73 @@
# -*- Mode: python; tab-width: 4; indent-tabs-mode:nil; coding:utf-8 -*-
# vim: tabstop=4 expandtab shiftwidth=4 softtabstop=4
Member Author

header needs expanding to full version

im_excl_grp_size = data.unpack_int()
ndo_int(data, im_excl_grp_size)
# TODO: why is this needed?
data.unpack_int()
Member Author

The above can probably be cleaned up a bit; it likely has some unused variables, etc.

It was produced by carefully printf-ing the GMX source, as you might expect. Expanding to support other tpx versions may not be too bad, although this was fairly time-consuming to draft.

# api/legacy/include/gromacs/topology/topology_enums.h
# worst case scenario we hard code it based on
# tpx/GMX version?
SimulationAtomGroupType_size = 10
Member Author

On latest GMX main branch, this is the data structure in question:

enum class SimulationAtomGroupType : int
{
    TemperatureCoupling,
    EnergyOutput,
    Acceleration,
    Freeze,
    User1,
    User2,
    MassCenterVelocityRemoval,
    CompressedPositionOutput,
    OrientationRestraintsFit,
    QuantumMechanics,
    Count
};
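
For the hard-coded `SimulationAtomGroupType_size = 10` above, one option might be to mirror the enum on the Python side so that the size is derived rather than typed in. This is only a minimal sketch: it assumes the member order shown above for the latest GMX main branch (older GROMACS versions may differ), and the Python class is hypothetical, not existing MDAnalysis code.

```python
from enum import IntEnum


class SimulationAtomGroupType(IntEnum):
    """Hypothetical Python mirror of the GROMACS enum shown above."""

    TemperatureCoupling = 0
    EnergyOutput = 1
    Acceleration = 2
    Freeze = 3
    User1 = 4
    User2 = 5
    MassCenterVelocityRemoval = 6
    CompressedPositionOutput = 7
    OrientationRestraintsFit = 8
    QuantumMechanics = 9


# "Count" in the C++ enum equals the number of real members, so derive it
# from the Python enum instead of hard-coding the value.
SimulationAtomGroupType_size = len(SimulationAtomGroupType)
```

If the enum changed between tpx versions, a hard-coded per-version value would still be needed, as the code comment above already anticipates.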

u = mda.Universe(tpr_file)
assert_allclose(u.atoms.positions[0, ...], exp_first_atom)
assert_allclose(u.atoms.positions[-1, ...], exp_last_atom)
assert_equal(u.atoms.positions.shape, exp_shape)
Member Author

I think we have specific conventions for coordinate reader testing, but for now this is where I've started. Should be easy to expand to include velocities as well.

Cases with only positions and no velocities (etc.) may also be sensible to add, on top of older tpx files...

codecov bot commented Dec 29, 2024

Codecov Report

Attention: Patch coverage is 81.72043% with 17 lines in your changes missing coverage. Please review.

Project coverage is 93.57%. Comparing base (af9848b) to head (a6ed0eb).

| Files with missing lines | Patch % | Lines |
|---|---|---|
| package/MDAnalysis/coordinates/TPR.py | 65.21% | 10 Missing and 6 partials ⚠️ |
| package/MDAnalysis/topology/tpr/utils.py | 97.82% | 0 Missing and 1 partial ⚠️ |
Additional details and impacted files
@@             Coverage Diff             @@
##           develop    #4873      +/-   ##
===========================================
- Coverage    93.62%   93.57%   -0.06%     
===========================================
  Files          177      178       +1     
  Lines        21995    22087      +92     
  Branches      3112     3141      +29     
===========================================
+ Hits         20593    20668      +75     
- Misses         947      957      +10     
- Partials       455      462       +7     

@RMeli
Member

RMeli commented Dec 29, 2024

@tylerjereddy since this PR is composed of two new files, and the modifications are all in one place, would you mind if I go ahead with #4849? I don't think there would be significant conflicts.

@tylerjereddy
Member Author

ok

@RMeli
Member

RMeli commented Dec 29, 2024

Thanks @tylerjereddy. It is now merged.

@tylerjereddy
Member Author

tylerjereddy commented Dec 30, 2024

I drafted in support/testing for handling velocities as well, plus support/testing for reading positions/velocities from one more tpx version (133). I'm trying to skip the CI when I push in for the next little bit (looks like Azure still ran though..).

Need to poke around the .tpr test files to see if there are some more interesting ones as well re: non-zero velocities, etc.

@tylerjereddy tylerjereddy changed the title WIP, ENH: add TPR position read support WIP, ENH: add TPR position and velocity read support Dec 31, 2024
@tylerjereddy
Member Author

tylerjereddy commented Dec 31, 2024

I was able to extend support and testing for .tpr position/velocity reads back as far as GMX 2023 (tpx version 129); GMX 2022rc1 (tpx 127) will fail testing because it needs more binary file striding version checks that I'll need to study in the C++ code (not too hard, but time consuming).

I also modernized an old TPR file we had with non-zero velocities (and a large number of atoms) for cobrotoxin via `gmx convert-tpr` and then added a test case for it. It passes for positions, but not yet for velocities, so it was a good intuition to check the non-zero velocity/larger system case. Since that file was converted to tpx 134, it will be a little harder to sort out the problem there, likely requiring a side-by-side printf-vs-print comparison of GMX vs. us on the binary striding.

* Add support for reading positions from
GMX `.tpr` files at `tpx` version `134`.
* Expand `TPRReader()` support to include velocity handling,
and add tests/functionality for an additional tpx version (`133`).

[ci skip] [skip azp]
* Add `.tpr` position/velocity reading support back
to GMX 2023 (tpx version 129), and associated testing.

* Add a `.tpr` position/velocity reading test
case that has non-zero velocities and many more
atoms. This test case currently passes for position
retrieval but not for velocity retrieval.

[ci skip] [skip azp] [azp skip] [skip ci]
* Fixed an issue where `TPRReader` used `ts._velocities`
instead of `ts.velocities` to assign the velocity array

* Needed to improve the precision of the expected velocity
values in `test_basic_read_tpr` after the above fix.

* `TPR_xvf_2024_4` was missing from `__all__`, causing
a test failure
@tylerjereddy
Member Author

> I also modernized an old TPR file we had with non-zero velocities (and a large number of atoms) for cobrotoxin via gmx convert-tpr and then added a test case for this

This test case for non-zero velocities should be passing now, as should other tests, apart from the linter, after some more bug fixes.

The outstanding items here would likely be:

  1. going back to support a wider range of tpx versions
  2. probably a few more tpr files with non-zero velocities across generations should be tested

* Add preliminary support for tpx version 137 (GROMACS 2025.0). This
required additional bit unpacking based on empirical observations
on the `TPR_NNPOT_2025_0` test file. A regression test for position
and velocity reads from that test file was added, with expected
values from `gmx dump -s ala_nnpot_gmx_2025_0.tpr`.

* Fix up formatting complaints from `black` that were causing CI
failures.
@tylerjereddy
Member Author

Updates from work today:

  • Added support for reading positions and velocities from tpx version 137 (GROMACS 2025.0), along with a regression test confirming position and velocity reads from a .tpr file from that version of GMX (expected values from gmx dump as usual)
  • finally fixed up the black linter complaints

* Add support and testing for `tpx` version 127 (GROMACS 2022),
including two regression tests for that `.tpr` format. Supporting
this file version required careful analysis of relative binary
file offsets.
@tylerjereddy
Member Author

Added support for tpx version 127 today (GROMACS 2022)--two regression tests with different numbers of atoms were used in this case because I had to do manual binary offset analysis vs. newer versions of the format to get the positions properly extracted.

Next up from our support table should probably be tpx 122 (GROMACS 2021), maybe tomorrow.
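
The manual binary offset analysis mentioned above can be sketched roughly as follows: pack a known coordinate triple (for example, the first atom as reported by `gmx dump`) and search the raw file bytes for it. This is an illustration of the general idea, not the code used in this PR. It assumes big-endian (XDR-style) reals, and `find_xyz_offsets` is a hypothetical helper. With a real file, the values printed by `gmx dump` are rounded, so an exact byte match may fail unless the full-precision values are known.

```python
import struct


def find_xyz_offsets(blob, xyz, double_precision=False):
    """Return every byte offset in blob where the packed xyz triple occurs."""
    # Assumption: reals are stored big-endian, hence the ">" prefix.
    fmt = ">3d" if double_precision else ">3f"
    needle = struct.pack(fmt, *xyz)
    offsets = []
    start = blob.find(needle)
    while start != -1:
        offsets.append(start)
        start = blob.find(needle, start + 1)
    return offsets


# Synthetic stand-in for a file: 12 bytes of "header", then two atoms.
header = b"\x00" * 12
coords = struct.pack(">6f", 1.5, 2.25, -0.5, 3.0, 4.0, 5.0)
blob = header + coords

print(find_xyz_offsets(blob, (1.5, 2.25, -0.5)))  # [12]
print(find_xyz_offsets(blob, (3.0, 4.0, 5.0)))    # [24]
```

Comparing such offsets between two files with different atom counts is one way to separate fixed-size header skips from size-dependent ones.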

* Add support for velocity and coordinate reading from `.tpr`
files that are in tpx generation `122` (GROMACS 2021). Includes
two regression tests.
@tylerjereddy
Member Author

For today, I added support for tpx version 122 (GROMACS 2021) + two regression tests. Luckily, this was easier than the last two versions I added, no byte skipping shims needed.

For tomorrow, I'll see if I can add support for tpx version 119 (GROMACS 2020).

* Add support for position and velocity data reading from `.tpr` files
that correspond to `tpx` version 119 (GROMACS `2020`). Two regression
tests for `.tpr` files at this version of GMX were added.
@tylerjereddy
Member Author

tylerjereddy commented Apr 22, 2025

Today, added support for tpx version 119 (GROMACS 2020), which required minor byte striding adjustments to pass the usual two regression tests.

For tomorrow, I'll aim to extend backward to tpx 116 (GROMACS 2019).

* Add support for reading positions and velocities from
`.tpr` files at `tpx` version 116 (GROMACS `2019`). Likewise
for `tpx` version 112 (GROMACS `2018`).

* Add regression tests for the above improvements.
@tylerjereddy
Member Author

Today, support was added for tpx 116 (GMX 2019) and tpx 112 (GMX 2018). The latter required some non-trivial byte-stride analysis on the binary files.

For tomorrow, I'll take aim at tpx 110 (GMX 2016) if I can.

* Add support for position and velocity reading from `.tpr`
files that correspond to `tpx` version 110 (GROMACS `2016`).
Likewise for `tpx` version 103 (GROMACS `5.1`).

* Add regression tests for the above improvements.
@tylerjereddy
Member Author

Today, added support for tpx 110 (GROMACS 2016) and tpx 103 (GROMACS 5.1).

I'll try to check tpx 100 (GMX 5.x before 5.1) tomorrow.

* Add position and velocity reading support for `.tpr`
files at `tpx` version 100 (for GROMACS `5.x` before `5.1`),
including regression tests.
@tylerjereddy
Member Author

Today, added support for tpx version 100 (GMX 5.x before 5.1). No byte striding shims were needed.

Next up is tpx version 83 (GMX 4.6.x)--I did a quick check and that will require more analysis/byte striding.

* Add support for reading positions and velocities from `.tpr`
files at `tpx` version 83 (GMX `4.6.0`, `4.6.1`). Likewise
for `tpx` versions 73 (GMX `4.5.x`) and 58 (GMX `4.0.x`).
Also added some shims for improved double precision support.

* Add regression tests for the above improvements.
@tylerjereddy tylerjereddy changed the title WIP, ENH: add TPR position and velocity read support ENH: add TPR position and velocity read support Apr 27, 2025
@@ -74,6 +74,8 @@
Impropers,
)

global_precision = 4
Member Author

suggestions for tracking TPR file precision in a way that doesn't use a global are appreciated
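
One possible direction, offered only as a sketch: let the object that walks the byte stream carry the precision as instance state, so each reader knows whether a "real" is 4 or 8 bytes without consulting a module-level variable. `RealReader` and its attributes are hypothetical names for illustration and are not the existing MDAnalysis unpacker API. Big-endian (XDR-style) encoding is assumed.

```python
import struct
from dataclasses import dataclass


@dataclass
class RealReader:
    """Hypothetical reader that owns its buffer, cursor, and precision."""

    buf: bytes
    precision: int = 4  # bytes per real: 4 (single) or 8 (double)
    pos: int = 0

    def unpack_real(self):
        # Choose the struct format from this instance's precision rather
        # than from a module-level global.
        fmt = ">f" if self.precision == 4 else ">d"
        (value,) = struct.unpack_from(fmt, self.buf, self.pos)
        self.pos += self.precision
        return value


single = RealReader(struct.pack(">2f", 1.5, -2.0), precision=4)
double = RealReader(struct.pack(">2d", 1.5, -2.0), precision=8)
```

Because the precision lives on the instance, two files with different precisions can be read in the same process without one overwriting the other's setting.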

@tylerjereddy
Member Author

Ok, this weekend I added position and velocity reading support for all remaining tpx versions that MDAnalysis can read topology from under normal circumstances.

I think this is ready for review now--I've updated the original comment above with some notes for reviewers. The linter will currently fail because I'm using a global construct to track TPR precision.

Suggestions for fixing that and anything else you find are appreciated.

Successfully merging this pull request may close these issues.

Make TPRParser also read coordinates
3 participants