[RcclReplayer] JSON <-> BIN log format conversion tool #2056
Details
Do not mention proprietary info or link to internal work items in this PR.
Work item: LWPCLPAT-620, LWPCLPAT-624
What were the changes?
Added replay_log_converter.py to convert replay logs between BIN and JSON formats, to generate standardized JSON output which can be parsed by standard JSON libraries, and to sanitize JSON logs for easier comparison.
Why were the changes made?
JSON logs are human-readable, so users can open a file and see what happened during a run. However, Replayer currently only works with .BIN input.
Additionally, comparing logs from different test runs is difficult due to variable pointer addresses and timestamps.
This tool converts the generated BIN logs to JSON format (for the same run) so they can be viewed and analyzed, and normalizes logs for comparison.
How was the outcome achieved?
The tool can convert between the two formats using the commands:
Binary to JSON:
python3 replay_log_converter.py <basename> tojson
JSON to Binary:
python3 replay_log_converter.py <basename> tobin
Standardize JSON:
python3 replay_log_converter.py <basename> --standardize
Sanitize JSON:
python3 replay_log_converter.py <basename> --sanitize
Sanitize JSON (No Timestamp):
python3 replay_log_converter.py <basename> --sanitize --no-timestamp
--no-timestamp (or --nts) sets all timestamps to 0.0.
Additional Documentation:
Since the JSON logs generated by recorder do not record GroupStart and GroupEnd calls as of now, converting JSON logs to BIN format and executing RcclReplayer against them may not work as expected.
The --sanitize option normalizes logs for easier comparison by:
* Remapping pointers to readable identifiers (e.g., comm : 0x7fb680328010 → comm : comm_001)
* Normalizing timestamps relative to the first call (e.g., time : 1762969171532.248535 → time : 0.000000)
* Preserving relationships: same pointer values get the same sanitized identifier
* Sanitized fields: communicators (comm), unique IDs (uniqueID), streams (stream), buffer addresses (addr/base/ptr/acc), handles (handle), thread IDs (thread), and process IDs (pid)
Approval Checklist
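For reviewers, the --sanitize behavior described under Additional Documentation can be sketched roughly as follows. This is an illustrative sketch, not the code in replay_log_converter.py: the function name sanitize_records, the list-of-dicts record layout, the per-field numbering, and the sample stream addresses are all assumptions made for the example.

```python
def sanitize_records(records, pointer_fields=("comm", "stream"), zero_timestamps=False):
    """Return sanitized copies of call records (hypothetical list-of-dicts layout)."""
    seen = {}       # (field, original value) -> sanitized identifier
    counters = {}   # field -> number of distinct values seen so far
    first_time = None
    sanitized = []
    for record in records:
        clean = dict(record)
        for field in pointer_fields:
            if field in clean:
                key = (field, clean[field])
                if key not in seen:
                    # First time this pointer appears: assign the next identifier.
                    counters[field] = counters.get(field, 0) + 1
                    seen[key] = f"{field}_{counters[field]:03d}"
                # The same pointer value always maps to the same identifier.
                clean[field] = seen[key]
        if "time" in clean:
            if first_time is None:
                first_time = clean["time"]
            # Timestamps become offsets from the first call, or 0.0 if requested.
            clean["time"] = 0.0 if zero_timestamps else clean["time"] - first_time
        sanitized.append(clean)
    return sanitized


# Sample data: the comm pointer is the one from the description above;
# the stream addresses and timestamps are made up for illustration.
records = [
    {"comm": "0x7fb680328010", "stream": "0x55d0c0", "time": 1762969171532.25},
    {"comm": "0x7fb680328010", "stream": "0x55d1f0", "time": 1762969171534.75},
]
result = sanitize_records(records)
```

Here both records share one comm identifier while the two streams get distinct ones, which is what makes sanitized logs from different runs comparable. Passing zero_timestamps=True mimics the effect described for --no-timestamp.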
Do not approve until these items are satisfied.