WIP: New JSON format and tools #7556
Draft
yfguo wants to merge 11 commits into pmodels:main from yfguo:new-json
Create CSEL constant arrays for the string names of collectives and comm hierarchies. These strings are used when parsing the JSON file and when printing CSEL tree nodes. Separate out the implementation details of the CSEL tree printing function for ease of maintenance.
Consolidate the POSIX coll algorithm enum definition under MPII. The JSON parsing no longer needs separate functions for them.
Consolidate the CH4 coll algorithm enum definition under MPII. The JSON parsing no longer needs separate functions for them.
Consolidate the OFI coll algorithm enum definition under MPII. The JSON parsing no longer needs separate functions for them.
Create internal functions for creating, freeing, and updating CSEL tree nodes. They will be used to manipulate the tree structure in future CSEL optimizations.
MPIR_CVAR_COLLECTIVE_SELECTION_REPORT controls how MPICH shows the collective selection logic during init. It is turned off by default. The user can choose to print the CSEL in tree format or summary format (later commit).
The ANY node is logically needed in CSEL as the catch-all condition. Keeping the actual ANY node in the tree adds an extra pointer dereference with no benefit. This commit squashes all ANY nodes during CSEL tree initialization.
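The squashing step can be pictured with a minimal Python sketch. It assumes a hypothetical node type with `success` and `failure` links; the real CSEL node layout in MPICH is C code and is not shown in this PR description. Because an ANY node always matches, every link that points at one can point at its `success` child instead:

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Node:
    """Hypothetical tree node for illustration; not the actual MPICH CSEL struct."""
    kind: str                          # e.g. "LE", "ANY", "ALGORITHM"
    label: str = ""
    success: Optional["Node"] = None   # followed when the condition matches
    failure: Optional["Node"] = None   # followed when it does not


def squash_any(node: Optional[Node]) -> Optional[Node]:
    """Return the subtree with every ANY node replaced by its success child."""
    # An ANY node always matches, so skip straight to what it leads to.
    while node is not None and node.kind == "ANY":
        node = node.success
    if node is None:
        return None
    node.success = squash_any(node.success)
    node.failure = squash_any(node.failure)
    return node


# avg_msg_size<=0 selects smp; otherwise the ANY node leads to binomial.
tree = Node("LE", "avg_msg_size<=0",
            success=Node("ALGORITHM", "MPIR_Bcast_intra_smp"),
            failure=Node("ANY", "avg_msg_size=any",
                         success=Node("ALGORITHM", "MPIR_Bcast_intra_binomial")))
tree = squash_any(tree)
print(tree.failure.kind, tree.failure.label)
```

After squashing, the failure link of the LE node reaches the algorithm node directly, which is the one-dereference saving the commit message describes.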
The LT conditions are internally converted to less-than-or-equal (LE) conditions. LT conditions are redundant since they can always be represented as LE conditions. Having both complicates the logic and can create holes in the matching ranges. LT is deprecated in this PR and kept only for backward compatibility.
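As a small illustration of the conversion, assuming the compared values are integers (as comm_size and avg_msg_size are in the examples of this PR), `x < t` is equivalent to `x <= t - 1`. The function name below is hypothetical:

```python
def lt_to_le(threshold: int) -> int:
    """Rewrite the integer condition `x < threshold` as `x <= threshold - 1`."""
    return threshold - 1


# "comm_size<8" in the current JSON corresponds to the "7" threshold in the new JSON.
le_threshold = lt_to_le(8)
for comm_size in (6, 7, 8):
    print(comm_size, comm_size < 8, comm_size <= le_threshold)
```

Both comparisons agree for every integer, so an LT condition never needs its own node type.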
Merge multiple LE condition nodes into a single node for matching. LE conditions are stored in a sorted range-set, which enables binary search when matching a value against these conditions.
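A minimal sketch of such a merged node, using Python's `bisect` module in place of the C implementation. The class name and layout are assumptions for illustration only. With sorted thresholds, the first threshold that is greater than or equal to the value is the first LE condition that holds:

```python
from bisect import bisect_left


class RangeSet:
    """Hypothetical merged LE node: sorted thresholds plus one catch-all branch."""

    def __init__(self, branches, default):
        # branches maps an LE threshold to its result; default is used
        # when the value exceeds every threshold (the catch-all branch).
        self.thresholds = sorted(branches)
        self.results = [branches[t] for t in self.thresholds]
        self.default = default

    def match(self, value):
        # Index of the first threshold >= value, found by binary search.
        i = bisect_left(self.thresholds, value)
        return self.results[i] if i < len(self.thresholds) else self.default


# Thresholds taken from the avg_msg_size conditions of the Bcast example.
bcast = RangeSet(
    {0: "MPIR_Bcast_intra_smp",
     12288: "MPIR_Bcast_intra_binomial",
     524288: "MPIR_Bcast_intra_scatter_recursive_doubling_allgather"},
    default="MPIR_Bcast_intra_scatter_ring_allgather",
)
print(bcast.match(4096))
```

A lookup costs O(log n) comparisons for n thresholds, instead of walking a chain of n separate LE nodes.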
Use versioned JSON with the new node structure for generic.json. Other files are not converted, as they will be removed in the future.
Current JSON for Bcast:
{
  "collective=bcast":
  {
    "comm_type=intra":
    {
      "comm_size<8":
      {
        "comm_hierarchy=parent":
        {
          "avg_msg_size<=0":
          {
            "algorithm=MPIR_Bcast_intra_smp":{}
          },
          "avg_msg_size=any":
          {
            "algorithm=MPIR_Bcast_intra_binomial":{}
          }
        },
        "comm_hierarchy=any":
        {
          "avg_msg_size=any":
          {
            "algorithm=MPIR_Bcast_intra_binomial":{}
          }
        }
      },
      "comm_size=pow2":
      {
        "comm_hierarchy=parent":
        {
          "avg_msg_size<=0":
          {
            "algorithm=MPIR_Bcast_intra_smp":{}
          },
          "avg_msg_size<=12288":
          {
            "algorithm=MPIR_Bcast_intra_binomial":{}
          },
          "avg_msg_size<=524288":
          {
            "algorithm=MPIR_Bcast_intra_scatter_recursive_doubling_allgather":{}
          },
          "avg_msg_size=any":
          {
            "algorithm=MPIR_Bcast_intra_scatter_ring_allgather":{}
          }
        },
        "comm_hierarchy=any":
        {
          "avg_msg_size<=12288":
          {
            "algorithm=MPIR_Bcast_intra_binomial":{}
          },
          "avg_msg_size<=524288":
          {
            "algorithm=MPIR_Bcast_intra_scatter_recursive_doubling_allgather":{}
          },
          "avg_msg_size=any":
          {
            "algorithm=MPIR_Bcast_intra_scatter_ring_allgather":{}
          }
        }
      },
      "comm_size=any":
      {
        "comm_hierarchy=parent":
        {
          "avg_msg_size<=0":
          {
            "algorithm=MPIR_Bcast_intra_smp":{}
          },
          "avg_msg_size<=12288":
          {
            "algorithm=MPIR_Bcast_intra_binomial":{}
          },
          "avg_msg_size=any":
          {
            "algorithm=MPIR_Bcast_intra_scatter_ring_allgather":{}
          }
        },
        "comm_hierarchy=any":
        {
          "avg_msg_size<=12288":
          {
            "algorithm=MPIR_Bcast_intra_binomial":{}
          },
          "avg_msg_size=any":
          {
            "algorithm=MPIR_Bcast_intra_scatter_ring_allgather":{}
          }
        }
      }
    },
    "comm_type=inter":
    {
      "algorithm=MPIR_Bcast_inter_remote_send_local_bcast":{}
    }
  }
}
New JSON for Bcast:
{
  "version": "5.0",
  "bcast": {
    "intra-comm": {
      "true": {
        "comm_size": {
          "7": {
            "parent-comm": {
              "true": {
                "avg_msg_size": {
                  "0": {
                    "algorithm": {
                      "name": "MPIR_Bcast_intra_smp"
                    }
                  },
                  "max": {
                    "algorithm": {
                      "name": "MPIR_Bcast_intra_binomial"
                    }
                  }
                }
              },
              "false": {
                "avg_msg_size": {
                  "max": {
                    "algorithm": {
                      "name": "MPIR_Bcast_intra_binomial"
                    }
                  }
                }
              }
            }
          },
          "max": {
            "pof2": {
              "true": {
                "parent-comm": {
                  "true": {
                    "avg_msg_size": {
                      "0": {
                        "algorithm": {
                          "name": "MPIR_Bcast_intra_smp"
                        }
                      },
                      "12288": {
                        "algorithm": {
                          "name": "MPIR_Bcast_intra_binomial"
                        }
                      },
                      "524288": {
                        "algorithm": {
                          "name": "MPIR_Bcast_intra_scatter_recursive_doubling_allgather"
                        }
                      },
                      "max": {
                        "algorithm": {
                          "name": "MPIR_Bcast_intra_scatter_ring_allgather"
                        }
                      }
                    }
                  },
                  "false": {
                    "avg_msg_size": {
                      "12288": {
                        "algorithm": {
                          "name": "MPIR_Bcast_intra_binomial"
                        }
                      },
                      "524288": {
                        "algorithm": {
                          "name": "MPIR_Bcast_intra_scatter_recursive_doubling_allgather"
                        }
                      },
                      "max": {
                        "algorithm": {
                          "name": "MPIR_Bcast_intra_scatter_ring_allgather"
                        }
                      }
                    }
                  }
                }
              },
              "false": {
                "parent-comm": {
                  "true": {
                    "avg_msg_size": {
                      "0": {
                        "algorithm": {
                          "name": "MPIR_Bcast_intra_smp"
                        }
                      },
                      "12288": {
                        "algorithm": {
                          "name": "MPIR_Bcast_intra_binomial"
                        }
                      },
                      "max": {
                        "algorithm": {
                          "name": "MPIR_Bcast_intra_scatter_ring_allgather"
                        }
                      }
                    }
                  },
                  "false": {
                    "avg_msg_size": {
                      "12288": {
                        "algorithm": {
                          "name": "MPIR_Bcast_intra_binomial"
                        }
                      },
                      "max": {
                        "algorithm": {
                          "name": "MPIR_Bcast_intra_scatter_ring_allgather"
                        }
                      }
                    }
                  }
                }
              }
            }
          }
        }
      },
      "false": {
        "algorithm": {
          "name": "MPIR_Bcast_inter_remote_send_local_bcast"
        }
      }
    }
  }
}
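To make the new layout concrete, here is a minimal Python sketch of how such a tree could be walked to pick an algorithm. It is not the MPICH implementation: the `select` function, the `attrs` dictionary, and the assumption that each node holds exactly one condition key are all illustrative. The data is the comm_size "7" subtree of the Bcast example:

```python
import json

# The "parent-comm" subtree under comm_size "7" from the new Bcast JSON.
SUBTREE = json.loads("""
{
  "parent-comm": {
    "true": {
      "avg_msg_size": {
        "0":   {"algorithm": {"name": "MPIR_Bcast_intra_smp"}},
        "max": {"algorithm": {"name": "MPIR_Bcast_intra_binomial"}}
      }
    },
    "false": {
      "avg_msg_size": {
        "max": {"algorithm": {"name": "MPIR_Bcast_intra_binomial"}}
      }
    }
  }
}
""")


def select(tree, attrs):
    """Walk a new-format subtree and return the selected algorithm name."""
    node = tree
    while "algorithm" not in node:
        [(cond, branches)] = node.items()   # assumed: one condition per node
        value = attrs[cond]
        if "true" in branches or "false" in branches:
            # Binary condition: follow the branch named after the truth value.
            node = branches["true" if value else "false"]
        else:
            # Range condition: first threshold with value <= threshold, else "max".
            thresholds = sorted(int(k) for k in branches if k != "max")
            key = next((str(t) for t in thresholds if value <= t), "max")
            node = branches[key]
    return node["algorithm"]["name"]


print(select(SUBTREE, {"parent-comm": True, "avg_msg_size": 0}))
print(select(SUBTREE, {"parent-comm": True, "avg_msg_size": 64}))
```

The sketch scans thresholds linearly for brevity; the merged LE nodes described in the commits above would use binary search instead.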
Pull Request Description
JSON format
This PR creates a new JSON format as part of the collective selection refactoring. The focus of the new format is correctness, completeness, and validation.
There are a few rules that the JSON file must meet (and that will be validated):
The top level object in JSON will be:
The version defines the compatibility of the JSON file. Version 5.0 means the file is compatible with MPICH 5.0 and later.
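A minimal sketch of a top-level check, assuming the layout seen in the Bcast example of this PR (a "version" string next to the per-collective trees) and assuming purely numeric dotted versions. The function names are hypothetical, and how MPICH actually compares versions is not stated here:

```python
def parse_version(text):
    """Turn a dotted version such as "5.0" into a comparable tuple (5, 0)."""
    return tuple(int(part) for part in text.split("."))


def check_top_level(doc, mpich_version):
    """Return a list of problems found in the top-level object."""
    version = doc.get("version")
    if not isinstance(version, str):
        return ['top level: missing "version" string']
    errors = []
    # A file at version X.Y is assumed usable by MPICH X.Y and later.
    if parse_version(mpich_version) < parse_version(version):
        errors.append(
            f"file version {version} is newer than MPICH {mpich_version}")
    return errors


doc = {"version": "5.0", "bcast": {}}
print(check_top_level(doc, "5.0"))
print(check_top_level(doc, "4.3"))
```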
The binary conditions in JSON will be like this:
This covers all the binary checks, such as pof2?, is_commutative?, is_sendbuf_inplace?, is_intracomm?, and is_builtin_op?. Neither branch can be empty.
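The rule for binary conditions could be validated roughly as follows. This is a sketch under the assumption, taken from the Bcast example in this PR, that a binary condition maps the keys "true" and "false" to subtrees; `check_binary` is a hypothetical name, not part of the WIP tool:

```python
def check_binary(name, branches):
    """Return a list of problems for one binary condition node."""
    errors = []
    for key in ("true", "false"):
        if key not in branches:
            errors.append(f"{name}: missing '{key}' branch")
        elif not branches[key]:
            errors.append(f"{name}: empty '{key}' branch")
    # Anything other than the two truth values is unexpected here.
    for key in sorted(set(branches) - {"true", "false"}):
        errors.append(f"{name}: unexpected branch '{key}'")
    return errors


good = {"true": {"algorithm": {"name": "MPIR_Bcast_intra_smp"}},
        "false": {"algorithm": {"name": "MPIR_Bcast_intra_binomial"}}}
bad = {"true": {"algorithm": {"name": "MPIR_Bcast_intra_smp"}}, "false": {}}
print(check_binary("pof2", good))
print(check_binary("pof2", bad))
```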
The range conditions in JSON will be like this:
This covers checks for comm_size, avg_msg_size, etc. All thresholds are checked with a less-than-or-equal operation. The "max" branch is required.
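A matching sketch for range conditions, assuming (as in the Bcast example) that each branch key is either an integer threshold written as a string or the literal "max". The name `check_range` is hypothetical:

```python
def check_range(name, branches):
    """Return a list of problems for one range condition node."""
    errors = []
    if "max" not in branches:
        errors.append(f"{name}: missing required 'max' branch")
    for key in branches:
        if key == "max":
            continue
        try:
            int(key)   # thresholds are assumed to be integer strings
        except ValueError:
            errors.append(f"{name}: threshold '{key}' is not an integer")
    return errors


good = {"0": {}, "12288": {}, "max": {}}
bad = {"12288": {}, "large": {}}
print(check_range("avg_msg_size", good))
print(check_range("avg_msg_size", bad))
```

Requiring "max" guarantees that every value falls into some branch, so the LE thresholds cannot leave a hole at the top of the range.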
The algorithm in JSON will be like this:
It contains the name and the required parameters of an algorithm. The object for a composition will be in a similar format.
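A sketch of how an algorithm object might be checked. Only the "name" key appears in the examples of this PR; placing parameters as sibling keys of "name", the `REQUIRED_PARAMS` table, and the entry `hypothetical_tree_algo` with parameter `k` are all assumptions made up for illustration:

```python
# Hypothetical table of required parameters per algorithm name.
REQUIRED_PARAMS = {
    "MPIR_Bcast_intra_binomial": (),   # shown in this PR without parameters
    "hypothetical_tree_algo": ("k",),  # made-up entry, not a real MPICH name
}


def check_algorithm(node):
    """Return a list of problems for one algorithm object."""
    name = node.get("name")
    if not isinstance(name, str) or not name:
        return ["algorithm: missing 'name'"]
    errors = []
    for param in REQUIRED_PARAMS.get(name, ()):
        if param not in node:
            errors.append(f"{name}: missing parameter '{param}'")
    return errors


print(check_algorithm({"name": "MPIR_Bcast_intra_binomial"}))
print(check_algorithm({"name": "hypothetical_tree_algo"}))
```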
The tree can have binary condition nodes and range condition nodes organized in any order.
JSON tools
[WIP] There is a new tool for manipulating the JSON file, which helps ensure that the file meets all the requirements.
[WIP] There is a script for converting existing JSON files to the new format.
Author Checklist
Particularly focus on why, not what. Reference background, issues, test failures, xfail entries, etc.
Commits are self-contained and do not do two things at once.
Commit message is of the form:
module: short description
Commit message explains what's in the commit.
Whitespace checker. Warnings test. Additional tests via comments.
For non-Argonne authors, check contribution agreement.
If necessary, request an explicit comment from your company's PR approval manager.