-
Notifications
You must be signed in to change notification settings - Fork 424
Add Parsing Code For Scatter-Gather Patterns #3212
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
52caa36
to
827e72b
Compare
To make sure I didn't accidentally break anything else, checked the generated RRGraphs for Titan and Titan-S10 and the diffs were identical other than the very first line with the tool_version attribute. Here's the diff for Titan for example:
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks! Various comments embedded.
--------------------- | ||
|
||
The content under the ``<scatter_gather_list>`` tag consists of one or more ``<sg_pattern>`` tags that are used to specify a scatter-gather pattern. | ||
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Give high-level context on why we need this (specify multi-level switch patterns, rather than the direct wire-to-wire switch patterns of conventional switch blocks). These additional switches, wires and/or muxes will be added to the architecture, augmenting wires created using segment specifications and swiches created using switch box specifications.
The number of any additional wires or muxes created by scatter-gather specifications will not vary with routing channel width.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
High-level spec needs a picture showing the gather pattern, the scatter-gather node, and the scatter pattern. Link that to the high-level description.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Added a figure giving a visual overview of scatter-gather patterns.
const pugiutil::loc_data& loc_data) { | ||
std::vector<t_sg_location> sg_location_list; | ||
for (pugi::xml_node node : sg_pattern_tag.children()) { | ||
if (strcmp(node.name(), "sg_location") != 0) continue; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Shouldn't this be an error message instead of a skip?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Or an assert if schema checking would have already caught this?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This loop is actually going through all children of the <sg_pattern> tag that might be gather, scatter or sg_link_list.
<clock buffer_size="auto" C_wire="2.5e-10"/> | ||
</clocks> | ||
|
||
<scatter_gather_list> |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Comment what this is doing.
This is a 2D architecture, so this 3D pattern should eventually be moved to a 3D arch (or maybe moved now?)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The purpose of this architecture file is simply testing the parsing code to make sure it doesn't crash and is not intended for usage. I think this makes sense since right now since there's no other functionality to test and parsing is all we can do. After the RR-Graph changes are added I will make a proper 3D architecture instead of this.
Added comments to the file.
e06071b
to
6e659aa
Compare
char_prop = get_attribute(node, "to_switchpoint", loc_data, from_to_required).value(); | ||
if (!can_skip_from_to) { | ||
parse_comma_separated_wire_points(char_prop, wc.to_switchpoint_set); | ||
} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
More error checking is needed. For example, for scatter-gather patterns, a combination of from
and to
in <wireconn
> is illegal. I think the caller of this function should specify what fields are allowed in <wireconn>
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This checking is done in the "process_sg_tag" function. I don't think it makes sense for the wireconn tag to legalize itself based on what's around it, the parent tag should be the one doing the checking.
// Process scatter-gather patterns (optional) | ||
Next = get_single_child(architecture, "scatter_gather_list", loc_data, pugiutil::OPTIONAL); | ||
if (Next) { | ||
process_sg_tag(Next, arch, loc_data, arch->switches); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Why arch->switches
is needed for parsing sg tag?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Parsing wireconn requires it for name of the from_type etc. Technically process_sg_tag could get arch->switches from arch itself (Since arch is also passed to it) but doing it that way means that the user of process_sg_tag must take care to call the function after switches are parsed. By adding it as an argument the compiler enforces that.
* @param switches Contains all the architecture switches. Usually same as arch->switches. | ||
*/ | ||
void process_sg_tag(pugi::xml_node sg_list_tag, | ||
t_arch* arch, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Don't pass arch by pointer. We shouldn't follow C-style coding used in libarchfpga.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
arch is a pointer outside of this function too. Changing it means changing a fair amount of libarchfpga, VPR and probably parmys too and I don't think it's in the scope of this PR. I could dereference arch and pass it as a reference but I'm not sure if it makes a lot of sense.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It is looking good so far. I just had some comments on documentation (a figure for this would be really nice) and I had some concerns on error checking in the parser.
* ``CORNER`` – only at the corner switch blocks (both x and y-directed channels terminate here) | ||
* ``FRINGE`` – same as PERIMETER but excludes corners | ||
* ``CORE`` – everywhere but the perimeter | ||
Sets the location on the FPGA where the connections described by this scatter-gather pattern be instantiated. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I am a little confused how this corresponds with "location". For me, location would imply an (x, y) tile position of some sort, not general groups.
Is there any way that we can allow this specification to allow the user to only allow a particular SG pattern to exist at a specified tile coordinate? Perhaps it can be similar to the fixed-layout specification of tiles. It seems a little rigid to assign locations this way.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This uses the same syntax as the custom switch blocks. There is an XY_SPECIFIED option which offers more control but since it's not documented (yet) I didn't want to add it here.
@@ -2634,6 +2638,104 @@ The full format is documented below. | |||
The 'to' set is all L4 switchpoint 0's. | |||
Note that since different switchpoints are selected from different segment types it is not possible to specify this without using ``<from>`` sub-tags. | |||
|
|||
.. _scatter_gathern_patterns: | |||
|
|||
Scatter-Gather Patterns |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I recall during the weekly VTR meetings, we discussed putting this in the layout tag. I believe that this tag should be a part of the layout in some way, since the location of these SG patterns do depend on the layout of the device (see my comment on sg_location). This should be updated in the text somewhere in here.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is the way custom switchblocks work. It's also a bit easier to implement since it leaves the layout parsing code untouched.
archfpga_throw(loc_data.filename_c_str(), loc_data.line(node), "Unknown side specified: %s\n", sides_string.c_str()); | ||
} | ||
for (char side_char : sides_string) { | ||
wireconn.sides.push_back(CHAR_SIDE_MAP.at(side_char)); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Isnt this a little dangerous? What if the user specifies "rrrrrrrrrrrrrrrr... (repeat until out of RAM)"
Or even more realisitically someone wrote "rtlRTL" where they accidentally did caps as well as lower case. This would not catch this case and may cause duplicates to come into the code. I recommend deduping somehow here.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Good catch, I changed the sides variable to a set instead of a vector. I don't think the memory is a problem (what if you pass a 12 terabyte .c file to gcc? Is it a security issue if it crashes?) but It did allow duplicate sides before.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I do not think the memory is a problem either, I was just making a joke lol. The duplicates is a problem though since it may be possible that people made a mistake.
|
||
// get to type | ||
char_prop = get_attribute(node, "to_type", loc_data).value(); | ||
parse_comma_separated_wire_types(char_prop, wc.to_switchpoint_set); | ||
char_prop = get_attribute(node, "to_type", loc_data, from_to_required).value(); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
What if someone did not specify either a from_type or a to_type? Is that acceptable? If not, we should throw an error here during parsing.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
So the way I did this is the can_skip_from_or_to argument. By default it is set to false and from/to_type are required attributes, this is used by custom switch blocks and therefore the behavior for those aren't changed at all. For scatter-gather specifications, the error checking is done outside of this function and in process_sg_tag.
Added checking for completely empty fields in process_sg_tag.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Some final comments. I was only able to get half-way through the PR last time.
*/ | ||
struct t_sg_location { | ||
e_sb_location type; | ||
int num; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
You may want to add a comment explaining these variables. "num" may be a bit vague in this context.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Added doxygen to the struct.
@@ -24,13 +30,30 @@ enum e_side : unsigned char { | |||
NUM_3D_SIDES = 6, | |||
}; | |||
|
|||
constexpr std::array<e_side, NUM_2D_SIDES> TOTAL_2D_SIDES = {{TOP, RIGHT, BOTTOM, LEFT}}; //Set of all side orientations | |||
constexpr std::array<const char*, NUM_2D_SIDES> TOTAL_2D_SIDE_STRINGS = {{"TOP", "RIGHT", "BOTTOM", "LEFT"}}; //String versions of side orientations | |||
const std::unordered_map<char, e_side> CHAR_SIDE_MAP = { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Although creating this unordered map is very clever to do this lookup, since this is a const instead of a constexpr this can actually be quite slow to lookup into since the compiler will likely turn this into a global variable in the memory.
I recommend just creating a simple function here which converts a char into the correct "e_side" using a switch statement. This function is likely to be inlined, so it should be performant, and its less prone to errors, as people may index this global map in the wrong way.
Something like:
inline e_side char_to_e_side(char c) {
switch (c) {
case 'T':
case 't':
return e_side::TOP;
case 'R':
case 'r':
return e_side::RIGHT;
...
}
}
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I actually made a function for this during development but Soheil convinced me to use a map here. I don't think this code matters that much to try and hyper optimize it and we're already doing this approach in a lot of other places in VTR (especially since switch cases can only be used because the input type is char and not string). The proper way to do this would be using compile-time reflection but that doesn't exist in C++ yet.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We don't do string lookups in performance-critical code, so I don't think it matters.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
To me its not the performance I care about, it is the unnecessary global variable. Generally its good to avoid global variables. Just a small nitpick since doing this too much can cause bloat to the program for no reason.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Also, the lack of safety from a user giving an invalid character can be a bit of a problem. When the input is a custom enum its not as big of an issue, but the set of all possible chars is quite large.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
That is worrisome. Linker errors where you get two global variables with the same name are very nasty. I would get rid of this then, moving the definition of the global to a single .cpp file, or otherwise changing the implementation to avoid any potential issue.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I found a stack overflow discussion on this topic:
https://stackoverflow.com/questions/12042549/define-constant-variables-in-c-header
It looks like there is some C++ standard magic going on that allows this. According to the post "const variables in a namespace scope has internal linkage". This basically leads to each include getting their own private copy of this variable in this case. This is just confusing in my opinion (I did not even know this fact) and it is a bit wasteful.
I think if we do not create a helper function, we should at least declare this variable as inline
. This explicitly ensures static linkage without relying on the compiler to infer it. Inline also has a fun property where (I think) it will only allocate one copy in the executable for all usages; but this I am not sure of.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
"const" is a compiler directive which the compiler does not need to follow.
If a compiler ignores the const keyword it would not be compliant to the C++ standard. A compliant compiler can't just ignore things that aren't implementation defined or undefined and const is neither of these.
I think if we do not create a helper function, we should at least declare this variable as inline. This explicitly ensures static linkage without relying on the compiler to infer it. Inline also has a fun property where (I think) it will only allocate one copy in the executable for all usages; but this I am not sure of.
Agreed. I think using inline gives us the best of all worlds here. I'll add that to the maps' definition.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Just wanted to close this thread on my comment on "const". When I say that the compiler does not have to follow it, what I mean is the following. Lets say we made the following code:
const int x = 1;
// NOTE: x = x + 1 would yield a semantic error
int y = x + 1;
return y;
Since x is a const, the user cannot assign another value to it; however, the compiler is allowed to optimize this code into something functionally equivalent to the following:
int x = 1;
x += 1;
return x;
Where x is some register. Notice that the compiler is not treating x like a const anymore. In compilers, we actually throw out the "const" keyword after semantic checking (essentially, sometimes its kept around). The "constness" of the variable is guaranteed by the functional correctness of the compilation process. Hence, this is what I mean by "const" being a compiler directive that the compiler does not need to follow.
When you defined this map as a const global variable, you are only ensuring that the user never modifies it; you are not ensuring that the compiler will not, so the compiler may have to create multiple copies of this variable to maintain functional correctness (I think). This is one of those annoying compiler things that may be implementation dependent. We should avoid this pattern the best that we can.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Made the map variables inline. Inline variables fix the duplicate copies issue (which, to emphasize, was strictly a "the binary might be 2 kilobytes larger at worst" issue and nothing functionally broken or slow) and was added in C++17 to help with this specific usecase.
The reason I so strongly advocate for using maps instead of functions is:
- The code for any function we would write for this would be a bad implementation of a map anyway (O(n) linear search instead of O(1) lookup. not that performance matters much here.)
- Data is easier to understand than code, and the map implementation is pretty much purely data (from a programmer's perspective)
- Most importantly, the map is defined right next to the enum definition. Anyone changing the enum in the future would see the map right underneath the enum and would update it. In case they don't, it's much easier to catch during review. Contrast this to a function definition which is much more likely to be forgotten to be updated.
@@ -40,6 +63,13 @@ enum class e_sb_location { | |||
E_XY_SPECIFIED | |||
}; | |||
|
|||
const std::unordered_map<std::string, e_sb_location> SB_LOCATION_STRING_MAP = {{"EVERYWHERE", e_sb_location::E_EVERYWHERE}, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Same comment here as e_side.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yeah since this is not char this would be a mess of a function with a thousand if/elses and comparing strings vs hashing strings isn't very different performance wise.
<clock buffer_size="auto" C_wire="2.5e-10"/> | ||
</clocks> | ||
|
||
<!-- Specify a scatter-gather pattern for 3D connections. However, do that this is not a 3D architecture and this file is only used to test the parsing logic for scatter-gather patterns. --> |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I am confused what this comment trying to say?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Do note*
What I mean is that the specification is nonsense and the purpose is just testing the parsing code (in the strong tests). I'll add a proper 3D architecture when the RR-Graph changes are implemented.
</sg_link_list> | ||
|
||
<!-- Instantiate 10 'L_UP' sg_links per switchblock location everywhere on the device --> | ||
<sg_location type="EVERYWHERE" num="10" sg_link_name="L_UP"/> |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I still stand by that this would be better suited in the layout somehow and not in the "scatter_gather_list".
vtr_flow/tasks/regression_tests/vtr_reg_strong/strong_scatter_gather/config/config.txt
Outdated
Show resolved
Hide resolved
b8c265e
to
44d1078
Compare
Requested changes are mostly done, I guess the only thing I didn't do is make a tutorial/example architecture. Should we do it for this PR or when the RR-Graph changes are done? I personally would like to defer it for the RR-Graph PR but if you think it's necessary I'll do it now. |
We can defer the tutorial until after the implementation is done and you have a sample architecture or two, which will likely form the basis for the tutorial. |
Using inline variables make sure the data in the map is not duplicated when including the header file in multiple headers. This C++17 feature was explicitly added to facilitate this usecase (defining const variables in header files.)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM
Adds xml parsing code for scatter-gather patterns, with a sample architecture (That doesn't quite make sense) and test to make sure the code actually runs.
Note that scatter-gather patterns are not implemented in VPR and adding the tag does nothing other than parsing the information in the architecture file.
Information about what scatter-gather patterns and what the tag means is available in the documentation for this PR.