-
Notifications
You must be signed in to change notification settings - Fork 419
Move call location of ClockRRGraphBuilder and use alloc_and_load_edges. #1081
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
2c51ae7
to
8f90f26
Compare
CI is now green, @mustafabbas and @kmurray please review. |
If I understand correctly the main structural changes in this code are as follows:
Assuming my understanding of the above is correct this looks like a good change to me. @litghost What's the best approach for handling this? Since this is a less used code path I'm pretty happy to merge this independently of #1096 (which would also simplifies #1096). Also this PR seems to have a merge conflict which would also need to be resolved. |
I'll rebase on master, and we should merge this. |
Yes |
By moving ClockRRGraphBuilder earlier in the rr graph flow, several parts of ClockRRGraphBuilder::create_and_append_clock_rr_graph can be avoided, as they were duplicating work that the original build_rr_graph flow was already doing (init_fan, mapping arch switch to rr switch, partition_edges). This new code should also fully preallocate the rr_node array, though this is not required by the code. Signed-off-by: Keith Rothman <[email protected]>
Signed-off-by: Keith Rothman <[email protected]>
Signed-off-by: Keith Rothman <[email protected]>
Signed-off-by: Keith Rothman <[email protected]>
Signed-off-by: Keith Rothman <[email protected]>
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks for the changes, very minor comments to look at while you rebase
vpr/src/route/check_rr_graph.cpp
Outdated
if (device_ctx.virtual_clock_network_root_idx == inode) { | ||
return; | ||
} | ||
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think we should move this outside the function. Perhaps before the call to check_rr_node
in check_rr_graph
with the other ignored nets here:
/* Ignore any uninitialized rr_graph nodes */ |
The reason is check_rr_node
is called twice in check_rr_graph
and in check_route
. When we call this function in check_route
we really don't want to see the virtual clock sink, as we decided its best to remove it from the route tree after we utilize it.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Done
@@ -39,18 +39,24 @@ void RoutingToClockConnection::set_fc_val(float fc_val) { | |||
* RoutingToClockConnection (member functions) | |||
*/ | |||
|
|||
void RoutingToClockConnection::create_switches(const ClockRRGraphBuilder& clock_graph) { | |||
size_t RoutingToClockConnection::estimate_additional_nodes() { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Add a comment to note that routing to clock connections create a virtual sink which is why we are returning one node to be allocated here
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Done
16d5bae
to
7f027cf
Compare
Signed-off-by: Keith Rothman <[email protected]>
Hmm, looks like a QoR change. That should probably be investigated to figure out the cause (in particular since its on the clock regression test). That test is really an implicit check of the clock modelling functionality, so a QoR change could indicate that it broke somehow. @mustafabbas How would you recommend verifying that that test is working correctly (i.e. generated clock structure is correct, and used correctly)? |
I've noticed it too. Weirdly enough prior to the rebase onto master, the QoR check passed. I'm investigating now. It is possible (but unclear how) that something in this PR has resulted in the QoR change I was describing in #1084 (comment) . I haven't been able to track it down yet. Data I have so far:
Investigation continues. |
One concrete that is worth pointing out is that the previous code failed check_rr_graph because of missing non-configurable nodes in the reverse direction. The code here: https://github.com/verilog-to-routing/vtr-verilog-to-routing/pull/1081/files#diff-502e2913e417644b10bae07da14b9dabR191-R195 fixes this bug. So it is possible that the QoR change is simply a result of fixing a bug. I have tried backing out just that change, and the QoR change persists, so I'm still investigating. |
I used to trace the clock structure in the dumped rr_graph.xml. This is a qor in final setup time though. I am not sure off the top of my head where to look but I can replicate and take a look at it tonight if the bug still lingers. |
I've begun to isolate the changes in a separate branch. So far no bugs have been found :/ |
I believe I've isolated the change in QoR to fan / switch behavior, and I believe that the previous QoR was simply incorrect based on incorrect arch -> rr switch behavior. @mustafabbas if you can check that the code is still working as expect, I believe this PR should be merged, along with an update to the QoR for the dedicated networks. @kmurray How do you want updates to the strong QoR delivered? |
Specifically there were two bugs in the previous code that are now fixed:
Both of the bugs above are avoided by virtue of doing 1 arch -> rr pass using the old rr graph generator logic. |
Agreed accounting for fan in information should make a difference in QoR. I also tested it locally and the written rr_graph looks correct. Thanks for the fixes. I believe this pull request is good to go after updating the golden results |
QoR change was do to fixing bugs related to switch fan-in delay fixes related to the clock network (see discussion in verilog-to-routing#1081).
I've re-generated them and pushed the updates. Once CI passes this should be good to merge. |
Awesome, thanks |
Description
By moving ClockRRGraphBuilder earlier in the rr graph flow, several
parts of ClockRRGraphBuilder::create_and_append_clock_rr_graph can be
avoided, as they were duplicating work that the original build_rr_graph
flow was already doing (init_fan, mapping arch switch to rr switch,
partition_edges).
This new code should also fully preallocate the rr_node array, though
this is not required by the code.
Fixes #1080
Related Issue
#1079
#1080
Motivation and Context
When researching #1079, it was discovered that there was basically 3 ways edges were used (
alloc_and_load_edges
, rr graph load, andClockRRGraphBuilder
). However it is clear thatClockRRGraphBuilder::create_and_append_clock_rr_graph
should just be usingalloc_and_load_edges
rather than a bespoke allocation methodology.In addition,
ClockRRGraphBuilder
was duplicating work (two calls toinit_fan_out
for example), and it was possible forClockRRGraphBuilder
to emit a switch that still had an arch_switch_idx vs an rr_switch_idx.This PR fixes all of these issues.
How Has This Been Tested?
Types of changes
Checklist: