feat[next-dace]: Use SDFG library node for lowering of broadcast and reduce #2386
edopao wants to merge 76 commits into GridTools:main from
Conversation
Force-pushed: f94a037 to ef9ef92
philip-paul-mueller left a comment
There are some refinements needed.
```python
@dace_library.node
class Fill(dace_nodes.LibraryNode):
```
I would add some more semantics, i.e. an input connector that collects the value that should be broadcast, and an output connector for the output.
I am also wondering if it would make sense to have two different library nodes:
one where the value that is broadcast is a literal, like 0.0, and one, probably the current one, where the value is read from another data descriptor (this might be hard to integrate into the lowering).
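The split being proposed could look roughly like the following pure-Python sketch. All class names, attributes, and connector names here are invented for illustration; the real implementation would subclass `dace_nodes.LibraryNode` and use DaCe properties and connectors.

```python
from dataclasses import dataclass


# Hypothetical sketch of the two proposed variants; not the actual DaCe API.
@dataclass
class FillLiteral:
    """Broadcasts a compile-time literal (e.g. 0.0); needs no input connector."""
    value: float
    out_connector: str = "__out"


@dataclass
class FillFromData:
    """Broadcasts a value read from another data descriptor at runtime."""
    in_connector: str = "__in"
    out_connector: str = "__out"


def make_fill_node(value):
    # The lowering would pick the variant based on whether the broadcast
    # value is a literal in the IR or a reference to a data container.
    if isinstance(value, (int, float)):
        return FillLiteral(value=float(value))
    return FillFromData()
```

The literal variant keeps the value as a node property, so no edge into the node is needed; the data variant models the read explicitly through an input connector.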
Force-pushed: bc04cfc to 25abd36
@edopao
cscs-ci run default
cscs-ci run default
cscs-ci run default
No plan for now to integrate this feature.
…Did I even run the tests?
…o_sdfg_primitives.py Co-authored-by: Edoardo Paone <edoardo16@gmail.com>
edopao left a comment
Very good, just some minor comments.
```python
library as dace_library,
nodes as dace_nodes,
properties as dace_properties,
subsets as dace_sbs,
```
In the transformation module, we use the `dace_sbs` alias; in the lowering module we use `dace_subsets`. It's OK to use `dace_sbs` in this module, but let's try to keep it consistent.
```python
for i in range(len(broadcast_in_dim)):
    assert output.shape[broadcast_in_dim[i]] == value_to_broadcast.shape[i]
```
Suggested change (add after the code block):

> In other words, the result array shape has the same size as the broadcast domain.
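The shape rule in that docstring matches the usual `broadcast_in_dim` semantics: input dimension `i` must have the same size as output dimension `broadcast_in_dim[i]`. A minimal NumPy sketch of that behavior (the helper name and signature are illustrative, not the library node's API):

```python
import numpy as np


def broadcast_in_dim(value, out_shape, broadcast_dims):
    # Every input dimension must map onto an output dimension of the same size.
    for i, d in enumerate(broadcast_dims):
        assert out_shape[d] == value.shape[i]
    # Place the input dimensions at their target positions, size 1 elsewhere,
    # then let NumPy broadcasting replicate along the remaining dimensions.
    interim = [1] * len(out_shape)
    for i, d in enumerate(broadcast_dims):
        interim[d] = value.shape[i]
    return np.broadcast_to(value.reshape(interim), out_shape)


vec = np.array([1.0, 2.0, 3.0])
out = broadcast_in_dim(vec, (2, 3), (1,))  # each row is a copy of `vec`
```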
```
Args:
    broadcast_in_dim: How to broadcast.
```
Suggested change:
```diff
- broadcast_in_dim: How to broadcast.
+ broadcast_in_dim: How to broadcast, see the class documentation.
```
```
Args:
    broadcast_in_dim: How to broadcast.
    params: The parameters that should be used for the expansion. If given one
```
Suggested change:
```diff
- params: The parameters that should be used for the expansion. If given one
+ params: The parameters that should be used for the expansion. If given, one
```
```
Todo:
    - While for the output it is probably okay to always require an adjacent
      AccessNode for the input it might be possible to be on the other side
```
Suggested change:
```diff
- AccessNode for the input it might be possible to be on the other side
+ AccessNode, the input nodes might be outside a map scope.
```
However, I don't understand how this could happen.
```python
# A fundamental requirement is that `bcast_result` is only generated by us.
# ADR-18 guarantees us this if it is transient and has a single producer,
# `bcast_node`. However, since we will remove `bcast_result`, we have to
# make sure that it is not used every where else.
```
Suggested change:
```diff
- # make sure that it is not used every where else.
+ # make sure that it is not used anywhere else.
```
```python
match consumer := consumer_edge.dst:
    case dace_nodes.AccessNode():
        # TODO(phimuell): Are there more checks needed.
```
I suggest removing this todo comment before merge, unless there are known cases.

Suggested change:
```diff
- # TODO(phimuell): Are there more checks needed.
```
```python
# Check single use data if it was not known at the beginning.
if self._single_use_data is None:
    find_single_use_data = dace_analysis.FindSingleUseData()
    single_use_data = find_single_use_data.apply_pass(sdfg, None)
```
Would it be wrong to now store `single_use_data`? I am asking because it is used again inside `apply()`.
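The caching this comment asks about could be sketched as follows. The class, the fake SDFG, and the analysis helper are stand-ins invented for illustration; the real code would store the result of `FindSingleUseData().apply_pass(sdfg, None)` on the transformation instance.

```python
class BroadcastInliner:
    """Illustrative stand-in for the transformation; not the actual class."""

    def __init__(self):
        self._single_use_data = None

    def _run_analysis(self, sdfg):
        # Stand-in for `dace_analysis.FindSingleUseData().apply_pass(sdfg, None)`;
        # here `sdfg` is just a mapping from data name to use count.
        return {name for name, uses in sdfg.items() if uses == 1}

    def single_use_data(self, sdfg):
        # Run the analysis at most once and reuse the stored result, so that
        # `can_be_applied()` and `apply()` do not recompute it.
        if self._single_use_data is None:
            self._single_use_data = self._run_analysis(sdfg)
        return self._single_use_data


inliner = BroadcastInliner()
fake_sdfg = {"bcast_result": 1, "other": 3}
first = inliner.single_use_data(fake_sdfg)
second = inliner.single_use_data(fake_sdfg)  # reuses the cached result
```

Note that a cached result is only safe as long as the SDFG is not mutated between the two calls; otherwise the analysis would have to be invalidated.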
```python
# We need new transformations in order to deal with GTIR library nodes.
# For now, we simply expand these nodes before starting optimizing.
# TODO: Remove once transformations are ready.
```
Why not call `ScalarBrodcastInliner` before expanding?
```python
# probably yes, as we can remove the read and write of the initial data
# only the write to final destination is left. If the consumers are Maps
# the thing is a bit different. As we have to create the intermediate
# allocation. If the read of the memory is okay the `InlineBroadcastAccess`
```
`InlineBroadcastAccess` does not exist yet.