-
Notifications
You must be signed in to change notification settings - Fork 2
Treewalk cuda #2
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: master
Are you sure you want to change the base?
Conversation
astro-YYH
commented
Oct 15, 2024
- porting treewalk to cuda without any other optimizations (like query sorting);
- it compiles but not runs at present, because not all of the functions called from within the global function have been ported to cuda (mainly those the tw structure points to) yet (only device functions can be called in a global function);
e2e2fd1 to
7ad04f5
Compare
4b7429a to
6602817
Compare
| /* Tree freed in PM*/ | ||
| ForceTree Tree = {0}; | ||
| // ForceTree Tree = {0}; | ||
| ForceTree * Tree_ptr; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Let's make this stack-allocated again, but pass it by value (and probably const).
| priv.Ti_Current = Ti_Current; | ||
| priv.Accel = AccelStore; | ||
| TreeWalk *tw; | ||
| cudaMallocManaged(&tw, sizeof(TreeWalk)); // Allocate TreeWalk structure with Unified Memory |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This one also probably stack-allocated, passed by value.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
But let me take care of this after it is merged.
libgadget/gravshort-tree.c
Outdated
|
|
||
| tw->ev_label = "GRAVTREE"; | ||
| cudaMallocManaged(&tw->ev_label, sizeof(char) * strlen("GRAVTREE") + 1); // Allocate memory for ev_label | ||
| strcpy(tw->ev_label, "GRAVTREE"); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Oh no! We can't have static strings on CUDA memory. Can we make it a std::string instead?
libgadget/treewalk.h
Outdated
|
|
||
| /* A pointer to the force tree structure to walk.*/ | ||
| const ForceTree * tree; | ||
| ForceTree * tree; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Can you check if these need to be non-const?
libgadget/treewalk.h
Outdated
|
|
||
| /* name of the evaluator (used in printing messages) */ | ||
| const char * ev_label; | ||
| char * ev_label; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Actually this should be "const char" because we cannot convert string literals to char * in C++. Really it should be a const std::string().
| // ForceTree Tree = {0}; | ||
| ForceTree * Tree_ptr; | ||
| cudaMallocManaged(&Tree_ptr, sizeof(ForceTree)); | ||
| cudaDeviceSynchronize(); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Check whether this DeviceSynchronize is needed!