@astro-YYH
Collaborator

  1. This ports the treewalk to CUDA without any other optimizations (such as query sorting).
  2. It compiles but does not run yet, because not all of the functions called from within the global function (mainly those the tw structure points to) have been ported to CUDA, and only device functions can be called from a global function.

@sbird changed the base branch from treewalk_cuda to master on October 15, 2024, 19:56
/* Tree freed in PM*/
-ForceTree Tree = {0};
+// ForceTree Tree = {0};
+ForceTree * Tree_ptr;
Member

Let's make this stack-allocated again, but pass it by value (and probably const).
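
A rough sketch of the stack-allocated, pass-by-value idea, written as plain host C++ so it compiles without a GPU. ForceTreeSketch and total_mass are hypothetical stand-ins, not the real ForceTree or any function in this PR, and the members are invented for illustration. The static_assert reflects the usual expectation that a struct passed by value as a kernel argument is trivially copyable; any pointers inside it would still have to refer to device-accessible memory.

```cpp
#include <cassert>
#include <type_traits>

// Hypothetical, heavily reduced stand-in for ForceTree.
struct ForceTreeSketch {
    int firstnode;           // index of the first node (illustrative)
    int numnodes;            // number of entries in node_mass
    const double *node_mass; // on a GPU this must be device-accessible memory
};

// A struct that is copied bit-for-bit into a callee should be trivially copyable.
static_assert(std::is_trivially_copyable<ForceTreeSketch>::value,
              "ForceTreeSketch must be trivially copyable to be passed by value");

// Stand-in for a kernel: receives its own const copy of the small struct,
// so the caller can keep the original on the stack.
inline double total_mass(const ForceTreeSketch tree) {
    assert(tree.numnodes >= 0);
    double sum = 0.0;
    for (int i = 0; i < tree.numnodes; ++i) {
        sum += tree.node_mass[i];
    }
    return sum;
}
```

With this shape the caller would declare the struct as an ordinary local variable and hand it over by value, so the struct itself needs no managed allocation; only the arrays it points to do.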

priv.Ti_Current = Ti_Current;
priv.Accel = AccelStore;
TreeWalk *tw;
cudaMallocManaged(&tw, sizeof(TreeWalk)); // Allocate TreeWalk structure with Unified Memory
Member

This one should probably also be stack-allocated and passed by value.

Member

But let me take care of this after it is merged.


-tw->ev_label = "GRAVTREE";
+cudaMallocManaged(&tw->ev_label, sizeof(char) * strlen("GRAVTREE") + 1); // Allocate memory for ev_label
+strcpy(tw->ev_label, "GRAVTREE");
Member

Oh no! We can't have static strings on CUDA memory. Can we make it a std::string instead?
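
If the label ever has to travel with the struct when it is copied, one option besides a managed allocation or a std::string is a small fixed-size character buffer stored inside the struct. This is only a sketch of that alternative in plain host C++, not something the PR does; LabelledWalkSketch, kLabelMax and set_label are hypothetical names, and the length limit of 16 is an arbitrary assumption.

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>

// Hypothetical maximum label length, including the terminating '\0'.
constexpr std::size_t kLabelMax = 16;

// Hypothetical stand-in for the part of TreeWalk that carries the label.
struct LabelledWalkSketch {
    char ev_label[kLabelMax]; // stored inline, so copying the struct copies the label
};

// Copy a label into the inline buffer, truncating it if it is too long.
inline void set_label(LabelledWalkSketch &tw, const char *label) {
    assert(label != nullptr);
    std::strncpy(tw.ev_label, label, kLabelMax - 1);
    tw.ev_label[kLabelMax - 1] = '\0'; // strncpy does not terminate on truncation
}
```

The trade-off is a hard limit on label length in exchange for having no separate allocation to manage or free.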


/* A pointer to the force tree structure to walk.*/
-const ForceTree * tree;
+ForceTree * tree;
Member

Can you check if these need to be non-const?


/* name of the evaluator (used in printing messages) */
-const char * ev_label;
+char * ev_label;
Member

Actually this should be "const char *", because C++ does not allow converting a string literal to char *. Really it should be a const std::string.
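
To make the const point concrete, here is a minimal host-side sketch of both variants. The struct and function names are hypothetical and only the ev_label member mirrors the PR; this shows the language rule and is not a proposal for how TreeWalk should finally look.

```cpp
#include <cassert>
#include <cstring>
#include <string>

// Variant 1: a pointer to const char can be initialised from a string literal.
// With a plain "char *" member the same initialisation is ill-formed in C++11 and later.
struct TreeWalkLabelSketch {
    const char *ev_label;
};

inline TreeWalkLabelSketch make_gravtree_walk() {
    TreeWalkLabelSketch tw = {"GRAVTREE"}; // no allocation, no strcpy
    return tw;
}

// Variant 2: a const std::string owns a copy of the text (host code only).
struct TreeWalkStringSketch {
    const std::string ev_label;
};

inline TreeWalkStringSketch make_gravtree_walk_string() {
    return TreeWalkStringSketch{"GRAVTREE"};
}
```

Either variant removes the need for the cudaMallocManaged plus strcpy pair on the host side, as long as the label is only read by host code such as message printing.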

// ForceTree Tree = {0};
ForceTree * Tree_ptr;
cudaMallocManaged(&Tree_ptr, sizeof(ForceTree));
cudaDeviceSynchronize();
Member

Check whether this DeviceSynchronize is needed!
