@TomAugspurger
Contributor

No description provided.

@copy-pr-bot

copy-pr-bot bot commented Nov 13, 2025

Auto-sync is disabled for draft pull requests in this repository. Workflows must be run manually.

Contributors can view more details about this message here.

@TomAugspurger
Contributor Author

/ok to test ab18e77

@TomAugspurger
Contributor Author

/ok to test 46b1d7b

@pentschev
Member

Is there a benefit in having this always turned on? It might be useful to look at #612 since this can be very verbose and thus difficult to handle in practice for regular builds.

@TomAugspurger
Contributor Author

/ok to test 105c60a

@TomAugspurger
Contributor Author

/ok to test 17bc5e6

@TomAugspurger
Contributor Author

/ok to test 1fa6250

@nirandaperera added the improvement (Improves an existing functionality) and non-breaking (Introduces a non-breaking change) labels on Nov 17, 2025

@nirandaperera
Contributor

/ok to test

@nirandaperera
Contributor

I tested out the HostBuffer with @TomAugspurger's reproducer.

[Image: side-by-side comparison]

Left is with HostBuffer and right is with std::vector.
Buffer::allocate time is now insignificant. However, cudaMemcpyAsync now takes up almost all of the time that was saved, so the total time for each insert is more or less the same. 😕
Does cudaMemcpyAsync initialize memory before copying?
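
One possible explanation, not confirmed anywhere in this thread, is first-touch cost: if the new buffer hands back memory that has never been written, the first write into it pays for faulting the pages in, whereas a value-initialized std::vector pays that cost up front during allocation. The sketch below is a host-only illustration of that effect and makes several assumptions: std::memcpy stands in for cudaMemcpyAsync, a plain `new unsigned char[n]` stands in for HostBuffer, and the names `Timings`, `time_ms`, `copy_into_vector`, and `copy_into_uninitialized` are hypothetical. It says nothing about what cudaMemcpyAsync itself does internally.

```cpp
#include <cassert>
#include <chrono>
#include <cstddef>
#include <cstring>
#include <memory>
#include <vector>

// Hypothetical result record: where the time went for one destination type.
struct Timings {
    double allocate_ms;    // time spent obtaining the destination buffer
    double first_copy_ms;  // time spent on the first copy into it
    bool copy_ok;          // destination matches source after the copy
};

// Hypothetical helper: run a callable and return elapsed wall time in ms.
template <typename F>
double time_ms(F&& f) {
    const auto start = std::chrono::steady_clock::now();
    f();
    const auto stop = std::chrono::steady_clock::now();
    return std::chrono::duration<double, std::milli>(stop - start).count();
}

// Destination is a std::vector, which value-initializes (zero-fills) its
// elements, so every byte is written once during allocation.
Timings copy_into_vector(const std::vector<unsigned char>& src) {
    assert(!src.empty());
    Timings t{};
    std::vector<unsigned char> dst;
    t.allocate_ms = time_ms([&] { dst = std::vector<unsigned char>(src.size()); });
    t.first_copy_ms = time_ms([&] { std::memcpy(dst.data(), src.data(), src.size()); });
    t.copy_ok = std::memcmp(dst.data(), src.data(), src.size()) == 0;
    return t;
}

// Destination is default-initialized storage (assumed stand-in for a buffer
// that skips initialization), so the first write happens inside the copy.
Timings copy_into_uninitialized(const std::vector<unsigned char>& src) {
    assert(!src.empty());
    Timings t{};
    std::unique_ptr<unsigned char[]> dst;
    t.allocate_ms = time_ms([&] { dst.reset(new unsigned char[src.size()]); });
    t.first_copy_ms = time_ms([&] { std::memcpy(dst.get(), src.data(), src.size()); });
    t.copy_ok = std::memcmp(dst.get(), src.data(), src.size()) == 0;
    return t;
}
```

Calling both functions with a large source (hundreds of MiB) and comparing allocate_ms + first_copy_ms for each would show, on a given machine, whether the cost merely moves from allocation to the first copy. Results depend on the allocator and operating system, and if the real buffer is pinned host memory the picture may be different again.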

@TomAugspurger
Contributor Author

/ok to test 8fd008e

