Add C++ benchmarks (part 2/n) #664
Closed
kingcrimsontianyu wants to merge 132 commits into rapidsai:branch-25.06 from kingcrimsontianyu:bench-2
+9,613
−11,160
bf6bd2c to 5dbb605
a82bbbf to 1dc94f5
Forward-merge branch-25.06 into branch-25.08
Forward-merge branch-25.06 into branch-25.08
resolve forward-merge from branch-25.06 to branch-25.08
Forward-merge branch-25.06 into branch-25.08
`S3Endpoint` takes optional parameters for the AWS region, access key ID, etc. If these aren't set, they're looked up from the environment.
Previously, the only way to specify these from Python was via environment variables. This adds named parameters to `kvikio.RemoteFile.open_s3` so that users can specify the credentials programmatically. The default behavior is unchanged: environment variables are used when a parameter is not specified.
Here's a test snippet against an S3 bucket:
```python
import sys

import boto3
import kvikio
import rmm

bucket, access_key_id, secret_access_key, session_token, default_region = sys.argv[1:]

# Upload a small test object with boto3.
client = boto3.client(
    "s3",
    aws_access_key_id=access_key_id,
    aws_secret_access_key=secret_access_key,
    aws_session_token=session_token,
    region_name=default_region,
)
key = "test/date-2025-09-16"
client.put_object(Bucket=bucket, Key=key, Body=b"Hello, world!")
client.head_object(Bucket=bucket, Key=key)

# Read the object back into device memory through KvikIO,
# passing the credentials explicitly instead of via the environment.
buf = rmm.DeviceBuffer(size=13)
f = kvikio.RemoteFile.open_s3(
    bucket,
    key,
    access_key_id=access_key_id,
    secret_access_key=secret_access_key,
    session_token=session_token,
    region_name=default_region,
)
f.read(buf)
print(buf.tobytes())
```
I've set those variables to `_`-prefixed versions. When run, that prints
```
❯ python debug.py kvikiobench-33622 $_AWS_ACCESS_KEY_ID $_AWS_SECRET_ACCESS_KEY $_AWS_SESSION_TOKEN $_AWS_DEFAULT_REGION
b'Hello, world!'
```
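The fallback described above (an explicit argument wins, otherwise the environment is consulted) can be sketched roughly as follows. This is only an illustration and not KvikIO's implementation: the helper `resolve_s3_setting` is hypothetical, and the environment variable names are the standard AWS ones, assumed from the `_`-prefixed variables in the command above.

```python
import os
from typing import Mapping, Optional


def resolve_s3_setting(
    explicit: Optional[str],
    env_name: str,
    env: Optional[Mapping[str, str]] = None,
) -> Optional[str]:
    """Hypothetical helper: prefer the explicit value, else look up the environment."""
    if explicit is not None:
        return explicit
    env = os.environ if env is None else env
    return env.get(env_name)


# A stand-in environment so the example does not depend on the real one.
fake_env = {"AWS_DEFAULT_REGION": "us-east-1"}

explicit_region = resolve_s3_setting("us-west-2", "AWS_DEFAULT_REGION", fake_env)
fallback_region = resolve_s3_setting(None, "AWS_DEFAULT_REGION", fake_env)
missing_token = resolve_s3_setting(None, "AWS_SESSION_TOKEN", fake_env)
```

Here `explicit_region` comes from the argument, `fallback_region` from the (stand-in) environment, and `missing_token` is `None` because neither source provides a value.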
Authors:
- Tom Augspurger (https://github.com/TomAugspurger)
Approvers:
- Tianyu Liu (https://github.com/kingcrimsontianyu)
URL: rapidsai#846
…th issue (rapidsai#844)
## Background
### Initial problem
There is currently an unsolved problem in libcurl, which is somehow mislabeled as merged/solved in curl/curl#13754. For AWS S3 objects that require credentials, if the object key name contains `=`, libcurl fails with an HTTP 403 response. This problem does not occur with public S3 objects. It can be reproduced using the `curl` program:
```bash
#!/usr/bin/env bash

# version: curl 8.15.0-DEV
curl_bin=<my_curl_program_loc>

# ..........S3 private..........
region=$(aws configure get region)
user_password=$(aws configure get aws_access_key_id):$(aws configure get aws_secret_access_key)

# curl can handle this. The object key name does not contain =
url="https://<private-bucket>.s3.<region>.amazonaws.com/witcher/2MiB.bin"

# curl cannot handle this. The object key name contains =
url="https://<private-bucket>.s3.<region>.amazonaws.com/witcher/key=value_2MiB.bin"

$curl_bin -s $url \
  --aws-sigv4 "aws:amz:$region:s3" \
  --user "$user_password" \
  -o /dev/null -w "%{http_code}\n" -v

# ..........S3 public..........
# curl can handle both
url="https://<public-bucket>.s3.<region>.amazonaws.com/witcher/2MiB.bin"
url="https://<public-bucket>.s3.<region>.amazonaws.com/witcher/key=value_2MiB.bin"

$curl_bin -s $url \
  -o /dev/null -w "%{http_code}\n" -v
```
### Additional problem
It has been found that, beyond `=` alone, other special characters such as `!*'()` in a private S3 object name also cause a libcurl error. In addition, some characters such as `+` in a public S3 object name cause the same error.
## This PR
This PR addresses this problem by handling the special characters listed in the [AWS object key naming guidelines](https://docs.aws.amazon.com/AmazonS3/latest/userguide/object-keys.html#object-key-guidelines), for both private and public S3 object names. The KvikIO-specific object key naming guidelines are added to the remote file documentation.
Specifically, this PR introduces the utility classes `UrlBuilder` (to complement the existing `UrlParser`), which builds a URL from user-provided components, and `UrlEncoder`, which uses a compile-time percent-encoding lookup table to encode selected characters.
Closes rapidsai#823
Authors:
- Tianyu Liu (https://github.com/kingcrimsontianyu)
Approvers:
- Tom Augspurger (https://github.com/TomAugspurger)
- Bradley Dice (https://github.com/bdice)
URL: rapidsai#844
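To illustrate the lookup-table idea behind `UrlEncoder`, here is a minimal Python sketch. It is not the C++ implementation from the PR: the table is built once at import time rather than at compile time, the function name `percent_encode` is hypothetical, and the set of encoded characters is an assumption chosen from the characters mentioned above (plus `%` itself, so the output stays unambiguous).

```python
# Characters to percent-encode. This set is an assumption for illustration;
# it is not KvikIO's actual list.
_CHARS_TO_ENCODE = frozenset("%=!*'()+")


def _build_table() -> tuple:
    """Precompute the encoded form of every possible byte value."""
    table = []
    for byte in range(256):
        if byte < 0x80 and chr(byte) not in _CHARS_TO_ENCODE:
            table.append(chr(byte))  # ASCII byte outside the set: keep as-is
        else:
            table.append(f"%{byte:02X}")  # selected or non-ASCII byte: %XX
    return tuple(table)


_ENCODE_TABLE = _build_table()


def percent_encode(text: str) -> str:
    """Encode `text` with one table lookup per UTF-8 byte."""
    return "".join(_ENCODE_TABLE[b] for b in text.encode("utf-8"))


encoded_key = percent_encode("witcher/key=value_2MiB.bin")
```

With this character set, `encoded_key` is `witcher/key%3Dvalue_2MiB.bin`, while a key without special characters such as `witcher/2MiB.bin` passes through unchanged.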
…i#848)
This small PR adds the `aws_` prefix to the parameter `session_token` to make the parameter names more consistent across the S3 utility functions.
Authors:
- Tianyu Liu (https://github.com/kingcrimsontianyu)
Approvers:
- Mads R. B. Kristensen (https://github.com/madsbk)
- Tom Augspurger (https://github.com/TomAugspurger)
URL: rapidsai#848
Forward-merge branch-25.10 into branch-25.12
Authors:
- Paul Taylor (https://github.com/trxcllnt)
Approvers:
- Vyas Ramasubramani (https://github.com/vyasr)
URL: rapidsai#852
…sai#853)
This small PR fixes an out-of-bounds memory access that happens when the file open flags consist of a single character (e.g. `"r"` or `"w"` without the `"+"` suffix).
Authors:
- Tianyu Liu (https://github.com/kingcrimsontianyu)
Approvers:
- Vukasin Milovanovic (https://github.com/vuule)
URL: rapidsai#853
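The class of bug described here comes from reading the second flag character without first checking that it exists. The sketch below shows the guarded pattern in Python; the function `parse_open_flags` and its mapping to `os.O_*` constants are hypothetical and are not KvikIO's actual parsing code.

```python
import os


def parse_open_flags(flags: str) -> int:
    """Hypothetical parser for flags such as "r", "w", "r+" and "w+"."""
    if not flags:
        raise ValueError("open flags must not be empty")
    # Check the length before looking at the second character, so that
    # single-character flags never index past the end of the string.
    read_write = len(flags) > 1 and flags[1] == "+"
    mode = flags[0]
    if mode == "r":
        return os.O_RDWR if read_write else os.O_RDONLY
    if mode == "w":
        access = os.O_RDWR if read_write else os.O_WRONLY
        return access | os.O_CREAT | os.O_TRUNC
    raise ValueError(f"unsupported open flags: {flags!r}")


read_only = parse_open_flags("r")
read_write = parse_open_flags("r+")
```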
Supports rollout of new branching strategy. https://docs.rapids.ai/notices/rsn0047/
xref: rapidsai/build-planning#224
Authors:
- Bradley Dice (https://github.com/bdice)
Approvers:
- Jake Awe (https://github.com/AyodeAwe)
URL: rapidsai#854
Supports rollout of new branching strategy. https://docs.rapids.ai/notices/rsn0047/
xref: rapidsai/build-planning#224
Authors:
- Bradley Dice (https://github.com/bdice)
Approvers:
- Jake Awe (https://github.com/AyodeAwe)
URL: rapidsai#855
Authors:
- Kyle Edwards (https://github.com/KyleFromNVIDIA)
Approvers:
- Mads R. B. Kristensen (https://github.com/madsbk)
- Bradley Dice (https://github.com/bdice)
URL: rapidsai#856
…l namespace (rapidsai#851)
This PR is part of the effort to minimize transitive includes in the KvikIO shared library. It moves the NVTX-related code from the public headers to the `detail` namespace. As a result, the files `parallel_operation.hpp` and `posix_io.hpp` have also been moved to the `detail` namespace.
Authors:
- Tianyu Liu (https://github.com/kingcrimsontianyu)
Approvers:
- Mads R. B. Kristensen (https://github.com/madsbk)
- Lawrence Mitchell (https://github.com/wence-)
URL: rapidsai#851
Contributes to rapidsai/build-planning#224
## Notes for Reviewers
This is safe to admin-merge because the change is a no-op... configs on those 2 branches are identical.
Authors:
- Bradley Dice (https://github.com/bdice)
Approvers:
- Nate Rock (https://github.com/rockhowse)
URL: rapidsai#857
Issue: rapidsai/build-infra#297
Authors:
- Kyle Edwards (https://github.com/KyleFromNVIDIA)
Approvers:
- Jake Awe (https://github.com/AyodeAwe)
- Tianyu Liu (https://github.com/kingcrimsontianyu)
- Robert Maynard (https://github.com/robertmaynard)
URL: rapidsai#858
)
With this PR, KvikIO supports username-based authentication for WebHDFS via the environment variable `KVIKIO_WEBHDFS_USERNAME`.
Note: `libcudf` uses KvikIO's utility function `open(url)` to infer the endpoint type. Currently, the access credentials for that function can only be specified via environment variables, not programmatically as function parameters. We will address this limitation in the future.
This PR is breaking in that:
- It moves the S3 endpoint's utility function `unwrap_or_default` to the `detail` namespace, since this utility function is supposed to be an implementation detail.
- It adds a `username` parameter to one of the two WebHDFS endpoint constructors for completeness (the other constructor already has `username` as a parameter).
Authors:
- Tianyu Liu (https://github.com/kingcrimsontianyu)
Approvers:
- Mads R. B. Kristensen (https://github.com/madsbk)
URL: rapidsai#859
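As a rough sketch of what username-based WebHDFS access looks like at the URL level: the WebHDFS REST API accepts the user as a `user.name` query parameter. The helper `build_webhdfs_open_url` below is hypothetical, and the precedence it uses (explicit argument first, then `KVIKIO_WEBHDFS_USERNAME`) is an assumption rather than something this commit message states.

```python
import os
from typing import Optional
from urllib.parse import quote, urlencode


def build_webhdfs_open_url(
    host: str, port: int, path: str, username: Optional[str] = None
) -> str:
    """Hypothetical helper: build a WebHDFS OPEN URL with an optional user name."""
    if username is None:
        # Assumed fallback: consult the environment variable named in this PR.
        username = os.environ.get("KVIKIO_WEBHDFS_USERNAME")
    params = {"op": "OPEN"}
    if username:
        params["user.name"] = username
    encoded_path = quote(path.lstrip("/"))
    return f"http://{host}:{port}/webhdfs/v1/{encoded_path}?{urlencode(params)}"


# Host, port, path and user name are made-up example values.
url = build_webhdfs_open_url("localhost", 9870, "/data/file.bin", username="alice")
```

For these example values, `url` is `http://localhost:9870/webhdfs/v1/data/file.bin?op=OPEN&user.name=alice`.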
…e configs (rapidsai#862)
This uses `RAPIDS_BRANCH` in style checks where we reference rapids-cmake configs for `cmake-format`.
xref: rapidsai/build-planning#224
Authors:
- Bradley Dice (https://github.com/bdice)
Approvers:
- Gil Forsyth (https://github.com/gforsyth)
URL: rapidsai#862
Ruff does not yet support Cython, so restore isort only for Cython.
Issue: rapidsai/build-planning#130
Authors:
- Kyle Edwards (https://github.com/KyleFromNVIDIA)
Approvers:
- https://github.com/jakirkham
URL: rapidsai#864
…rapidsai#867)
This fixes a conda environment creation command to support both `x86_64` and `aarch64` systems.
Authors:
- Bradley Dice (https://github.com/bdice)
Approvers:
- Matthew Murray (https://github.com/Matt711)
URL: rapidsai#867
This PR disables building benchmarks by default, consistent with other RAPIDS projects such as cuDF and RAFT. It also updates the CI build script to ensure that benchmark builds are still tested in CI. This change helps address the issue in cuDF where KvikIO benchmarks are built unnecessarily.
Authors:
- Yunsong Wang (https://github.com/PointKernel)
- Kyle Edwards (https://github.com/KyleFromNVIDIA)
Approvers:
- Kyle Edwards (https://github.com/KyleFromNVIDIA)
- Vyas Ramasubramani (https://github.com/vyasr)
- Bradley Dice (https://github.com/bdice)
URL: rapidsai#866
…ai#868)
This PR supports handling the new main branch strategy outlined below:
* [RSN 47 - Changes to RAPIDS branching strategy in 25.12](https://docs.rapids.ai/notices/rsn0047/)
The `update-version.sh` script now supports two modes, controlled via CLI parameters or environment variables:
- CLI argument: `--run-context=main|release`
- Environment variable: `RAPIDS_RUN_CONTEXT=main|release`
xref: rapidsai/build-planning#224
Authors:
- Nate Rock (https://github.com/rockhowse)
Approvers:
- Bradley Dice (https://github.com/bdice)
URL: rapidsai#868
Cython 3.2.0 was recently released and we have seen a few subtle issues building with it. While we work out these issues, this pins to Cython 3.1, which is known to work for us. Similarly, PyTest 9 was recently released, but we have run into some issues with it as well. So pin to PyTest 8 while we work through the PyTest 9 issues.
Authors:
- https://github.com/jakirkham
Approvers:
- Bradley Dice (https://github.com/bdice)
URL: rapidsai#869
This PR significantly improves POSIX I/O write performance, as well as cold-page-cache read performance, by opportunistically using Direct I/O. The speedup for sequential writes is approximately 3-4x.
The opportunistic POSIX Direct I/O feature can be controlled in two ways:
- Environment variables:
  - `"KVIKIO_AUTO_DIRECT_IO_READ"`: defaults to `false`.
  - `"KVIKIO_AUTO_DIRECT_IO_WRITE"`: defaults to `true`.
- C++/Python API:
  - `defaults::set_auto_direct_io_read(bool flag)`/`kvikio.defaults.set("posix_auto_direct_io_read", flag)`
  - `defaults::set_auto_direct_io_write(bool flag)`/`kvikio.defaults.set("posix_auto_direct_io_write", flag)`
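Direct I/O generally requires block-aligned file offsets and sizes, so one way to use it "opportunistically" is to send only the aligned middle of a request through Direct I/O and the unaligned edges through buffered I/O. The sketch below shows that partitioning only. It is an assumption about how such a scheme might split a request, not a description of KvikIO's actual logic; `split_for_direct_io` and the 4096-byte alignment are hypothetical.

```python
from typing import List, Tuple

# (file offset, length in bytes, whether Direct I/O could be used)
Segment = Tuple[int, int, bool]


def split_for_direct_io(offset: int, size: int, alignment: int = 4096) -> List[Segment]:
    """Split [offset, offset + size) into buffered edges and an aligned middle."""
    end = offset + size
    aligned_start = -(-offset // alignment) * alignment  # round offset up
    aligned_end = (end // alignment) * alignment  # round end down
    if aligned_start >= aligned_end:
        # No fully aligned block inside the request: use buffered I/O only.
        return [(offset, size, False)] if size > 0 else []
    segments: List[Segment] = []
    if offset < aligned_start:
        segments.append((offset, aligned_start - offset, False))
    segments.append((aligned_start, aligned_end - aligned_start, True))
    if aligned_end < end:
        segments.append((aligned_end, end - aligned_end, False))
    return segments


segments = split_for_direct_io(offset=100, size=10000)
```

For this request, `segments` is `[(100, 3996, False), (4096, 4096, True), (8192, 1908, False)]`: only the middle 4096 bytes are eligible for Direct I/O.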
In addition, this PR refactors the bounce buffer class. To improve clarity, relevant classes and variables have been renamed and a lot of comments added. The bounce buffer class is now templated by allocator to accommodate different use cases:
- `PageAlignedBounceBufferPool`: used for Direct I/O to/from unaligned host buffer. Does not require CUDA context.
- `CudaPinnedBounceBufferPool`: used for buffered I/O to/from device buffer. Requires CUDA context. This is the original implementation on the main branch.
- `CudaPageAlignedPinnedBounceBufferPool`: used for Direct I/O to/from device buffer. Requires CUDA context.
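The pooling idea behind these classes can be sketched in plain Python: buffers are allocated once, handed out on request, and returned for reuse instead of being freed. The class below is a hypothetical, host-only analogue of a page-aligned pool. It uses anonymous `mmap` regions (which are page-aligned), is not thread-safe, and does not reflect the allocator-templated C++ design or any CUDA pinned memory.

```python
import mmap
from typing import List


class PageAlignedBufferPool:
    """Hypothetical pool of reusable, page-aligned host buffers."""

    def __init__(self, buffer_size: int) -> None:
        page = mmap.PAGESIZE
        # Round the requested size up to a whole number of pages.
        self._buffer_size = -(-buffer_size // page) * page
        self._free: List[mmap.mmap] = []

    @property
    def buffer_size(self) -> int:
        return self._buffer_size

    def acquire(self) -> mmap.mmap:
        """Reuse a released buffer if one is available, else allocate a new one."""
        if self._free:
            return self._free.pop()
        return mmap.mmap(-1, self._buffer_size)  # anonymous, page-aligned mapping

    def release(self, buf: mmap.mmap) -> None:
        """Return a buffer to the pool for later reuse."""
        self._free.append(buf)


pool = PageAlignedBufferPool(buffer_size=1000)
first = pool.acquire()
pool.release(first)
second = pool.acquire()  # the same buffer object is handed out again
```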
## Performance results
See rapidsai#863 (comment)
## Goal
- Addresses most of rapidsai#761
- Addresses the reported write performance issue in cudf
## Non-goal
- This PR does not add opportunistic Direct I/O as file handle's function parameters. This will be revisited in a future PR.
- This PR does not address one of the objectives in rapidsai#520, which is to unify the implementation of the bounce buffer in POSIX IO and in Remote IO. This will be revisited in a future PR.
Authors:
- Tianyu Liu (https://github.com/kingcrimsontianyu)
Approvers:
- Mads R. B. Kristensen (https://github.com/madsbk)
- Vukasin Milovanovic (https://github.com/vuule)
URL: rapidsai#863
Update to 26.02
Forward-merge release/25.12 into main
…#865)
RAPIDS has deployed an autoscaling cloud build cluster that can be used to accelerate building large RAPIDS projects. This PR updates the conda and wheel builds to use the build cluster.
This contributes to rapidsai/build-planning#228.
Authors:
- Paul Taylor (https://github.com/trxcllnt)
Approvers:
- Nate Rock (https://github.com/rockhowse)
- Bradley Dice (https://github.com/bdice)
URL: rapidsai#865
Forward-merge release/25.12 into main
Superseded by #878
Labels
c++
Affects the C++ API of KvikIO
feature request
New feature or request
non-breaking
Introduces a non-breaking change

This PR:
- … `ThreadPoolSimple` that executes tasks with less synchronization overhead. Adds its benchmark to compare against the BS pool.
- … `Time` and `CPU` as part of the benchmark result.
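For readers unfamiliar with the general shape of a small thread pool, here is a minimal Python sketch: worker threads pull tasks from a single queue and publish results through futures. It is not the PR's C++ `ThreadPoolSimple`, whose code is not shown here, and it makes no claim about that class's synchronization strategy; the name `SimpleThreadPool` and its interface are hypothetical.

```python
import threading
from concurrent.futures import Future
from queue import SimpleQueue


class SimpleThreadPool:
    """Hypothetical minimal pool: one shared queue, fixed number of workers."""

    _STOP = object()  # sentinel telling a worker to exit

    def __init__(self, num_threads: int) -> None:
        self._tasks: SimpleQueue = SimpleQueue()
        self._threads = [
            threading.Thread(target=self._worker, daemon=True)
            for _ in range(num_threads)
        ]
        for thread in self._threads:
            thread.start()

    def _worker(self) -> None:
        while True:
            item = self._tasks.get()  # blocks until a task or sentinel arrives
            if item is self._STOP:
                return
            future, fn, args = item
            try:
                future.set_result(fn(*args))
            except Exception as exc:
                future.set_exception(exc)

    def submit(self, fn, *args) -> Future:
        """Queue `fn(*args)` and return a future for its result."""
        future: Future = Future()
        self._tasks.put((future, fn, args))
        return future

    def shutdown(self) -> None:
        """Stop all workers after the already queued tasks have run."""
        for _ in self._threads:
            self._tasks.put(self._STOP)
        for thread in self._threads:
            thread.join()


pool = SimpleThreadPool(num_threads=4)
futures = [pool.submit(pow, i, 2) for i in range(10)]
squares = [f.result() for f in futures]
pool.shutdown()
```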