feat(binder): specify CPU and memory requests and limits for GPU reservation pod #626
…s for GPU pod reservation
As long as this is hard-coded, it fits your specific system but not another one that has a different admission controller. This has to be configurable, and the default should preferably stay as minimal as it is today.
…and limits for GPU reservation pods
…e configuration in GPU reservation pods
… resources in GPU reservation pods
…pu-pod-reservation
Looks good. I think that you will need to run
…pu-pod-reservation # Conflicts: # deployments/kai-scheduler/values.yaml
Merging this branch changes the coverage (2 decrease, 1 increase)
Thanks @lokielse
Description
An admission webhook in our enterprise Kubernetes cluster requires resource limits to be specified on every pod; a pod without them is rejected at creation.
This PR adds support for configuring CPU and memory resource requests and limits for the GPU reservation pods created by the binder. Previously, GPU reservation pods carried only a GPU resource specification, with no explicit CPU or memory requests or limits, and relied on Kubernetes defaults.
What Changed
- `PodResources` field to the `ResourceReservation` configuration in the binder API
- `--resource-reservation-pod-resources` command-line flag accepting JSON-serialized resource requirements
- `resourceReservationPodResources` configuration
Key Features
Implementation Details
The implementation flows through multiple layers:
Related Issues
Fixes #
Checklist
Breaking Changes
None. This is a backward-compatible change. When the new configuration is not specified, the system behaves exactly as before.
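The backward-compatible behavior can be illustrated with a minimal Go sketch of how a JSON-serialized flag value might be decoded. This is not the code from this PR: `PodResources`, `ResourceList`, and `parsePodResources` are hypothetical simplified stand-ins (the real binder presumably uses the Kubernetes `corev1.ResourceRequirements` type), and the quantities in the sample value are illustrative only. An empty value yields `nil`, which stands for "not configured", so the reservation pod would be built exactly as before.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// ResourceList maps a resource name ("cpu", "memory") to a quantity string.
// Hypothetical stand-in for the Kubernetes corev1.ResourceList type.
type ResourceList map[string]string

// PodResources mirrors the requests/limits shape of Kubernetes resource
// requirements. Hypothetical simplified type, not the type added by the PR.
type PodResources struct {
	Requests ResourceList `json:"requests,omitempty"`
	Limits   ResourceList `json:"limits,omitempty"`
}

// parsePodResources decodes a JSON-serialized flag value.
// An empty string returns (nil, nil), meaning "not configured":
// the caller would then leave the reservation pod's resources untouched.
func parsePodResources(raw string) (*PodResources, error) {
	if raw == "" {
		return nil, nil
	}
	var res PodResources
	if err := json.Unmarshal([]byte(raw), &res); err != nil {
		return nil, fmt.Errorf("invalid pod resources JSON: %w", err)
	}
	return &res, nil
}

func main() {
	// Illustrative value, as it might be passed to
	// --resource-reservation-pod-resources (quantities are assumptions).
	flagValue := `{"requests":{"cpu":"1m","memory":"10Mi"},"limits":{"cpu":"50m","memory":"100Mi"}}`

	res, err := parsePodResources(flagValue)
	if err != nil {
		panic(err)
	}
	fmt.Println("cpu limit:", res.Limits["cpu"], "memory limit:", res.Limits["memory"])
}
```

Returning `nil` for an empty value, rather than an empty struct, keeps the "unset" case distinguishable from "set to nothing", which is one simple way to preserve the old default.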
Additional Notes
Example Configuration
Users can configure resource limits in their Helm values:
Testing Coverage