AEP-8515: Add support for setting a custom request-to-limit ratio at the VPA object level #8516

base: master
Conversation
Adding the `do-not-merge/release-note-label-needed` label because no release-note block was detected; please follow our release note process to remove it. Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes-sigs/prow repository.
Hi @iamzili. Thanks for your PR. I'm waiting for a kubernetes member to verify that this patch is reasonable to test. If it is, they should reply with `/ok-to-test`. Once the patch is verified, the new status will be reflected by the `ok-to-test` label. I understand the commands that are listed here. Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes-sigs/prow repository.
adrianmoisey left a comment
Thanks for the AEP!
I've left 2 small comments, but will definitely need to come back for a more thorough review.
> * **Factor**: Multiplies the recommended request by a specified value, and the result is set as the new limit.
>   * Example: if `factor` is set to `2`, the limit will be set to twice the recommended request.
> * **Quantity**: Adds a buffer on top of the resource request. This can be expressed either:
>   * As a **percentage** (`QuantityPercentage`), or
Would QuantityPercentage not be the same as factor?
i.e. a QuantityPercentage of 10% is the same as a factor of 1.1? (assuming that factor isn't stored as an int)
Good point - I was thinking the same: simplify things by dropping QuantityPercentage and renaming QuantityValue to Quantity.
When I first thought about this feature, I thought it might be useful to give users more options to define the ratio's magnitude. But this will just make VPA more complex.
I'm also considering changing the way the sub-fields are defined, based on AEP-7862, to make the VPA CRD more consistent. For example:
from:

    RequestToLimitRatio:
      cpu:
        Type: Factor
        Value: 2
      memory:
        Type: Quantity
        Value: 100Mi

to:

    RequestToLimitRatio:
      cpu:
        factor: 2
      memory:
        quantity: 100Mi
btw after reviewing AEP-7862, I think consistency is broken there when startupBoost is defined under containerPolicies:
    startupBoost:
      cpu:
        type: "Quantity"
        quantity: "4"

vs

    startupBoost:
      cpu:
        factor: 1
We should define a common specification to be used for both AEPs.
Yup, I think this change makes sense. And good call out on the inconsistency in the startup boost feature.
Wait, hold on, where is the consistency broken? `type` defaults to `Factor`.
So the second example you pasted implies the following:

    startupBoost:
      cpu:
        type: "Factor"
        factor: 1
This is consistent with the other methods, unless I'm missing something.
You're right! I missed the defaulting part from AEP 7863.
But the consistency between the AEP I created and AEP 7863 is still broken because:

from startupBoost:

    Type: "Factor"
    Factor: 1

from RequestToLimitRatio:

    Type: "Factor"
    Value: 1  # do I need to rename the key to "Factor"?

If you ask me, I think the startupBoost spec is more aligned with the VPA object. Since the startupBoost AEP is already merged into the main branch, I assume we need to follow the same approach in the RequestToLimitRatio AEP.
Correct, there should be consistency between this AEP and the startupBoost AEP
consistency fixed
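For reference, a sketch of what the aligned spec could end up looking like after that fix (the exact field names and their placement under `containerPolicies` are assumptions based on the discussion above, not the final API):

```yaml
resourcePolicy:
  containerPolicies:
    - containerName: "*"
      controlledValues: RequestsAndLimits
      requestToLimitRatio:
        cpu:
          type: Factor      # limit = factor x recommended request
          factor: 2
        memory:
          type: Quantity    # limit = recommended request + quantity
          quantity: 100Mi
```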
> #### When Enabled
>
> * The admission controller will **accept** new VPA objects that include a configured `RequestToLimitRatio`.
> * For containers targeted by a VPA object using `RequestToLimitRatio`, the admission controller and/or the updater will enforce the configured ratio.
What happens in a scenario like this:

- A user has a Pod with in-place updates enabled, where requests are equal to limits (so the QoS class is Guaranteed)
- The user modifies the VPA to add a RequestToLimitRatio, making limits double the requests

What should the behaviour be? Should the VPA kill the Pod to allow a new one to be created with a different QoS class?
I totally forgot about QoS classes, thanks for bringing this up!
Since the qosClass field is immutable, we cannot rely on in-place updates; the Pod(s) need to be recreated if the QoS class changes.
To clarify, changing the ratio using this new feature will not evict the Pod(s) immediately. It will rely on the Updater's current behavior, e.g. evicting when the current resources.requests differ significantly from the new recommendation.
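To make the QoS concern concrete, a sketch with illustrative values (not taken from the AEP):

```yaml
# Before: requests equal limits, so the Pod's QoS class is Guaranteed.
resources:
  requests:
    cpu: 500m
    memory: 256Mi
  limits:
    cpu: 500m
    memory: 256Mi

# After applying a factor of 2 to both resources, limits diverge from requests,
# which corresponds to the Burstable QoS class. Because a Pod's QoS class is
# immutable, this cannot be applied in place; the Pod has to be recreated.
resources:
  requests:
    cpu: 500m
    memory: 256Mi
  limits:
    cpu: "1"
    memory: 512Mi
```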
I guess my concern is actually more of "how will the VPA handle the RequestToLimitRatio changing". Either from the default to a specified ratio, or from one ratio to another.
I think it's worth calling this out in the AEP
added more examples
tspearconquest left a comment
Thanks for tagging me in Slack to take a look, and for this AEP!
I added a few questions and ideas.
> ### Test Plan
>
> * Implement comprehensive unit tests to cover all new functionality.
> * e2e tests: TODO
Please test extremely high factors like 1 million in addition to more typical factors like 1, 5, 10.
why do we need to test something like this?
We don't necessarily need to, given the response above that limits aren't used for scheduling, but it might turn up something interesting in how Kubernetes handles it. You never know what bug might lurk with limits that have never been tested before.
@tspearconquest, what do you want to test here? If you want to test a pod with a factor of 1 million, that is unrelated to the VPA work. In the tests we need to make sure the VPA behaves as planned.
> what do you want to test here?

Integer overflows on the recommender status fields.

> if you wanna test a pod with 1 million factor this is unrelated to the VPA work.

Okay, no problem.

> In tests we need to make sure the VPA behaves as planned.

No argument here. I'm probably overthinking this, as I don't know Go code very well and don't have the level of understanding you all do with regards to how the recommender status fields and the VPA pods would work with a value exceeding the int64 limit of ~8 exbibytes when converted to an integer.
Yeah, valid points. In the HPA we do the same tests, but as unit tests.
I guess we can make some unit tests around that as well.
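For illustration, the kind of boundary case discussed here could be expressed like this (values are deliberately extreme and purely illustrative; field names follow the proposal above):

```yaml
requestToLimitRatio:
  memory:
    type: Factor
    # Deliberately extreme: a 1Gi request multiplied by this factor implies a
    # limit of roughly 1Pi, which exercises how very large quantities are
    # handled in recommendations and status fields.
    factor: 1000000
```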
omerap12 left a comment
I suggest we first agree on this AEP and then we can mark this for an api-review.
Is this step for the core maintainers, or can I help with it to move things forward somehow?

Once the maintainers are OK with that, we will move this to the api-machinery folks for review.
> ### Test Plan
>
> * Implement comprehensive unit tests to cover all new functionality.
> * e2e tests: TODO
Instead of leaving this as a TODO, please add some coverage here. The E2E test should be straightforward - just verify the limit across a few different scenarios. A couple of simple cases would be enough.
Done, added some e2e tests.
/ok-to-test
> A new `RequestToLimitRatio` field will be added, with the following sub-fields:
>
> * [Optional] `RequestToLimitRatio.CPU.Type` or `RequestToLimitRatio.Memory.Type` (type `string`): Specifies how to apply limits proportionally to the requests. `Type` can have the following values:
To be in line with the CPU Startup Boost feature, this needs to be a required field. See #8608
(Sorry for changing it out from under you)
no problem, updated
[APPROVALNOTIFIER] This PR is NOT APPROVED

This pull-request has been approved by: iamzili. The full list of commands accepted by this bot can be found here.
Needs approval from an approver in each of these files. Approvers can indicate their approval by writing `/approve` in a comment.
/api-review cancel

/remove-label api-review
omerap12 left a comment
A nit from me.
@adrianmoisey, if you are good with this we can mark it for an api-review.
Just a reminder for @adrianmoisey: according to Omer, the PR can be marked for an API review. Could you please double-check?
adrianmoisey left a comment
Overall, LGTM. I left a few minor nits.
I have another question: what happens if a user's Pod doesn't have limits set, but they have RequestToLimitRatio configured?
Should the VPA set a limit for them?
Or should the VPA ignore limits (much like it does now)?
In my opinion, once this question is answered, let's ask for API review (since this question may change the API, but none of my other nits will)
> ## Summary
>
> Currently, when VPA is configured to set both requests and limits automatically (i.e., `controlledValues` is set to `RequestsAndLimits` in the VerticalPodAutoscaler CRD), the limit is adjusted proportionally based on the initial request-to-limit ratio defined in workload API objects such as Deployments or StatefulSets - [Ref](https://github.com/kubernetes/autoscaler/blob/master/vertical-pod-autoscaler/docs/examples.md#keeping-limit-proportional-to-request).
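A small illustration of that existing proportional behavior (numbers are illustrative, not taken from the AEP):

```yaml
# Container as defined in the Pod spec: the request-to-limit ratio is 1:2.
resources:
  requests:
    cpu: 100m
  limits:
    cpu: 200m

# If the recommender later suggests a 250m CPU request, the VPA keeps the
# original 1:2 ratio and scales the limit proportionally.
resources:
  requests:
    cpu: 250m
  limits:
    cpu: 500m
```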
Nit: technically it's not the workload API that defines this, it's the Pod.
addressed in a commit
> Currently, when VPA is configured to set both requests and limits automatically (i.e., `controlledValues` is set to `RequestsAndLimits` in the VerticalPodAutoscaler CRD), the limit is adjusted proportionally based on the initial request-to-limit ratio defined in workload API objects such as Deployments or StatefulSets - [Ref](https://github.com/kubernetes/autoscaler/blob/master/vertical-pod-autoscaler/docs/examples.md#keeping-limit-proportional-to-request).
>
> If the request-to-limit ratio needs to be updated (for example, because the application's resource usage has changed), users must modify the `resources.requests` or `resources.limits` fields in the workload's API object. Since these fields are immutable, this change results in terminating and recreating the existing Pods.
These fields aren't technically immutable, hence being allowed to adjust them.
They are mutable, but since they change the Podspec, new Pods are created.
addressed in a commit
> If the request-to-limit ratio needs to be updated (for example, because the application's resource usage has changed), users must modify the `resources.requests` or `resources.limits` fields in the workload's API object. Since these fields are immutable, this change results in terminating and recreating the existing Pods.
>
> This proposal introduces a new mechanism that allows VPA users to adjust the request-to-limit ratio directly at the VPA CRD level. This mechanism can also be used to update existing workload + VPA configurations, for example to non-disruptively scale workloads beyond what their requests/limits would predict.
> for example to non-disruptively scale workloads beyond what their requests/limits would predict

I'm not too sure what this means, can you explain it?
Sure! My intention was to emphasize that by leveraging the InPlaceOrRecreate mode together with the new RequestToLimitRatio stanza, users may change the request-to-limit ratio without evicting Pods (i.e. in a non-disruptive way). Of course, this does not apply in every case; for example, as far as I know, decreasing a Pod's memory limit while its actual memory usage exceeds the new limit will simply cause the kubelet to reject the in-place update.
Should I be more specific about this in the AEP?
I don't think it's really worth mentioning.
The in-place AEP speaks about what happens during an in-place resize. All this AEP is doing is potentially changing the actual limit recommendation.
You convinced me, I removed that part
Oh, another consideration is the Pod's QoS. Generally we don't change the QoS; I also think that in-place resizing doesn't yet allow changing the QoS level.

True, we don't change the QoS.
Thanks for the review! Regarding your question, my personal opinion is that even if a Pod does not have a limit set, the VPA should set one when the new `RequestToLimitRatio` is configured for it.
This is my gut feel too. I think it's worth including in the AEP.
Yes, in-place updates do not allow changing a Pod's QoS, as this results in an error. This means that after implementing this AEP, when using the `InPlaceOrRecreate` mode, the Pod(s) will still need to be recreated if the new ratio changes the QoS class.
Yes, I will definitely mention this. @omerap12 what do you think?

Oh, I see, over here: https://github.com/kubernetes/autoscaler/pull/8516/files#diff-5540c5d94390214fe8bf4289331809652e480a07c1d29e6ac8087c21cd5d21a8R145-R146
FYI - I just updated that part slightly in a commit.

I concur :)

addressed in 3dd47b2
What type of PR is this?
/kind documentation
/kind feature
What this PR does / why we need it:
Autoscaling Enhancement Proposal (AEP) to add support for setting a custom request-to-limit ratio at the VPA object level.
I'd love to hear your thoughts on this feature.
#8515