Commit c05af1d
feat(api,ui,sdk): Make CPU limits configurable (#586)
# Description
Currently, users cannot configure the CPU limits of the pods in which
Merlin models and transformers are deployed. These limits are instead
determined automatically at the platform level (by the Merlin API
server). Depending on how the API server has been configured, one of the
following happens:
- the CPU limit of a model is set to its CPU request value multiplied
  by a [scaling factor](https://github.com/caraml-dev/merlin/blob/f1ebe099ea168988b365ee72ce08543b127826e1/api/config/config.go#L364)
  (e.g. 2 CPU * 1.5), **or**
  - Note that this is how the Merlin API server already sets memory
    limits automatically
- the CPU limit is left unset
  - Note that because KServe does not currently allow CPU limits to be
    completely unset, the Merlin API server sets an [arbitrary value](https://github.com/caraml-dev/merlin/blob/f1ebe099ea168988b365ee72ce08543b127826e1/api/config/config.go#L363)
    (ideally one that is very large) as the CPU limit instead
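
To make the resolution order concrete, here is a minimal Python sketch of how a CPU limit could be chosen once a user-supplied override is taken into account. It is not the Merlin API server's actual code (which is written in Go); `parse_cpu`, `resolve_cpu_limit` and all parameter names are hypothetical, and treating a scaling factor of 0 as the "unset" case is an assumption based on the templater change listed under Modifications.

```python
from decimal import Decimal
from typing import Optional


def parse_cpu(quantity: str) -> Decimal:
    """Parse a Kubernetes-style CPU quantity ("500m", "0.5", "2") into cores."""
    if quantity.endswith("m"):
        # "m" means millicores: 500m == 0.5 cores
        return Decimal(quantity[:-1]) / Decimal(1000)
    return Decimal(quantity)


def resolve_cpu_limit(
    cpu_request: str,
    user_cpu_limit: Optional[str],
    scaling_factor: Decimal,
    fallback_limit: str,
) -> Decimal:
    """Hypothetical helper: pick the CPU limit (in cores) for a deployment."""
    # 1. A limit supplied by the user overrides the platform-level behaviour.
    if user_cpu_limit:
        return parse_cpu(user_cpu_limit)
    # 2. Otherwise scale the request, mirroring how memory limits are derived.
    if scaling_factor > 0:
        return parse_cpu(cpu_request) * scaling_factor
    # 3. Assumed "unset" case: KServe needs some value, so use a large fallback.
    return parse_cpu(fallback_limit)
```

For example, `resolve_cpu_limit("2", None, Decimal("1.5"), "8")` yields 3 cores, matching the `2 CPU * 1.5` case above, while passing `user_cpu_limit="2"` returns 2 cores regardless of the scaling factor.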
This PR introduces a new workflow that allows users to override the
platform-level CPU limits (described above) set on a model. This
workflow is available via the UI, the SDK and, by extension, direct
calls to the API server's endpoint.
UI:


SDK:
```python
merlin.deploy(
    version_1,
    resource_request=merlin.ResourceRequest(
        min_replica=0,
        max_replica=0,
        cpu_request="0.5",
        cpu_limit="2",  # overrides the platform-level CPU limit
        memory_request="1Gi",
    ),
)
```
In addition, this PR adds a new configuration,
`DefaultEnvVarsWithoutCPULimits`, which is a list of env vars that
automatically get added to all Merlin models and transformers when CPU
limits are not set. This allows the Merlin API server's operators to set
env vars platform-wide that can potentially improve these deployments'
performance, e.g. env vars involving concurrency.
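
As a rough illustration of this behaviour, the Python sketch below merges platform-wide defaults into a deployment's env vars only when no CPU limit applies. It is an assumption-laden sketch, not the templater's Go implementation: `build_env_vars`, its parameters, and the `WORKERS` variable are hypothetical, and letting user-defined env vars take precedence over the defaults is assumed rather than confirmed by this PR.

```python
from typing import Dict, Optional


def build_env_vars(
    user_env_vars: Dict[str, str],
    user_cpu_limit: Optional[str],
    cpu_limit_scaling_factor: float,
    default_env_vars_without_cpu_limits: Dict[str, str],
) -> Dict[str, str]:
    """Hypothetical helper: compute the env vars for a model or transformer."""
    env: Dict[str, str] = {}
    # Defaults apply only when no CPU limit is in effect: the user did not
    # set one and the platform scaling factor is 0.
    if not user_cpu_limit and cpu_limit_scaling_factor == 0:
        env.update(default_env_vars_without_cpu_limits)
    # Assumption: env vars set by the user override the platform defaults.
    env.update(user_env_vars)
    return env


# Example: an operator-defined concurrency default (hypothetical name).
defaults = {"WORKERS": "2"}
print(build_env_vars({"LOG_LEVEL": "info"}, None, 0, defaults))
```

With a user-specified CPU limit, or a non-zero scaling factor, the same call returns only the user's own env vars.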
# Modifications
- `api/cluster/resource/templater.go` - Refactoring of templater methods
  to set default env vars when CPU limits are not explicitly set and
  the CPU limit scaling factor is set to 0
- `api/config/config.go` - Addition of the new field
  `DefaultEnvVarsWithoutCPULimits`
- `api/config/config_test.go` - Addition of a new unit test to test the
  parsing of configs from .yaml files
- `docs/user/templates/model_deployment/01_deploying_a_model_version.md` -
  Addition of docs to demonstrate how the platform-level CPU limits can
  be overridden
- `python/sdk/merlin/resource_request.py` - Addition of a new CPU limit
  field to the resource request class
- `ui/src/pages/version/components/forms/components/CPULimitsFormGroup.js` -
  Addition of a new form group to allow CPU limits to be specified on
  the UI
# Tests
- [x] Deploying existing models (and transformers) with and without CPU
limits set
# Checklist
- [x] Added PR label
- [x] Added unit test, integration, and/or e2e tests
- [x] Tested locally
- [x] Updated documentation
- [x] Updated Swagger spec if the PR introduces API changes
- [x] Regenerated Golang and Python client if the PR introduces API
changes
# Release Notes
```release-note
NONE
```
1 parent d948254 commit c05af1d
File tree
39 files changed, +1310 -184 lines changed
- .github/workflows
- codecov-config
- api
- client
- cluster/resource
- config
- testdata
- models
- docs
- images
- user
- generated/model_deployment
- templates/model_deployment
- python/sdk
- client/models
- merlin
- test
- ui/src
- components
- pages/version/components/forms
- components
- steps
- validation
- services
- transformer
- version_endpoint