Run scheduler predicates in parallel #8729
Conversation
```go
forceDeleteFailedNodes          = flag.Bool("force-delete-failed-nodes", false, "Whether to enable force deletion of failed nodes, regardless of the min size of the node group they belong to.")
enableDynamicResourceAllocation = flag.Bool("enable-dynamic-resource-allocation", false, "Whether logic for handling DRA (Dynamic Resource Allocation) objects is enabled.")
clusterSnapshotParallelism      = flag.Int("cluster-snapshot-parallelism", 16, "Maximum parallelism of cluster snapshot creation.")
predicateParallelism            = flag.Int("predicate-parallelism", 4, "Maximum parallelism of scheduler predicate checking.")
```
should we fail if someone passes in 0 (and/or should we enforce an upper boundary?)
Good idea - added validation for lower bound. For upper bound I don't see the need to add an artificial limit, so not checking it.
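For illustration, a minimal sketch of such a lower-bound check, assuming the flag definition quoted above; the actual validation wiring in this PR may live elsewhere and report errors differently:

```go
package main

import (
	"flag"
	"fmt"
	"os"
)

var predicateParallelism = flag.Int("predicate-parallelism", 4, "Maximum parallelism of scheduler predicate checking.")

// validateFlags sketches the lower-bound check discussed above; only a
// minimum is enforced, since an artificial upper limit adds no value.
func validateFlags() error {
	if *predicateParallelism < 1 {
		return fmt.Errorf("--predicate-parallelism must be at least 1, got %d", *predicateParallelism)
	}
	return nil
}

func main() {
	flag.Parse()
	if err := validateFlags(); err != nil {
		fmt.Fprintln(os.Stderr, err)
		os.Exit(1)
	}
}
```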
Setting this initially to 4 goroutines. Setting parallelism higher than 4 seems to yield diminishing returns at this point:

```
$ go test -bench=BenchmarkRunFiltersUntilPassingNode ./simulator/clustersnapshot/predicate/
(...)
BenchmarkRunFiltersUntilPassingNode/parallelism-1-16     141    8206978 ns/op
BenchmarkRunFiltersUntilPassingNode/parallelism-2-16     153    7123724 ns/op
BenchmarkRunFiltersUntilPassingNode/parallelism-4-16     183    6997209 ns/op
BenchmarkRunFiltersUntilPassingNode/parallelism-8-16     174    7161056 ns/op
BenchmarkRunFiltersUntilPassingNode/parallelism-16-16    178    7068643 ns/op
```

This is because the function is currently dominated by ListNodeInfos, which causes frequent memory allocation during WrapSchedulerNodeInfo calls. Since NodeInfo is now an interface, we should be able to avoid costly object wrapping on listing, at which point it may make sense to bump this parallelism further.
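For context, here is a minimal sketch of the general "fan predicate checks out over N goroutines and take the first passing node" pattern, assuming client-go's `workqueue.ParallelizeUntil` helper; the actual implementation in this PR lives in SchedulerPluginRunner and may be structured differently:

```go
package main

import (
	"context"
	"fmt"
	"sync/atomic"

	"k8s.io/client-go/util/workqueue"
)

// checkNode stands in for running all scheduler Filter plugins against
// one candidate node; the real code calls into the scheduler framework.
func checkNode(i int) bool {
	return i%7 == 3 // placeholder predicate
}

// firstPassingNode fans predicate checks out over `parallelism`
// goroutines and returns the lowest-index node that passes.
func firstPassingNode(numNodes, parallelism int) (int, bool) {
	best := int64(numNodes) // lowest passing index found so far
	workqueue.ParallelizeUntil(context.Background(), parallelism, numNodes, func(i int) {
		if !checkNode(i) {
			return
		}
		// Atomically keep the smallest passing index seen by any worker.
		for {
			cur := atomic.LoadInt64(&best)
			if int64(i) >= cur || atomic.CompareAndSwapInt64(&best, cur, int64(i)) {
				return
			}
		}
	})
	if idx := atomic.LoadInt64(&best); idx < int64(numNodes) {
		return int(idx), true
	}
	return 0, false
}

func main() {
	fmt.Println(firstPassingNode(100, 4)) // 3 true
}
```

Note that this sketch checks every node rather than stopping early; cancelling as soon as a match is found is possible, but needs care not to skip a lower-index node still in flight.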
```go
	return NewSchedulerPluginRunner(fwHandle, snapshot, 1), snapshot, nil
}

func BenchmarkRunFiltersUntilPassingNode(b *testing.B) {
```
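For reference, sub-benchmark names like `parallelism-4` in the output above are typically produced with `b.Run` in a loop. This is a sketch of that shape only; the real benchmark body builds a snapshot and plugin runner, and the `runFilters` helper here is hypothetical:

```go
package predicate

import (
	"fmt"
	"testing"
)

// runFilters is a hypothetical stand-in for exercising the plugin
// runner at a given parallelism level.
func runFilters(parallelism int) {}

func BenchmarkRunFiltersUntilPassingNode(b *testing.B) {
	for _, parallelism := range []int{1, 2, 4, 8, 16} {
		b.Run(fmt.Sprintf("parallelism-%d", parallelism), func(b *testing.B) {
			for i := 0; i < b.N; i++ {
				runFilters(parallelism)
			}
		})
	}
}
```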
Do we want to add something like `go test -run=^$ -bench=. ./...` to a make target so we can start getting visibility into benchmark tests?
`make benchmark`? Yeah, I think that makes sense. Will follow up with a PR.
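A sketch of what such a target could look like (the `benchmark` name comes from the comment above; the actual follow-up PR may differ). Note that `$` must be doubled inside a Makefile:

```makefile
.PHONY: benchmark
benchmark:
	go test -run='^$$' -bench=. ./...
```

The `-run='^$'` regex matches no unit tests, so only benchmarks execute.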
jackfrancis left a comment:
/lgtm
/approve
Ran benchmarks manually
[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: jackfrancis, x13n

The full list of commands accepted by this bot can be found here. The pull request process is described here.

Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing `/approve` in a comment.
What type of PR is this?
What this PR does / why we need it:
Performance improvement. Until now, all scheduler predicates were executed in a single thread.
Which issue(s) this PR fixes:
Fixes #
Special notes for your reviewer:
Does this PR introduce a user-facing change?
Additional documentation e.g., KEPs (Kubernetes Enhancement Proposals), usage docs, etc.: