Configurable DrainerTimeout #152
base: main
Conversation
Signed-off-by: vaspahomov <[email protected]>
[APPROVALNOTIFIER] This PR is NOT APPROVED

This pull-request has been approved by: vaspahomov. The full list of commands accepted by this bot can be found here.

Needs approval from an approver in each of these files. Approvers can indicate their approval by writing
Walkthrough

A new configurable drainer timeout is introduced for the NodeMaintenanceReconciler, replacing the previously hardcoded value. This timeout can now be set via a command-line flag and is passed through the main application, controller, and tests. Associated struct fields, function signatures, and test logic are updated to use this configurable value.
Sequence Diagram(s)

```mermaid
sequenceDiagram
    participant User
    participant Main
    participant NodeMaintenanceReconciler
    participant Drainer
    User->>Main: Start with --drainer-timeout flag
    Main->>NodeMaintenanceReconciler: Initialize with DrainerTimeout
    NodeMaintenanceReconciler->>Drainer: createDrainer(..., DrainerTimeout)
    Drainer-->>NodeMaintenanceReconciler: Drainer with configured timeout
```
Hi @vaspahomov. Thanks for your PR. I'm waiting for a medik8s member to verify that this patch is reasonable to test. If it is, they should reply with `/ok-to-test`. Once the patch is verified, the new status will be reflected by the `ok-to-test` label.

I understand the commands that are listed here. Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes-sigs/prow repository.
Actionable comments posted: 1
🔭 Outside diff range comments (1)
controllers/controllers_suite_test.go (1)
90-105: Compilation will break – `createDrainer` call missing the new `timeout` argument

`createDrainer` now requires three parameters `(ctx, cfg, timeout)`, but the test suite still calls it with only two. This will prevent the test package (and the whole build) from compiling.

```diff
- drainer, err = createDrainer(ctx, cfg)
+ // use the same timeout that the reconciler under test is configured with
+ drainer, err = createDrainer(ctx, cfg, r.DrainerTimeout)
```
♻️ Duplicate comments (1)
controllers/nodemaintenance_controller.go (1)
126-129: Error handling path depends on valid timeout

Because the timeout is injected here, any invalid (≤ 0) value slips straight into the helper and disables the safety net. This ties back to the validation suggestion above.
🧹 Nitpick comments (3)
controllers/nodemaintenance_controller.go (1)
71-72: Zero/negative timeout leaves drain running indefinitely – validate DrainerTimeout

A value ≤ 0 disables the timeout according to `kubectl/drain.Helper` semantics. Unless you explicitly want to allow “infinite” drains, add a guard that falls back to `DefaultDrainerTimeout` and logs a warning when an invalid value is provided.

```diff
+ if r.DrainerTimeout <= 0 {
+     r.logger.Info("Invalid drainer timeout requested – falling back to default",
+         "requested", r.DrainerTimeout, "default", DefaultDrainerTimeout)
+     r.DrainerTimeout = DefaultDrainerTimeout
+ }
```

main.go (1)
84-86: Add basic flag validation for `--drainer-timeout`

Users can now set `--drainer-timeout=0` or a negative duration, unintentionally disabling the timeout. Fail fast (or at least warn) during startup:

```diff
 flag.DurationVar(&drainerTimeout, "drainer-timeout", controllers.DefaultDrainerTimeout, "Timeout for draining a node.")
+flag.Parse()
+
+if drainerTimeout <= 0 {
+    setupLog.Error(fmt.Errorf("invalid drainer-timeout"), "timeout must be > 0")
+    os.Exit(1)
+}
-
-flag.Parse()
```

test/e2e/node_maintenance_test.go (1)
345-346: Minor readability nit – cast after addition

The current expression is correct, but casting the whole sum avoids precedence surprises and makes the intent clearer:

```diff
- TerminationGracePeriodSeconds: ptr.To[int64](int64(nodemaintenance.DefaultDrainerTimeout.Seconds()) + 50),
+ TerminationGracePeriodSeconds: ptr.To[int64](int64(nodemaintenance.DefaultDrainerTimeout.Seconds()+50)),
```
📜 Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (4)
controllers/controllers_suite_test.go (1 hunks)
controllers/nodemaintenance_controller.go (5 hunks)
main.go (5 hunks)
test/e2e/node_maintenance_test.go (1 hunks)
🧰 Additional context used
🧬 Code Graph Analysis (3)
test/e2e/node_maintenance_test.go (1)
controllers/nodemaintenance_controller.go (1)
DefaultDrainerTimeout(59-59)
controllers/controllers_suite_test.go (1)
controllers/nodemaintenance_controller.go (1)
DefaultDrainerTimeout(59-59)
controllers/nodemaintenance_controller.go (1)
vendor/k8s.io/kubectl/pkg/drain/drain.go (1)
Helper(51-96)
IIUC, your suggestion follows the general line of #122
slintes left a comment:
Pretty straight forward, works for me.
@mshitrit thoughts?
```go
		"Enable leader election for controller manager. "+
			"Enabling this will ensure there is only one active controller manager.")
	flag.BoolVar(&enableHTTP2, "enable-http2", false, "If HTTP/2 should be enabled for the metrics and webhook servers.")
	flag.DurationVar(&drainerTimeout, "drainer-timeout", controllers.DefaultDrainerTimeout, "Timeout for draining a node.")
```
would it make sense to check for reasonable values?
Probably not relevant to this PR, but this looks to me like something that would make sense to put in a configuration, no?
That would also simplify value validation.
Signed-off-by: vaspahomov <[email protected]>
@vaspahomov is this PR replacing #151?
It's better to have them both. Both PRs are related to speeding up simultaneous
/test 4.18-openshift-e2e |
@vaspahomov Hi, how do you install NMO, with OLM? I'm wondering how to use such new flags without further modifications to the bundle? 🤔 We can set env vars in the Subscription, but I'm not aware how to modify the command...
Why we need this PR
The idea is the same as in #151.
Here we are trying to lower the maximum duration of a single reconcile.
Changes made
Made DrainerTimeout configurable.
Which issue(s) this PR fixes
Test plan
Summary by CodeRabbit
New Features
Tests