Open
Labels
area/ux (In pursuit of a delightful user experience), enhancement (New feature or request)
Description
Whenever the image-reflector-controller pod shuts down, for any reason, it does not wait for running reconciliations to finish. Instead, it immediately cancels the contexts of those reconciliations.
The running reconciliations then fail and return an error, which is propagated to every channel defined in the notification controller and generates a lot of noise.
Ideally, running reconciliations should be allowed to finish. At the very least, these expected context-cancellation errors during shutdown should not be propagated to the notification channels.
Here's an example log of the issue:
{"level":"info","ts":"2024-01-11T11:41:01.305Z","msg":"Stopping and waiting for non leader election runnables"}
{"level":"info","ts":"2024-01-11T11:41:01.312Z","msg":"Stopping and waiting for leader election runnables"}
{"level":"info","ts":"2024-01-11T11:41:01.324Z","msg":"All workers finished","controller":"imagepolicy","controllerGroup":"image.toolkit.fluxcd.io","controllerKind":"ImagePolicy"}
{"level":"info","ts":"2024-01-11T11:41:01.312Z","msg":"Shutdown signal received, waiting for all workers to finish","controller":"imagepolicy","controllerGroup":"image.toolkit.fluxcd.io","controllerKind":"ImagePolicy"}
{"level":"info","ts":"2024-01-11T11:41:01.312Z","msg":"Shutdown signal received, waiting for all workers to finish","controller":"imagerepository","controllerGroup":"image.toolkit.fluxcd.io","controllerKind":"ImageRepository"}
{"level":"error","ts":"2024-01-11T11:41:01.332Z","msg":"failed to configure authentication options: operation error ECR: GetAuthorizationToken, get identity: get credentials: request canceled, context canceled","controller":"imagerepository","controllerGroup":"image.toolkit.fluxcd.io","controllerKind":"ImageRepository","ImageRepository":{"name":"**REDACTED**","namespace":"**REDACTED**"},"namespace":"**REDACTED**","name":"**REDACTED**","reconcileID":"64c8c4f2-e200-4466-bb1a-0059375fc0d6","error":"AuthenticationFailed"}
{"level":"info","ts":"2024-01-11T11:41:01.338Z","msg":"Warning: Reconciler returned both a non-zero result and a non-nil error. The result will always be ignored if the error is non-nil and the non-nil error causes reqeueuing with exponential backoff. For more details, see: https://pkg.go.dev/sigs.k8s.io/controller-runtime/pkg/reconcile#Reconciler","controller":"imagerepository","controllerGroup":"image.toolkit.fluxcd.io","controllerKind":"ImageRepository","ImageRepository":{"name":"**REDACTED**","namespace":"**REDACTED**"},"namespace":"**REDACTED**","name":"**REDACTED**","reconcileID":"9f627893-53d1-4bdc-acac-a0b3099216e5"}
{"level":"error","ts":"2024-01-11T11:41:01.339Z","msg":"Reconciler error","controller":"imagerepository","controllerGroup":"image.toolkit.fluxcd.io","controllerKind":"ImageRepository","ImageRepository":{"name":"**REDACTED**","namespace":"**REDACTED**"},"namespace":"**REDACTED**","name":"**REDACTED**","reconcileID":"9f627893-53d1-4bdc-acac-a0b3099216e5","error":"[Patch \"https://10.16.0.1:443/apis/image.toolkit.fluxcd.io/v1beta2/namespaces/**REDACTED**/imagerepositories/**REDACTED**/status?fieldManager=image-reflector-controller\": context canceled, context canceled]","errorCauses":[{"error":"[Patch \"https://10.16.0.1:443/apis/image.toolkit.fluxcd.io/v1beta2/namespaces/**REDACTED**/imagerepositories/**REDACTED**/status?fieldManager=image-reflector-controller\": context canceled, context canceled]","errorCauses":[{"error":"Patch \"https://10.16.0.1:443/apis/image.toolkit.fluxcd.io/v1beta2/namespaces/**REDACTED**/imagerepositories/**REDACTED**/status?fieldManager=image-reflector-controller\": context canceled"},{"error":"context canceled"}]}]}
{"level":"error","ts":"2024-01-11T11:41:01.373Z","msg":"Reconciler error","controller":"imagerepository","controllerGroup":"image.toolkit.fluxcd.io","controllerKind":"ImageRepository","ImageRepository":{"name":"**REDACTED**","namespace":"**REDACTED**"},"namespace":"**REDACTED**","name":"**REDACTED**","reconcileID":"64c8c4f2-e200-4466-bb1a-0059375fc0d6","error":"[failed to configure authentication options: operation error ECR: GetAuthorizationToken, get identity: get credentials: request canceled, context canceled, context canceled]","errorCauses":[{"error":"failed to configure authentication options: operation error ECR: GetAuthorizationToken, get identity: get credentials: request canceled, context canceled"},{"error":"context canceled","errorCauses":[{"error":"context canceled"},{"error":"context canceled"}]}]}
{"level":"info","ts":"2024-01-11T11:41:01.375Z","msg":"All workers finished","controller":"imagerepository","controllerGroup":"image.toolkit.fluxcd.io","controllerKind":"ImageRepository"}
{"level":"info","ts":"2024-01-11T11:41:01.375Z","msg":"Stopping and waiting for caches"}
{"level":"info","ts":"2024-01-11T11:41:01.384Z","msg":"Stopping and waiting for webhooks"}
{"level":"info","ts":"2024-01-11T11:41:01.384Z","msg":"Stopping and waiting for HTTP servers"}
{"level":"info","ts":"2024-01-11T11:41:01.384Z","msg":"shutting down server","kind":"health probe","addr":"[::]:9440"}
{"level":"info","ts":"2024-01-11T11:41:01.389Z","logger":"controller-runtime.metrics","msg":"Shutting down metrics server with timeout of 1 minute"}
{"level":"info","ts":"2024-01-11T11:41:01.396Z","msg":"Wait completed, proceeding to shutdown the manager"}
{"level":"error","ts":"2024-01-11T11:41:01.408Z","msg":"error received after stop sequence was engaged","error":"leader election lost"}