-
Notifications
You must be signed in to change notification settings - Fork 156
ROX-29674: Sync caBundle changes to ValidatingWebhookConfiguration #15706
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: master
Are you sure you want to change the base?
Conversation
Skipping CI for Draft Pull Request. |
fd7db5a
to
4abe2bb
Compare
Images are ready for the commit at 35f8e5e. To use with deploy scripts, first |
Codecov ReportAttention: Patch coverage is
Additional details and impacted files@@ Coverage Diff @@
## master #15706 +/- ##
========================================
Coverage 48.69% 48.69%
========================================
Files 2602 2605 +3
Lines 191502 191740 +238
========================================
+ Hits 93249 93365 +116
- Misses 90923 91041 +118
- Partials 7330 7334 +4
Flags with carried forward coverage won't be shown. Click here to find out more. ☔ View full report in Codecov by Sentry. 🚀 New features to boost your workflow:
|
ec4110a
to
5e83358
Compare
5e83358
to
747a358
Compare
Caution There are some errors in your PipelineRun template.
|
a9c4842
to
8787e47
Compare
bb05024
to
31a2e1e
Compare
f5dfb2e
to
a0cc8ce
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
How difficult would it be to change the caching to look at an additional label as an alternative to the current one? The current approach of using the same label will probably work fine but I think a separate one would reduce confusion later....
// This is needed so that the Operator can update the ValidatingWebhookConfiguration's caBundle field. | ||
caBundle, err := t.getCABundleFromConfigMap(ctx, sc) | ||
if err != nil { | ||
t.logger.Error(err, "failed to get CA bundle from ConfigMap", "configMap", securedcluster.CABundleConfigMapName) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think the errors should be propagated rather than ignored. Otherwise the bundle might get misconfigured on transient client.Get
errors for example 🤔
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
You're right. My thinking was that since this ConfigMap is "optional" then if it can't be read we should just use the fallback CA from the TLS secrets. But my approach might actually cause admission-control to stop working temporarily if there are transient errors.
} | ||
|
||
func (p *CreateOrUpdateWithNamePredicate[T]) Delete(_ event.TypedDeleteEvent[T]) bool { | ||
return false |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Maybe it would be better to trigger on deletions too? A deletion will be processed eventually (when operator restarts), so ignoring it here seems a bit artificial.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It's true that deletion will be processed eventually, but I would tend towards not implementing support in the predicate. Because deletion is not part of the workflow, I think that the current approach communicates intent better.
95aab6d
to
204e840
Compare
|
||
caBundlePEM, ok := configMap.Data[caBundleKey] | ||
if !ok { | ||
return "", errors.Errorf("key %q not found in ConfigMap %s", caBundleKey, key) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Was wondering if it would make sense to treat this case the same as "config map not found". 🤷
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'd say no, because "not found" is a valid state (e.g. Sensor didn't create the ConfigMap yet). The ConfigMap existing but without the key is not a valid state.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Fine. I guess I was just looking at it from a defensive programming direction (be strict in what you publish, be permissive in what you consume).
Feel free to resolve.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think it's a good point, but I would still be wary in this particular case. Being permissive here will result in falling back to the original CA, and that could actually cause break policy enforcement in admission-controller during CA rotation.
}{ | ||
{ | ||
name: "ConfigMap does not exist should return empty string without error", | ||
setupClient: func(t *testing.T) ctrlClient.Client { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
What about reorganizing this a bit with the goal of making the test a bit concise:
Couldn't we just add something like a []&v1.ConfigMap
to the test-case struct and instantiate the fake client builder with these objects pre-created in Run
below (~ L1305)?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Done
@@ -1222,3 +1223,197 @@ func createSecret(name string) *v1.Secret { | |||
}, | |||
} | |||
} | |||
|
|||
func TestGetCABundleFromConfigMap(t *testing.T) { | |||
const testNamespace = "stackrox" |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
How about using a non-standard namespace to make sure that there is no implicit relying on the stackrox
namespace during the fetching of the CM?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I changed the namespace in this test and it works, but in the other test that I added (TestTranslateWithCABundle
) it doesn't. That's because all the existing helper functions in this file are hardcoding the stackrox
namespace.
Should I make it configurable? It feels like an useful change, but I don't like to mix unrelated changes in the same PR.
f4f2ef0
to
204e840
Compare
It doesn't seem easy. The Something that would work is to use the same key, e.g. "app.stackrox.io/managed-by" in ("operator", "sensor") and then the Operator would also cache resources managed by Sensor. WDYT? (there's a caveat, Sensor currently uses |
@vladbologa: The following tests failed, say
Full PR test history. Your PR dashboard. Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes-sigs/prow repository. I understand the commands that are listed here. |
LOL, nothing is ever easy 😄 Thank you for the investigation! |
How about using |
Sorry, I'm confused. Which component would set which labels on which resources, and how would the label selector for operator caching look like then? 🤔 |
In my initial approach, I was proposing that Sensor would put the So the alternative is that Sensor sets req, err := labels.NewRequirement(
"app.stackrox.io/managed-by",
selection.In,
[]string{"operator", "sensor"},
)
cacheLabelSelector := labels.NewSelector().Add(*req) I think this should work, because it's the same label key. |
Description
This PR implements an Operator-side mechanism that will enable Sensor to propagate a CA bundle to the
ValidatingWebhookConfiguration
ofadmission-control
in a Secured Cluster, to support CA rotation.Context
The
admission-control
service acts as a KubernetesValidatingWebhook
. The Kubernetes API server needs to trust the TLS certificate presented byadmission-control
, and this trust is established via thecaBundle
field in theValidatingWebhookConfiguration
resource.During CA rotation, Sensor obtains new TLS certificates from Central that were signed using a new CA. These certificates are stored as secrets that the services can access. In addition, it also needs to update the
caBundle
field of theValidatingWebhookConfiguration
.Why this PR is needed
ValidatingWebhookConfiguration
resource, because it is managed by the OperatorValidatingWebhookConfiguration
should learn about a new CA beforeadmission-control
starts using leaf certificates signed by itProposed solution
Since Sensor cannot modify
ValidatingWebhookConfiguration
, it will instead store the CA bundle in a ConfigMap (this is not implemented here).The Operator will then watch this ConfigMap, and when it appears or it gets updated, it updates the
caBundle
of theValidatingWebhookConfiguration
accordingly.Notes
tls-ca-bundle
ConfigMap)app.stackrox.io/watched-by: operator
label, so that the Operator adds it to its resource cacheUser-facing documentation
Testing and quality
Automated testing
How I validated my change
Deployed in an infra cluster with
make -C operator deploy-via-olm
.Installed Central and added a Secured Cluster.
Created a test ConfigMap in the namespace and verified that changes to the
tls-ca-bundle
ConfigMap propagate to the ValidatingWebhookConfiguration resource.