-
Notifications
You must be signed in to change notification settings - Fork 128
Support namespaced {Sriov,SriovIB,OVS}Networks #894
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Support namespaced {Sriov,SriovIB,OVS}Networks #894
Conversation
|
Thanks for your PR,
To skip the vendors CIs, Maintainers can use one of:
|
000da88 to
53198d5
Compare
Pull Request Test Coverage Report for Build 17231697992Details
💛 - Coveralls |
a169538 to
cd2efd9
Compare
cd2efd9 to
fe1f3a0
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Pull Request Overview
This PR enhances the support for namespaced XNetwork resources by allowing network objects (SriovNetwork, SriovIBNetwork, OVSNetwork) to be created in namespaces other than the operator’s, enforcing constraints via a validating webhook and related tests. Key changes include:
- Refactoring network node policy tests by moving them to a dedicated file.
- Adding webhook validation and tests for the .Spec.NetworkNamespace field on network objects.
- Updating controller reconciliation logic to handle resources in non-operator namespaces and improve finalizer handling.
- Minor adjustments in logging and manifests for webhook configuration.
Reviewed Changes
Copilot reviewed 10 out of 10 changed files in this pull request and generated 1 comment.
Show a summary per file
| File | Description |
|---|---|
| test/conformance/tests/test_sriov_operator.go | Removed redundant “No SriovNetworkNodePolicy” tests to be replaced by a separate file. |
| test/conformance/tests/test_no_policy.go | Added dedicated tests for validating network policies and namespace constraints. |
| pkg/webhook/webhook.go | Implemented validation cases for the three network types using json.Unmarshal. |
| pkg/webhook/validate_networks*.go, validate_networks_test.go | Introduced functions and tests for enforcing .Spec.NetworkNamespace constraints. |
| controllers/generic_network_controller.go | Updated reconciliation logic to fetch objects from both operator and resource namespaces, update finalizers appropriately, and set owner references when applicable. |
| controllers/sriovnetwork_controller_test.go | Added new test cases to verify proper behavior when network objects are created in non-operator namespaces. |
| cmd/webhook/main.go & bindata/manifests/operator-webhook/003-webhook.yaml | Adjusted logging initialization and expanded webhook configuration for new resource types. |
Comments suppressed due to low confidence (1)
test/conformance/tests/test_no_policy.go:182
- The parameter name 'webookEnabled' appears to be a typo; consider renaming it to 'webhookEnabled' for improved clarity.
DescribeTable("should gracefully restart quickly", func(webookEnabled bool) {
SchSeba
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
great work!
I left some small comments
| operatorNamespacedName := req.NamespacedName | ||
| operatorNamespacedName.Namespace = vars.Namespace | ||
|
|
||
| err = r.Get(ctx, operatorNamespacedName, instance) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
why we need to also check in the operator namespace?
can you please add a comment to explain why so I will remember next time I look at this code :P
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
this is because we reconcile NetworkAttachmentDefinitions updates, and here we have to fetch the source object.
the cases are
|| Reconciled object || source object ||
| netAttachDef(nsX/net1) | SriovNetwork(operatorNs/net1) with networkNamespace: nsX |
| netAttachDef(nsX/net1) | SriovNetwork(nsX/net1) |
| SriovNetwork(operatorNs/net1) | SriovNetwork(operatorNs/net1) |
| SriovNetwork(nsX/net1) | SriovNetwork(nsX/net1) |
adding a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
can we handle this case in an eventHandler registered in SetupWithManager ?
so instead of using handler.EnqueuRequestForObject{} we would implement our own with handler.EnqueueRequestFromMapFn there we can try to get the "source" object from current net-attach-def namespace or from the operator ns.
then Reconcile would get the correct object it needs to reconcile, WDYT ?
|
|
||
| func validateNetworkNamespace(cr controllers.NetworkCRInstance) error { | ||
| if cr.GetNamespace() != vars.Namespace && cr.NetworkNamespace() != "" { | ||
| return fmt.Errorf(".Spec.NetworkNamespace field can't be specified if the resource is not in the %s namespace", vars.Namespace) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
also do we want to block the creation there is already a network from the operator namespace with the same name?
or if there already a net-attach-def on the requested namespace?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
That's a little bit debatable: having two or more objects contending the same NetAttachDef is handled in
It's the same problem, viewed from the perspective of the webhook, and we already have the same with an SriovNetwork + SriovIBNetwork pointing to the same object.
I prefer to address it in #897 or in another subsequent PR, after discussing in a community meeting. WDYT?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
works for me
| }) | ||
| }) | ||
|
|
||
| DescribeTable("should gracefully restart quickly", func(webookEnabled bool) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is the one that I removed so can you leave it in the original place?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
To leave it where it was, I have to put back all the No SriovNetworkPolicy test tree. We'll get a conflict in any case and I'll solve it. Let's merge
before this
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
now that 901 is merged I think we can remove this one
| }) | ||
|
|
||
| Context("Namespaced network objects", func() { | ||
| BeforeAll(func() { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
nit: can you remove this empty one?
| ) | ||
| }) | ||
| }) | ||
| }) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
can you add a check here that if network in the operator namespace exist it should win always ones created by the user on it's own namespace
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This case is handled in
but no such priority has been implemented yet. That PR handles these collisions by making the first object win.
for example, if a namespaced network is created, then the created netattachdef has the owner annotation. When the operatornamespaced network is created, it finds the target netattachdefs already owned by another object and the controller does nothing.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
sounds good
fe1f3a0 to
1706fc6
Compare
| defer func(previous string) { vars.Namespace = previous }(vars.Namespace) | ||
| vars.Namespace = "operator-namespace" | ||
|
|
||
| validNetworkNamespaces := []controllers.NetworkCRInstance{ |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
any chance to have a "[]tcase" (list of test case struct) which contains the test name, the NetworkCRInstance and a shouldFail boolean ? then the test would iterate over entries , run validateNetworkNamespace and compare to expected result ?
| if instance.NetworkNamespace() != "" && instance.GetNamespace() != vars.Namespace { | ||
| reqLogger.Error( | ||
| fmt.Errorf("bad value for NetworkNamespace"), | ||
| ".Spec.NetworkNamespace can't be specified if the resource belongs to a namespace other than the operator's", |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
nit do we want to specify "jsonpath" style field refs ?
e.g .spec.networkNamespace, .metadata.namespace ?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
yep, that would be better for the user point of view
| operatorNamespacedName := req.NamespacedName | ||
| operatorNamespacedName.Namespace = vars.Namespace | ||
|
|
||
| err = r.Get(ctx, operatorNamespacedName, instance) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
can we handle this case in an eventHandler registered in SetupWithManager ?
so instead of using handler.EnqueuRequestForObject{} we would implement our own with handler.EnqueueRequestFromMapFn there we can try to get the "source" object from current net-attach-def namespace or from the operator ns.
then Reconcile would get the correct object it needs to reconcile, WDYT ?
SchSeba
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
from my side LGTM.
to me it's not an hard request to have the object retreval in the handler, but it's nice to have.
1706fc6 to
d38c8dc
Compare
|
I added the handleNetAttachDef function and it actually made the logic a little straightforward. @adrianchiris @SchSeba please take a look |
SchSeba
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I link this new approach!
I learned something new :)
LGTM
|
Hi @zeeke, our QE did the following test on this PR
And on the last step when he check the NAD/net1 the owner reference was removed. So we probably have an issue with the update. |
| return reconcile.Result{}, nil | ||
| } | ||
|
|
||
| if instance.NetworkNamespace() != "" && instance.GetNamespace() != vars.Namespace { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
do we want to add error field into *NetworkStatus objects?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It's a good idea, but I would discuss it after
Maybe the conditions on CRDs are enough. What do you think?
fixed. @adrianchiris @e0ne @SchSeba please, take another look |
b61c76b to
c0347db
Compare
|
@adrianchiris @e0ne can you please take another look at this PR? |
adrianchiris
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
final nits otherwise LGTM
| // Request object not found, could have been deleted after reconcile request. | ||
| // Owned objects are automatically garbage collected. For additional cleanup logic use finalizers. | ||
| // Return and don't requeue | ||
| reqLogger.Error(err, "XXX2") |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
please remove debug prints ::)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
sure! sorry for that leftover
| // Request object not found, could have been deleted after reconcile request. | ||
| // Owned objects are automatically garbage collected. For additional cleanup logic use finalizers. | ||
| // Return and don't requeue | ||
| reqLogger.Info("XXX3") |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
please remove debug prints ::)
| Expect(k8sClient.Create(context.Background(), nsBlue)).ToNot(HaveOccurred()) | ||
| DeferCleanup(func() { | ||
| By("deleting ns-blue") | ||
| err := k8sClient.Delete(ctx, nsBlue) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
with envtest you cant really delete namespace, it will just stay dangling.
https://book.kubebuilder.io/reference/envtest#namespace-usage-limitation
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I didn't know that, thanks!
I added a cleanNetworksInNamespace function to remove all the resources after the test
|
|
||
| Eventually(func(g Gomega) { | ||
| netAttDef = &netattdefv1.NetworkAttachmentDefinition{} | ||
| err = util.WaitForNamespacedObject(netAttDef, k8sClient, "ns-blue", cr.GetName(), util.RetryInterval, util.Timeout) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
nit: no need to call this wait func, object already exists, you can just call g.Expect(dynclient.Get(....)).To(Succeed())
its all wrapped in Eventually block anyway
|
|
||
| // Delete the SriovNetwork | ||
| err = k8sClient.Delete(ctx, &cr) | ||
| Expect(err).NotTo(HaveOccurred()) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
lets wait for the underlying net-attach-def to be deleted ?
| Should(Succeed()) | ||
| }) | ||
|
|
||
| Context("When the SriovNetwork namespace is not equal to the operator one", func() { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
can you add a case where the user requests to create nad in a different ns than the current one ?
then check nad doesnt get created with a Consistently block ?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Added!
| instance := r.controller.GetObject() | ||
| nadNamespacedName := types.NamespacedName{Namespace: obj.GetNamespace(), Name: obj.GetName()} | ||
|
|
||
| log.Log.WithName("XXX").Info("handleNetAttDef", "nadNamespacedName", nadNamespacedName, "vars.Namespace", vars.Namespace) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
nit: remove debug logs (XXX/XXX11 etc)
c0347db to
8e133f4
Compare
| ExpectWithOffset(1, err).NotTo(HaveOccurred()) | ||
|
|
||
| k8sClient.DeleteAllOf(ctx, &netattdefv1.NetworkAttachmentDefinition{}, client.InNamespace(namespace)) | ||
| ExpectWithOffset(1, err).NotTo(HaveOccurred()) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
can you also wait for both objects to be gone from the kubernetes api ?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
sure, I put the delete and list calls in an eventually block
8e133f4 to
d260c44
Compare
| ctx := context.Background() | ||
| EventuallyWithOffset(1, func(g Gomega) { | ||
| err := k8sClient.DeleteAllOf(ctx, &sriovnetworkv1.SriovNetwork{}, client.InNamespace(namespace)) | ||
| g.Expect(1, err).NotTo(HaveOccurred()) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
g.Expect(err) ? here and below
|
lgtm, lets have UT pass |
SriovNetwork,IbSriovNetwork,OVSNetwork object (XNetwork resources from now on) must be created in the operator's namespace and the `.Spec.NetworkNamespace` field defines where the controller should create the NetworkAttachmentDefinition resource. This constraint can be a problem in clusters where applications are managed by non cluster administrators. These changes makes the `genericNetworkReconciler` to accept XNetwork resources in namespaces different than the operator's one. In such cases, the field `.Spec.NetworkNamespace` must be empty, and a validating webhook ensure this constraint. Signed-off-by: Andrea Panattoni <[email protected]>
Signed-off-by: Andrea Panattoni <[email protected]>
to avoid panic:
```
panic: /tmp/go-build1586768671/b001/exe/webhook flag redefined: alsologtostderr
goroutine 1 [running]:
flag.(*FlagSet).Var(0xc0000781c0, {0x3daf2d0, 0x5566969}, {0x394a710, 0xf}, {0x39e87ee, 0x49})
/usr/lib/golang/src/flag/flag.go:1029 +0x3b9
k8s.io/klog/v2.InitFlags.func1(0xc000234990?)
/home/apanatto/go/pkg/mod/k8s.io/klog/[email protected]/klog.go:447 +0x31
flag.(*FlagSet).VisitAll(0xc0003f4c40?, 0xc000785de0)
/usr/lib/golang/src/flag/flag.go:458 +0x42
k8s.io/klog/v2.InitFlags(0x3dc1aa0?)
/home/apanatto/go/pkg/mod/k8s.io/klog/[email protected]/klog.go:446 +0x3c
main.init.0()
/home/apanatto/dev/github.com/k8snetworkplumbingwg/sriov-network-operator/cmd/webhook/main.go:27 +0x15
exit status 2
```
Signed-off-by: Andrea Panattoni <[email protected]>
Move `[sriov] operator No SriovNetworkNodePolicy ...` test cases to its own test file. Implement test case to verify controller and webhook logic Signed-off-by: Andrea Panattoni <[email protected]>
d260c44 to
ada185c
Compare
adrianchiris
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM, feel free to merge this one @zeeke !
SriovNetwork,IbSriovNetwork,OVSNetwork object (XNetwork resources from now on) must be created in the operator's namespace and the
.Spec.NetworkNamespacefield defines where thecontroller should create the NetworkAttachmentDefinition resource. This constraint can be a problem
in clusters where applications are managed by non cluster administrators.
These changes makes the
genericNetworkReconcilerto accept XNetwork resources in namespaces different thanthe operator's one. In such cases, the field
.Spec.NetworkNamespacemust be empty, and a validating webhookensure this constraint.
Webhook validation for networks .Spec.NetworkNamespace field
Move
[sriov] operator No SriovNetworkNodePolicy ...test cases to itsown test file.
Implement test case to verify controller and webhook logic