@zeeke
Member

@zeeke zeeke commented May 12, 2025

SriovNetwork, SriovIBNetwork, and OVSNetwork objects (XNetwork resources from now on) must be created in the operator's namespace, and the .Spec.NetworkNamespace field defines where the
controller should create the NetworkAttachmentDefinition resource. This constraint can be a problem
in clusters where applications are managed by users who are not cluster administrators.

These changes make the genericNetworkReconciler accept XNetwork resources in namespaces other than
the operator's. In such cases, the .Spec.NetworkNamespace field must be empty, and a validating webhook
enforces this constraint.

Webhook validation for networks .Spec.NetworkNamespace field

Move [sriov] operator No SriovNetworkNodePolicy ... test cases to their
own test file.

Implement test case to verify controller and webhook logic

@github-actions

Thanks for your PR,
To run vendors CIs, Maintainers can use one of:

  • /test-all: To run all tests for all vendors.
  • /test-e2e-all: To run all E2E tests for all vendors.
  • /test-e2e-nvidia-all: To run all E2E tests for NVIDIA vendor.

To skip the vendors CIs, Maintainers can use one of:

  • /skip-all: To skip all tests for all vendors.
  • /skip-e2e-all: To skip all E2E tests for all vendors.
  • /skip-e2e-nvidia-all: To skip all E2E tests for NVIDIA vendor.
    Best regards.

@zeeke zeeke force-pushed the us/namespaced-networks branch 2 times, most recently from 000da88 to 53198d5 Compare May 12, 2025 12:49
@coveralls

coveralls commented May 12, 2025

Pull Request Test Coverage Report for Build 17231697992

Details

  • 73 of 141 (51.77%) changed or added relevant lines in 3 files are covered.
  • No unchanged relevant lines lost coverage.
  • Overall coverage decreased (-0.1%) to 61.864%

| Changes Missing Coverage | Covered Lines | Changed/Added Lines | % |
| --- | --- | --- | --- |
| controllers/generic_network_controller.go | 68 | 79 | 86.08% |
| pkg/webhook/validate_networks.go | 5 | 23 | 21.74% |
| pkg/webhook/webhook.go | 0 | 39 | 0.0% |

| Totals | Coverage Status |
| --- | --- |
| Change from base Build 16749504687 | -0.1% |
| Covered Lines | 8656 |
| Relevant Lines | 13992 |

💛 - Coveralls

@zeeke zeeke force-pushed the us/namespaced-networks branch 2 times, most recently from a169538 to cd2efd9 Compare May 13, 2025 10:34
@zeeke zeeke force-pushed the us/namespaced-networks branch from cd2efd9 to fe1f3a0 Compare May 30, 2025 14:17
@SchSeba SchSeba requested review from SchSeba and Copilot June 10, 2025 13:07
Copilot AI left a comment

Pull Request Overview

This PR enhances the support for namespaced XNetwork resources by allowing network objects (SriovNetwork, SriovIBNetwork, OVSNetwork) to be created in namespaces other than the operator’s, enforcing constraints via a validating webhook and related tests. Key changes include:

  • Refactoring network node policy tests by moving them to a dedicated file.
  • Adding webhook validation and tests for the .Spec.NetworkNamespace field on network objects.
  • Updating controller reconciliation logic to handle resources in non-operator namespaces and improve finalizer handling.
  • Minor adjustments in logging and manifests for webhook configuration.

Reviewed Changes

Copilot reviewed 10 out of 10 changed files in this pull request and generated 1 comment.

| File | Description |
| --- | --- |
| test/conformance/tests/test_sriov_operator.go | Removed redundant “No SriovNetworkNodePolicy” tests to be replaced by a separate file. |
| test/conformance/tests/test_no_policy.go | Added dedicated tests for validating network policies and namespace constraints. |
| pkg/webhook/webhook.go | Implemented validation cases for the three network types using json.Unmarshal. |
| pkg/webhook/validate_networks*.go, validate_networks_test.go | Introduced functions and tests for enforcing .Spec.NetworkNamespace constraints. |
| controllers/generic_network_controller.go | Updated reconciliation logic to fetch objects from both operator and resource namespaces, update finalizers appropriately, and set owner references when applicable. |
| controllers/sriovnetwork_controller_test.go | Added new test cases to verify proper behavior when network objects are created in non-operator namespaces. |
| cmd/webhook/main.go & bindata/manifests/operator-webhook/003-webhook.yaml | Adjusted logging initialization and expanded webhook configuration for new resource types. |

Comments suppressed due to low confidence (1)

test/conformance/tests/test_no_policy.go:182

  • The parameter name 'webookEnabled' appears to be a typo; consider renaming it to 'webhookEnabled' for improved clarity.
DescribeTable("should gracefully restart quickly", func(webookEnabled bool) {

Collaborator

@SchSeba SchSeba left a comment

great work!

I left some small comments

operatorNamespacedName := req.NamespacedName
operatorNamespacedName.Namespace = vars.Namespace

err = r.Get(ctx, operatorNamespacedName, instance)
Collaborator

Why do we also need to check in the operator namespace?
Can you please add a comment explaining why, so I will remember next time I look at this code :P

Member Author

this is because we reconcile NetworkAttachmentDefinitions updates, and here we have to fetch the source object.

the cases are

| Reconciled object | Source object |
| --- | --- |
| netAttachDef(nsX/net1) | SriovNetwork(operatorNs/net1) with networkNamespace: nsX |
| netAttachDef(nsX/net1) | SriovNetwork(nsX/net1) |
| SriovNetwork(operatorNs/net1) | SriovNetwork(operatorNs/net1) |
| SriovNetwork(nsX/net1) | SriovNetwork(nsX/net1) |

adding a comment
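
The four cases in the table above boil down to "look in the reconciled object's own namespace, and also in the operator namespace when they differ". A minimal sketch of that lookup order follows. It is not the controller's actual code: `NamespacedName` is a local stand-in for the apimachinery type, `candidateSourceKeys` and its `operatorNs` parameter (standing in for `vars.Namespace`) are hypothetical names, and the order of the candidates is an assumption.

```go
package main

import "fmt"

// NamespacedName is a local stand-in for types.NamespacedName from
// k8s.io/apimachinery, so the sketch compiles with the standard library only.
type NamespacedName struct {
	Namespace string
	Name      string
}

// candidateSourceKeys (hypothetical helper) lists the keys under which the
// source network object may live for a reconciled key. The same-namespace key
// comes first; the operator-namespace key is added only when it differs.
func candidateSourceKeys(reconciled NamespacedName, operatorNs string) []NamespacedName {
	keys := []NamespacedName{reconciled}
	if reconciled.Namespace != operatorNs {
		keys = append(keys, NamespacedName{Namespace: operatorNs, Name: reconciled.Name})
	}
	return keys
}

func main() {
	// A request for nsX/net1 may come from a network in nsX or in the operator namespace.
	fmt.Println(candidateSourceKeys(NamespacedName{Namespace: "nsX", Name: "net1"}, "operatorNs"))
	// A request for operatorNs/net1 has a single candidate.
	fmt.Println(candidateSourceKeys(NamespacedName{Namespace: "operatorNs", Name: "net1"}, "operatorNs"))
}
```

A real implementation would additionally check that an operator-namespace candidate has `networkNamespace` set to the namespace of the reconciled NetworkAttachmentDefinition.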

Collaborator

can we handle this case in an eventHandler registered in SetupWithManager ?

so instead of using handler.EnqueueRequestForObject{} we would implement our own with handler.EnqueueRequestsFromMapFunc; there we can try to get the "source" object from the current net-attach-def namespace or from the operator ns.

then Reconcile would get the correct object it needs to reconcile, WDYT ?
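
To make the suggestion concrete, here is a rough, standard-library-only sketch of what such a map function could do. The types `NamespacedName` and `Request` are local stand-ins for the controller-runtime ones, and `mapNetAttachDef` and the `exists` callback are hypothetical names; the handler actually added to the PR may look different.

```go
package main

import "fmt"

// NamespacedName and Request are local stand-ins for the apimachinery and
// controller-runtime types, so the sketch runs without external modules.
type NamespacedName struct{ Namespace, Name string }

type Request struct{ NamespacedName }

// mapNetAttachDef (hypothetical) maps a NetworkAttachmentDefinition key to the
// request for its source network. exists is an assumed lookup callback; a real
// handler would query the API through the controller's client.
func mapNetAttachDef(nad NamespacedName, operatorNs string, exists func(NamespacedName) bool) []Request {
	// Prefer a network that lives in the same namespace as the net-attach-def.
	if exists(nad) {
		return []Request{{nad}}
	}
	// Fall back to a network with the same name in the operator namespace.
	inOperatorNs := NamespacedName{Namespace: operatorNs, Name: nad.Name}
	if exists(inOperatorNs) {
		return []Request{{inOperatorNs}}
	}
	// No source object found: nothing to enqueue.
	return nil
}

func main() {
	stored := map[NamespacedName]bool{{Namespace: "operatorNs", Name: "net1"}: true}
	exists := func(k NamespacedName) bool { return stored[k] }
	fmt.Println(mapNetAttachDef(NamespacedName{Namespace: "nsX", Name: "net1"}, "operatorNs", exists))
}
```

With this shape, Reconcile always receives the key of the object it has to reconcile and no longer needs to probe two namespaces itself.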


func validateNetworkNamespace(cr controllers.NetworkCRInstance) error {
if cr.GetNamespace() != vars.Namespace && cr.NetworkNamespace() != "" {
return fmt.Errorf(".Spec.NetworkNamespace field can't be specified if the resource is not in the %s namespace", vars.Namespace)
Collaborator

also, do we want to block the creation if there is already a network in the operator namespace with the same name?

or if there is already a net-attach-def in the requested namespace?

Member Author

That's a little bit debatable: having two or more objects contending the same NetAttachDef is handled in

It's the same problem, viewed from the perspective of the webhook, and we already have the same with an SriovNetwork + SriovIBNetwork pointing to the same object.

I prefer to address it in #897 or in another subsequent PR, after discussing in a community meeting. WDYT?

Collaborator

works for me

})
})

DescribeTable("should gracefully restart quickly", func(webookEnabled bool) {
Collaborator

This is the one that I removed so can you leave it in the original place?

Member Author

To leave it where it was, I have to put back all the No SriovNetworkPolicy test tree. We'll get a conflict in any case and I'll solve it. Let's merge

before this

Collaborator

now that 901 is merged I think we can remove this one

})

Context("Namespaced network objects", func() {
BeforeAll(func() {
Collaborator

nit: can you remove this empty one?

)
})
})
})
Collaborator

can you add a check here that if a network in the operator namespace exists, it always wins over the ones created by users in their own namespaces?

Member Author

This case is handled in

but no such priority has been implemented yet. That PR handles these collisions by making the first object win.

for example, if a namespaced network is created, then the created netattachdef has the owner annotation. When the operator-namespaced network is created later, it finds the target netattachdef already owned by another object and the controller does nothing.

Collaborator

sounds good
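
The "first object wins" rule described in this thread can be illustrated with a small ownership check. This is only a sketch: the annotation key `example.io/owner-ref`, the owner string format, and the `mayManage` helper are all hypothetical, since the thread does not show how the referenced PR records ownership.

```go
package main

import "fmt"

// ownerAnnotation is a hypothetical annotation key used only for this sketch.
const ownerAnnotation = "example.io/owner-ref"

// mayManage (hypothetical) reports whether candidate may create or update a
// NetworkAttachmentDefinition carrying the given annotations: either nobody
// owns it yet, or the candidate is the recorded owner.
func mayManage(nadAnnotations map[string]string, candidate string) bool {
	owner, owned := nadAnnotations[ownerAnnotation]
	return !owned || owner == candidate
}

func main() {
	// The namespaced network was created first and recorded itself as owner.
	nad := map[string]string{ownerAnnotation: "SriovNetwork/nsX/net1"}

	fmt.Println(mayManage(nad, "SriovNetwork/nsX/net1"))        // the owner keeps managing it
	fmt.Println(mayManage(nad, "SriovNetwork/operatorNs/net1")) // the later object leaves it alone
	fmt.Println(mayManage(nil, "SriovNetwork/operatorNs/net1")) // an unowned object can be claimed
}
```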

@zeeke zeeke force-pushed the us/namespaced-networks branch from fe1f3a0 to 1706fc6 Compare June 13, 2025 10:27
@zeeke zeeke requested a review from SchSeba June 25, 2025 14:45
defer func(previous string) { vars.Namespace = previous }(vars.Namespace)
vars.Namespace = "operator-namespace"

validNetworkNamespaces := []controllers.NetworkCRInstance{
Collaborator

any chance to have a "[]tcase" (list of test case struct) which contains the test name, the NetworkCRInstance and a shouldFail boolean ? then the test would iterate over entries , run validateNetworkNamespace and compare to expected result ?
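
A table-driven layout along these lines might look like the sketch below. To stay self-contained it uses a simplified `network` struct instead of `controllers.NetworkCRInstance` and a local copy of the validation rule, and it prints results rather than using the `testing` package; `tcase`, `cases`, and the field names are illustrative assumptions.

```go
package main

import (
	"errors"
	"fmt"
)

// operatorNamespace stands in for vars.Namespace in this sketch.
const operatorNamespace = "operator-namespace"

// network is a simplified stand-in for controllers.NetworkCRInstance.
type network struct {
	namespace        string
	networkNamespace string
}

// validateNetworkNamespace mirrors the rule quoted earlier in this review:
// networkNamespace may only be set on resources in the operator namespace.
func validateNetworkNamespace(n network) error {
	if n.namespace != operatorNamespace && n.networkNamespace != "" {
		return errors.New("networkNamespace must be empty outside the operator namespace")
	}
	return nil
}

// tcase is one table entry: a name, the object under test, and the expectation.
type tcase struct {
	name       string
	cr         network
	shouldFail bool
}

func cases() []tcase {
	return []tcase{
		{"operator ns, no networkNamespace", network{operatorNamespace, ""}, false},
		{"operator ns, with networkNamespace", network{operatorNamespace, "ns-blue"}, false},
		{"user ns, no networkNamespace", network{"ns-blue", ""}, false},
		{"user ns, with networkNamespace", network{"ns-blue", "ns-red"}, true},
	}
}

func main() {
	for _, tc := range cases() {
		err := validateNetworkNamespace(tc.cr)
		if (err != nil) != tc.shouldFail {
			fmt.Printf("FAIL %s: err=%v\n", tc.name, err)
			continue
		}
		fmt.Printf("ok   %s\n", tc.name)
	}
}
```

Adding a new scenario then means adding one line to the table rather than a new list and a new loop.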

if instance.NetworkNamespace() != "" && instance.GetNamespace() != vars.Namespace {
reqLogger.Error(
fmt.Errorf("bad value for NetworkNamespace"),
".Spec.NetworkNamespace can't be specified if the resource belongs to a namespace other than the operator's",
Collaborator

nit: do we want to use "jsonpath"-style field refs?

e.g. .spec.networkNamespace, .metadata.namespace?

Member Author

yep, that would be better from the user's point of view

Collaborator

@SchSeba SchSeba left a comment

from my side LGTM.

to me it's not a hard requirement to have the object retrieval in the handler, but it's nice to have.

@zeeke zeeke force-pushed the us/namespaced-networks branch from 1706fc6 to d38c8dc Compare July 7, 2025 09:49
@zeeke
Member Author

zeeke commented Jul 7, 2025

I added the handleNetAttachDef function and it actually made the logic a little more straightforward.

@adrianchiris @SchSeba please take a look

Collaborator

@SchSeba SchSeba left a comment

I like this new approach!

I learned something new :)

LGTM

@SchSeba
Collaborator

SchSeba commented Jul 22, 2025

Hi @zeeke,

our QE did the following test on this PR

  • Create sriovNetwork/net1 in ns1
  • Operator creates NAD/net1 in ns1
  • Update sriovNetwork/net1 with new resourceName
  • NAD/net1 is also updated (but no changes in spec as expected)

And on the last step, when he checked NAD/net1, the owner reference had been removed. So we probably have an issue with the update.

return reconcile.Result{}, nil
}

if instance.NetworkNamespace() != "" && instance.GetNamespace() != vars.Namespace {
Collaborator

do we want to add an error field to the *NetworkStatus objects?

Member Author

It's a good idea, but I would discuss it after

Maybe the conditions on CRDs are enough. What do you think?

@zeeke
Member Author

zeeke commented Jul 30, 2025

> Hi @zeeke,
>
> our QE did the following test on this PR
>
> • Create sriovNetwork/net1 in ns1
> • Operator creates NAD/net1 in ns1
> • Update sriovNetwork/net1 with new resourceName
> • NAD/net1 is also updated (but no changes in spec as expected)
>
> And on the last step, when he checked NAD/net1, the owner reference had been removed. So we probably have an issue with the update.

fixed.

@adrianchiris @e0ne @SchSeba please, take another look

@zeeke zeeke force-pushed the us/namespaced-networks branch 2 times, most recently from b61c76b to c0347db Compare August 5, 2025 13:11
@zeeke zeeke requested review from adrianchiris and e0ne August 11, 2025 08:57
@zeeke
Member Author

zeeke commented Aug 18, 2025

@adrianchiris @e0ne can you please take another look at this PR?

Collaborator

@adrianchiris adrianchiris left a comment

final nits otherwise LGTM

// Request object not found, could have been deleted after reconcile request.
// Owned objects are automatically garbage collected. For additional cleanup logic use finalizers.
// Return and don't requeue
reqLogger.Error(err, "XXX2")
Collaborator

please remove debug prints ::)

Member Author

sure! sorry for that leftover

// Request object not found, could have been deleted after reconcile request.
// Owned objects are automatically garbage collected. For additional cleanup logic use finalizers.
// Return and don't requeue
reqLogger.Info("XXX3")
Collaborator

please remove debug prints ::)

Expect(k8sClient.Create(context.Background(), nsBlue)).ToNot(HaveOccurred())
DeferCleanup(func() {
By("deleting ns-blue")
err := k8sClient.Delete(ctx, nsBlue)
Collaborator

with envtest you can't really delete a namespace; it will just stay dangling.

https://book.kubebuilder.io/reference/envtest#namespace-usage-limitation

Member Author

I didn't know that, thanks!

I added a cleanNetworksInNamespace function to remove all the resources after the test


Eventually(func(g Gomega) {
netAttDef = &netattdefv1.NetworkAttachmentDefinition{}
err = util.WaitForNamespacedObject(netAttDef, k8sClient, "ns-blue", cr.GetName(), util.RetryInterval, util.Timeout)
Collaborator

nit: no need to call this wait func, the object already exists; you can just call g.Expect(dynclient.Get(....)).To(Succeed())
it's all wrapped in an Eventually block anyway


// Delete the SriovNetwork
err = k8sClient.Delete(ctx, &cr)
Expect(err).NotTo(HaveOccurred())
Collaborator

let's wait for the underlying net-attach-def to be deleted?

Should(Succeed())
})

Context("When the SriovNetwork namespace is not equal to the operator one", func() {
Collaborator

can you add a case where the user requests creating the NAD in a different namespace than the current one?
then check, with a Consistently block, that the NAD doesn't get created?

Member Author

Added!
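
For readers unfamiliar with the pattern requested here: a "consistently" check asserts that a condition holds for a whole time window, which is how one verifies that something is never created. The sketch below imitates the idea with the standard library only; it is not Gomega's `Consistently`, and `consistently`, `nadStaysAbsent`, and the durations are hypothetical.

```go
package main

import (
	"errors"
	"fmt"
	"time"
)

// consistently polls check for the whole duration and fails on the first error.
// It is a stdlib-only imitation of the idea behind Gomega's Consistently.
func consistently(duration, interval time.Duration, check func() error) error {
	deadline := time.Now().Add(duration)
	for {
		if err := check(); err != nil {
			return err
		}
		if !time.Now().Before(deadline) {
			return nil
		}
		time.Sleep(interval)
	}
}

var errNADCreated = errors.New("NetworkAttachmentDefinition was created unexpectedly")

// nadStaysAbsent (hypothetical) succeeds only if exists keeps returning false
// for the whole observation window.
func nadStaysAbsent(exists func() bool) error {
	return consistently(50*time.Millisecond, 5*time.Millisecond, func() error {
		if exists() {
			return errNADCreated
		}
		return nil
	})
}

func main() {
	// The object never shows up: the check passes.
	fmt.Println(nadStaysAbsent(func() bool { return false }))

	// The object shows up on the fourth poll: the check fails.
	calls := 0
	fmt.Println(nadStaysAbsent(func() bool { calls++; return calls > 3 }))
}
```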

instance := r.controller.GetObject()
nadNamespacedName := types.NamespacedName{Namespace: obj.GetNamespace(), Name: obj.GetName()}

log.Log.WithName("XXX").Info("handleNetAttDef", "nadNamespacedName", nadNamespacedName, "vars.Namespace", vars.Namespace)
Collaborator

nit: remove debug logs (XXX/XXX11 etc)

@zeeke zeeke force-pushed the us/namespaced-networks branch from c0347db to 8e133f4 Compare August 22, 2025 10:59
ExpectWithOffset(1, err).NotTo(HaveOccurred())

k8sClient.DeleteAllOf(ctx, &netattdefv1.NetworkAttachmentDefinition{}, client.InNamespace(namespace))
ExpectWithOffset(1, err).NotTo(HaveOccurred())
Collaborator

can you also wait for both objects to be gone from the kubernetes api ?

Member Author

sure, I put the delete and list calls in an eventually block
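
The "delete, then list, and retry until nothing is left" approach can be sketched without a cluster as follows. Everything here is illustrative: `eventually` is a stdlib-only imitation of a retry loop, `fakeStore` pretends that objects disappear only after a few delete calls (as they might while finalizers run), and `cleanNamespace` is a hypothetical name, not the helper added in the PR.

```go
package main

import (
	"fmt"
	"time"
)

// eventually retries check until it succeeds or the timeout expires.
func eventually(timeout, interval time.Duration, check func() error) error {
	deadline := time.Now().Add(timeout)
	for {
		err := check()
		if err == nil {
			return nil
		}
		if time.Now().After(deadline) {
			return fmt.Errorf("timed out: %w", err)
		}
		time.Sleep(interval)
	}
}

// fakeStore simulates an API where each delete call removes one object.
type fakeStore struct{ remaining int }

func (s *fakeStore) deleteAll() error {
	if s.remaining > 0 {
		s.remaining--
	}
	return nil
}

func (s *fakeStore) count() int { return s.remaining }

// cleanNamespace (hypothetical) issues the delete and then lists; the attempt
// counts as successful only once the list comes back empty.
func cleanNamespace(s *fakeStore) error {
	return eventually(200*time.Millisecond, time.Millisecond, func() error {
		if err := s.deleteAll(); err != nil {
			return err
		}
		if n := s.count(); n > 0 {
			return fmt.Errorf("%d objects still present", n)
		}
		return nil
	})
}

func main() {
	s := &fakeStore{remaining: 3}
	fmt.Println(cleanNamespace(s), s.count())
}
```

Checking the list inside the retried function is what guarantees the next test starts from an empty namespace, rather than merely from a namespace where deletion has been requested.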

@zeeke zeeke force-pushed the us/namespaced-networks branch from 8e133f4 to d260c44 Compare August 25, 2025 08:34
ctx := context.Background()
EventuallyWithOffset(1, func(g Gomega) {
err := k8sClient.DeleteAllOf(ctx, &sriovnetworkv1.SriovNetwork{}, client.InNamespace(namespace))
g.Expect(1, err).NotTo(HaveOccurred())
Collaborator

g.Expect(err) ? here and below

@adrianchiris
Collaborator

lgtm, lets have UT pass

zeeke added 4 commits August 26, 2025 09:52
SriovNetwork, SriovIBNetwork, and OVSNetwork objects (XNetwork resources from now on) must be created in the operator's namespace, and the `.Spec.NetworkNamespace` field defines where the
controller should create the NetworkAttachmentDefinition resource. This constraint can be a problem
in clusters where applications are managed by users who are not cluster administrators.

These changes make the `genericNetworkReconciler` accept XNetwork resources in namespaces other than
the operator's. In such cases, the `.Spec.NetworkNamespace` field must be empty, and a validating webhook
enforces this constraint.

Signed-off-by: Andrea Panattoni <[email protected]>
to avoid panic:
```
panic: /tmp/go-build1586768671/b001/exe/webhook flag redefined: alsologtostderr

goroutine 1 [running]:
flag.(*FlagSet).Var(0xc0000781c0, {0x3daf2d0, 0x5566969}, {0x394a710, 0xf}, {0x39e87ee, 0x49})
        /usr/lib/golang/src/flag/flag.go:1029 +0x3b9
k8s.io/klog/v2.InitFlags.func1(0xc000234990?)
        /home/apanatto/go/pkg/mod/k8s.io/klog/[email protected]/klog.go:447 +0x31
flag.(*FlagSet).VisitAll(0xc0003f4c40?, 0xc000785de0)
        /usr/lib/golang/src/flag/flag.go:458 +0x42
k8s.io/klog/v2.InitFlags(0x3dc1aa0?)
        /home/apanatto/go/pkg/mod/k8s.io/klog/[email protected]/klog.go:446 +0x3c
main.init.0()
        /home/apanatto/dev/github.com/k8snetworkplumbingwg/sriov-network-operator/cmd/webhook/main.go:27 +0x15
exit status 2
```

Signed-off-by: Andrea Panattoni <[email protected]>
Move `[sriov] operator No SriovNetworkNodePolicy ...` test cases to their
own test file.

Implement test case to verify controller and webhook logic

Signed-off-by: Andrea Panattoni <[email protected]>
@zeeke zeeke force-pushed the us/namespaced-networks branch from d260c44 to ada185c Compare August 26, 2025 07:52
Collaborator

@adrianchiris adrianchiris left a comment

LGTM, feel free to merge this one @zeeke !

@zeeke zeeke merged commit 35e7319 into k8snetworkplumbingwg:master Aug 27, 2025
13 of 14 checks passed
5 participants