Improve logging in Agent's NetworkPolicy controller #7455

@antoninbas

While troubleshooting an issue, I observed a lot of "Reconciling Pod NetworkPolicy rule" log messages. While I do think these log messages can be useful for troubleshooting, I also think they are too verbose for a lot of production clusters. It is fairly common to have policies where the address group (for example) includes all the Pods in the cluster. Every time a Pod is created or deleted, the address group is updated, which triggers a rule reconciliation on all the Nodes on which the policy is applied.
Therefore I would like to propose that the verbosity be set to V(1) for this log message instead of the default V(0). The log level can easily be increased to 1 for a given Agent when troubleshooting (using antctl).
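
To illustrate the intended effect, here is a minimal sketch of verbosity gating. The `leveledLogger` type is a hypothetical stand-in written only for this example (it is not the Agent's actual logger, and it is not the klog API); it assumes a default verbosity of 0 and a message logged at level 1.

```go
package main

import "fmt"

// leveledLogger is a hypothetical stand-in for a klog-style logger. It is
// not Antrea code. Messages are kept in memory so the effect is easy to see.
type leveledLogger struct {
	verbosity int      // current verbosity threshold (0 by default)
	lines     []string // messages that were actually emitted
}

// Info records msg only if the configured verbosity is at least level.
func (l *leveledLogger) Info(level int, msg string) {
	if level > l.verbosity {
		return
	}
	l.lines = append(l.lines, msg)
}

func main() {
	logger := &leveledLogger{} // verbosity 0, the assumed default

	// At the default verbosity, a level-1 message is dropped.
	logger.Info(1, "Reconciling Pod NetworkPolicy rule")
	fmt.Println(len(logger.lines)) // prints 0

	// Once the verbosity is raised for troubleshooting, it is emitted.
	logger.verbosity = 1
	logger.Info(1, "Reconciling Pod NetworkPolicy rule")
	fmt.Println(len(logger.lines)) // prints 1
}
```

With this kind of gating, production clusters running at the default level would no longer see one message per rule reconciliation, while the message stays available on demand.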


I also would like to improve the following log message:

I0918 19:19:52.581136       1 pod_reconciler.go:1008] "Forgetting rule" rule="43f0f5f6eb0d62bf"

First, I would like to also set the verbosity to V(1) for this log. Then, I would like to add more context to it so that the policy reference is included (just like for the corresponding Reconciling log message).
Finally, we should not log this message for rules which were not previously realized, as it can lead to confusion and users will observe the same log message logged repeatedly. This is a pretty common situation when using per-rule appliedTo, as an Agent may receive a policy because one of the rules is applied locally, while other rules may not be applied locally at all. Rules that are not applied locally will have an empty appliedTo group and will therefore never be realized. Ultimately, this is because we distribute policies from the Controller to Agents at the policy level, not at the rule level.
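
A rough sketch of the last two points, under stated assumptions: the `rule` and `reconciler` types, their fields, and the `policy=` key are all hypothetical names invented for this example and do not reflect the actual code in `pod_reconciler.go`. The idea is simply to remember which rules were realized locally, log "Forgetting rule" only for those, and include the policy reference in the message.

```go
package main

import "fmt"

// rule is a hypothetical, simplified view of a rule received by the Agent.
type rule struct {
	ID        string
	PolicyRef string // reference to the policy the rule belongs to
}

// reconciler is a hypothetical sketch that tracks locally realized rules.
type reconciler struct {
	realized map[string]bool
	logs     []string // emitted log lines, kept in memory for illustration
}

func newReconciler() *reconciler {
	return &reconciler{realized: map[string]bool{}}
}

// Reconcile realizes the rule only if it applies to at least one local Pod.
// A rule with an empty appliedTo group is never realized and never logged.
func (r *reconciler) Reconcile(ru rule, localAppliedTo []string) {
	if len(localAppliedTo) == 0 {
		return
	}
	r.realized[ru.ID] = true
	r.logs = append(r.logs, fmt.Sprintf("Reconciling rule rule=%q policy=%q", ru.ID, ru.PolicyRef))
}

// Forget logs only when the rule was previously realized on this Node.
func (r *reconciler) Forget(ru rule) {
	if !r.realized[ru.ID] {
		return // nothing was realized, so there is nothing to report
	}
	delete(r.realized, ru.ID)
	r.logs = append(r.logs, fmt.Sprintf("Forgetting rule rule=%q policy=%q", ru.ID, ru.PolicyRef))
}

func main() {
	r := newReconciler()
	local := rule{ID: "rule-a", PolicyRef: "policy-x"}
	notLocal := rule{ID: "rule-b", PolicyRef: "policy-x"}

	r.Reconcile(local, []string{"pod-1"})
	r.Reconcile(notLocal, nil) // empty appliedTo group: never realized
	r.Forget(notLocal)         // silent
	r.Forget(local)            // logged once, with the policy reference
	r.Forget(local)            // silent: already forgotten

	for _, line := range r.logs {
		fmt.Println(line)
	}
}
```

In this sketch, repeated calls to `Forget` for a rule that was never applied locally produce no output, which is the behavior proposed above.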

Labels

area/network-policy/agent: Issues or PRs related to the network policy agents.