Member

@mazdakn mazdakn commented Nov 28, 2025

Description

Re-create (by swapping) an ipset whose listing failed during a resync with the dataplane. This includes:

  • Introduce a new field on dataplaneMetadata, ListFailed, that records whether listing an ipset failed.
  • If listing an ipset fails, set ListFailed to true.
  • If listing an ipset fails several times, try to re-create that ipset by swapping it out with one created from the desired state.
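
The retry-then-swap decision described above can be sketched roughly as follows. This is a minimal illustration, not the code in felix/ipsets/ipsets.go: only the dataplaneMetadata and ListFailed names come from this PR. The action type, the nextAction helper, and the threshold value of 3 are assumptions made for the example.

```go
package main

import "fmt"

// maxRetryAttempt is a hypothetical threshold. The PR introduces a
// MaxRetryAttempt constant, but its value is not stated in this description.
const maxRetryAttempt = 3

// dataplaneMetadata is a simplified stand-in for the struct named in the PR.
type dataplaneMetadata struct {
	Type       string
	ListFailed bool // true when listing this ipset failed during resync
}

// action is a hypothetical type describing what to do after a resync pass.
type action int

const (
	actionNone           action = iota // listing worked, nothing special to do
	actionRetryList                    // treat the failure as transient and retry
	actionRecreateBySwap               // failure persisted, rebuild from desired state
)

// nextAction picks the follow-up for one ipset given how many resync
// attempts have been made so far (hypothetical helper).
func nextAction(meta dataplaneMetadata, attempt int) action {
	if !meta.ListFailed {
		return actionNone
	}
	if attempt < maxRetryAttempt {
		return actionRetryList
	}
	return actionRecreateBySwap
}

func main() {
	meta := dataplaneMetadata{Type: "hash:ip", ListFailed: true}
	for attempt := 1; attempt <= maxRetryAttempt; attempt++ {
		fmt.Printf("attempt %d -> action %d\n", attempt, nextAction(meta, attempt))
	}
}
```

Keeping the failure flag on the per-set metadata means one broken ipset can be escalated on its own while the remaining sets continue to resync normally.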

Related issues/PRs

Pick of #11340
GH issue #11051

Todos

  • Tests
  • Documentation
  • Release note

Release Note

Re-create and swap out Calico ipsets that cannot be listed because of failures such as user-space/kernel incompatibility.

Reminder for the reviewer

Make sure that this PR has the correct labels and milestone set.

Every PR needs one docs-* label.

  • docs-pr-required: This change requires a change to the documentation that has not been completed yet.
  • docs-completed: This change has all necessary documentation completed.
  • docs-not-required: This change has no user-facing impact and requires no docs.

Every PR needs one release-note-* label.

  • release-note-required: This PR has user-facing changes. Most PRs should have this label.
  • release-note-not-required: This PR has no user-facing changes.

Other optional labels:

  • cherry-pick-candidate: This PR should be cherry-picked to an earlier release. For bug fixes only.
  • needs-operator-pr: This PR is related to install and requires a corresponding change to the operator.

@mazdakn mazdakn requested a review from a team as a code owner November 28, 2025 19:19
Copilot AI review requested due to automatic review settings November 28, 2025 19:19
@mazdakn mazdakn added the release-note-required (Change has user-facing impact, no matter how small) and docs-not-required (Docs not required for this change) labels Nov 28, 2025
@marvin-tigera marvin-tigera added this to the Calico v3.31.2 milestone Nov 28, 2025
Copilot finished reviewing on behalf of mazdakn November 28, 2025 19:22
Contributor

Copilot AI left a comment

Pull request overview

This PR addresses issue #11051 by implementing a mechanism to re-create IPSets that fail during resync operations due to version incompatibilities (e.g., userspace/kernel revision mismatch). The solution introduces a ListFailed field to track listing failures and adjusts the retry logic to allow recreation by swapping after persistent failures.

Key changes:

  • Adds ListFailed field to dataplaneMetadata to track when ipset listing fails
  • Introduces MaxRetryAttempt constant and modifies retry logic to distinguish transient vs persistent failures
  • Failed ipsets in desired state are tracked and re-created via swap mechanism after multiple retry attempts
  • Enhanced error handling to continue processing other ipsets when one fails to parse
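
As a rough illustration of the swap mechanism mentioned in the key changes, the sketch below builds input for `ipset restore` that creates a temporary set from the desired state, swaps it with the main set, and destroys the leftover. The `create`, `add`, `swap`, and `destroy` commands are standard ipset commands. The buildSwapScript helper and the set names are hypothetical, and the exact commands Felix emits may differ.

```go
package main

import (
	"fmt"
	"strings"
)

// buildSwapScript is a hypothetical helper. It returns ipset restore input
// that rebuilds mainName from the desired members via a temporary set.
func buildSwapScript(mainName, tempName, setType string, members []string) string {
	var b strings.Builder
	// Create the temporary set and fill it from the desired state.
	fmt.Fprintf(&b, "create %s %s\n", tempName, setType)
	for _, m := range members {
		fmt.Fprintf(&b, "add %s %s\n", tempName, m)
	}
	// Exchange the contents of the two sets. After this, mainName holds the
	// freshly built members and tempName holds the old content.
	fmt.Fprintf(&b, "swap %s %s\n", tempName, mainName)
	// Remove the temporary name, which now carries the old content.
	fmt.Fprintf(&b, "destroy %s\n", tempName)
	return b.String()
}

func main() {
	// Set names here are made up for the example.
	script := buildSwapScript("example-main", "example-tmp", "hash:ip",
		[]string{"10.0.0.1", "10.0.0.2"})
	fmt.Print(script)
}
```

Swapping keeps the main set name in place throughout, so rules that reference it by name do not have to be rewritten while the set is rebuilt.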

Reviewed changes

Copilot reviewed 3 out of 3 changed files in this pull request and generated 1 comment.

File | Description
felix/ipsets/ipsets.go | Core logic changes: adds ListFailed field, implements retry logic with transient/persistent failure distinction, updates resync error handling to track failed ipsets and trigger recreation via swap
felix/ipsets/ipsets_test.go | Adds test case for ipsets with unsupported revisions, updates existing test to use new MaxRetryAttempt constant
felix/ipsets/utils_for_test.go | Test infrastructure: adds supportedMockRevision constant and Revision field to mock metadata, implements revision checking in mock list command to simulate version incompatibility
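
The revision check in the mock list command could look something like the sketch below. Only the supportedMockRevision name and the Revision field come from the review summary. The constant's value, the mockSet type, the mockList function, and the error text are assumptions for illustration and are not taken from utils_for_test.go.

```go
package main

import (
	"errors"
	"fmt"
)

// supportedMockRevision is the highest revision the mock "userspace" can
// list. The value 5 is an assumption for this example.
const supportedMockRevision = 5

// errUnsupportedRevision is a hypothetical sentinel error.
var errUnsupportedRevision = errors.New("kernel set revision not supported by userspace")

// mockSet is a simplified, hypothetical stand-in for the mock metadata.
type mockSet struct {
	Name     string
	Revision int
	Members  []string
}

// mockList simulates listing a set. It fails for sets whose revision is
// newer than what the mock userspace supports.
func mockList(s mockSet) ([]string, error) {
	if s.Revision > supportedMockRevision {
		return nil, fmt.Errorf("listing %q: %w", s.Name, errUnsupportedRevision)
	}
	return s.Members, nil
}

func main() {
	ok := mockSet{Name: "good", Revision: supportedMockRevision, Members: []string{"10.0.0.1"}}
	bad := mockSet{Name: "bad", Revision: supportedMockRevision + 1}

	if members, err := mockList(ok); err == nil {
		fmt.Println("listed", ok.Name, members)
	}
	if _, err := mockList(bad); errors.Is(err, errUnsupportedRevision) {
		fmt.Println("list failed for", bad.Name)
	}
}
```

A test built on such a mock can create a set with an unsupported revision, run several resyncs, and then check that the set was re-created rather than left in a failed state.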

Comment on lines +484 to +490
	if desired {
		s.logCxt.WithError(err).WithField("name", name).
			Warn("Failed to parse required Calico-owned ipset that is needed, will try recreating it.")
	} else {
		s.logCxt.WithError(err).WithField("name", name).
			Warn("Failed to parse Calico-owned ipset that is no longer needed, will queue it for deletion.")
	}
Copilot AI Nov 28, 2025

Duplicate condition check on line 484. The condition if desired is already checked on line 481, making this second check redundant. This appears to be a copy-paste error where the logging was intended to be inside the first if desired block.

@mazdakn mazdakn merged commit 02005a2 into projectcalico:release-v3.31 Nov 28, 2025
10 of 12 checks passed
@mazdakn mazdakn deleted the pick-pr-11340 branch November 28, 2025 22:59
@danudey danudey modified the milestones: Calico v3.31.2, Calico v3.31.3 Dec 1, 2025
Labels

  • docs-not-required: Docs not required for this change
  • release-note-required: Change has user-facing impact (no matter how small)

3 participants