Cluster-Deleting stuck in ResourceBinding index-lack #6886

@Chen-speculation

Issue Description

On Karmada v1.14.0, deleting a cluster with the karmadactl unjoin command never completes. The Cluster object is marked for deletion, but its finalizer is never removed, so the cluster cannot be successfully removed.

Symptoms

  1. Execute the deletion command:

    karmadactl unjoin slav --cluster-kubeconfig=/Users/cccmmmdd/obcloud-operator/.vscode/kubeconfig/slave --cluster-context=300966637841247779-cb6ea982f5b0442fead0d5401194847b4 --kubeconfig=/Users/cccmmmdd/obcloud-operator/.vscode/kubeconfig/karmada --karmada-context=karmada-apiserver
  2. Cluster status shows deletion is blocked:

    kubectl get cluster slav -o yaml --kubeconfig=/Users/cccmmmdd/obcloud-operator/.vscode/kubeconfig/karmada

    The output shows that the cluster has a deletionTimestamp, but its finalizer has not been removed.

  3. Key error in Karmada controller logs:

    E1029 02:50:16.498769       1 controller.go:347] "Reconciler error" err="Index with name field:ResourceBindingIndexByFieldCluster does not exist" controller="cluster-controller" controllerGroup="cluster.karmada.io" controllerKind="Cluster" Cluster="slav" namespace="" name="slav" reconcileID="9a80e7d9-3ab9-44c6-93c0-7c5ca35eb6e5"
    
  4. Recurring error logs:

    "Failed to list ResourceBindings" err="Index with name field:ResourceBindingIndexByFieldCluster does not exist" cluster="slav"
    "Failed to check target cluster is removed from all bindings" err="Index with name field:ResourceBindingIndexByFieldCluster does not exist" cluster="slav"
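
The checks in steps 2 to 4 can be scripted. The following is a minimal sketch, not an official Karmada tool. The environment variables KARMADA_KUBECONFIG and HOST_KUBECONFIG and all function names are assumptions introduced here for illustration; the deployment name and namespace are taken from the restart workaround in this report and may differ in other installations.

```shell
# Assumed environment variables (not part of the original report):
#   KARMADA_KUBECONFIG  kubeconfig for the Karmada API server
#   HOST_KUBECONFIG     kubeconfig for the cluster that runs karmada-controller-manager

# Print only the deletion-related metadata of a Cluster object.
cluster_deletion_state() {
  kubectl get cluster "$1" \
    --kubeconfig="${KARMADA_KUBECONFIG}" \
    -o jsonpath='deletionTimestamp={.metadata.deletionTimestamp}{"\n"}finalizers={.metadata.finalizers}{"\n"}'
}

# Count log lines on stdin that report the missing index.
# grep -c exits non-zero when the count is 0, so "|| true" keeps the exit status clean.
count_missing_index_errors() {
  grep -c 'Index with name field:ResourceBindingIndexByFieldCluster does not exist' || true
}

# Count missing-index errors in the controller manager logs.
controller_index_errors() {
  kubectl logs deployment/karmada-controller-manager -n karmada-system \
    --kubeconfig="${HOST_KUBECONFIG}" | count_missing_index_errors
}
```

For example, cluster_deletion_state slav prints the two fields of interest, and a non-zero result from controller_index_errors suggests the controller is still failing on the index lookup.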
    

Environment

  • Karmada Version: v1.14.0-amd64
  • Kubernetes Version: v1.33.3-aliyun.1
  • Deployment Method: Helm Chart
  • Controller Configuration: Default configuration, PropagateDeps feature disabled

Root Cause Analysis

When processing a cluster deletion, the Karmada cluster controller must first confirm that no ResourceBinding still references the cluster. To find those bindings efficiently, it queries an index named ResourceBindingIndexByFieldCluster. The logs show that this index does not exist when the query runs, which suggests it was either never properly initialized or was lost at runtime. As a result, every reconcile fails before the cleanup check completes, and the finalizer is never removed.

Impact Scope

  • Every cluster deletion attempted via karmadactl unjoin hits this issue
  • Cluster objects are marked for deletion but never finish deleting, which leaks resources
  • Normal maintenance operations of the Karmada control plane are affected

Temporary Workarounds

  1. Manually remove the finalizer (higher risk; recommended only for test environments):

    kubectl patch cluster <cluster-name> -p '{"metadata":{"finalizers":[]}}' --type=merge --kubeconfig=<karmada-kubeconfig>
  2. Restart the controller manager (may fix the indexing issue):

    kubectl rollout restart deployment karmada-controller-manager -n karmada-system --kubeconfig=<master-kubeconfig>
  3. Check for and clean up residual resources:

    # Check for residual ResourceBindings
    kubectl get resourcebindings -A --kubeconfig=<karmada-kubeconfig>
    
    # Check for residual Work resources
    kubectl get works -A --kubeconfig=<karmada-kubeconfig>
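
The residual-resource check above lists every ResourceBinding and Work. It can be narrowed to the cluster being removed. The sketch below is a client-side approximation of the lookup the controller attempts, not the controller's own logic. It assumes that ResourceBindings list their target clusters under spec.clusters[].name and that a cluster's Work objects live in a namespace named karmada-es-<cluster-name>; verify both against your Karmada version. KARMADA_KUBECONFIG and the function names are hypothetical.

```shell
# Assumed environment variable (not part of the original report):
#   KARMADA_KUBECONFIG  kubeconfig for the Karmada API server

# Read "<namespace>/<name><TAB><cluster names>" rows on stdin and print the
# bindings whose cluster list contains exactly the given cluster name.
filter_bindings_for_cluster() {
  awk -F '\t' -v c="$1" '{
    n = split($2, names, " ")
    for (i = 1; i <= n; i++) if (names[i] == c) { print $1; break }
  }'
}

# List ResourceBindings that still target the given cluster.
bindings_for_cluster() {
  kubectl get resourcebindings -A --kubeconfig="${KARMADA_KUBECONFIG}" \
    -o jsonpath='{range .items[*]}{.metadata.namespace}/{.metadata.name}{"\t"}{.spec.clusters[*].name}{"\n"}{end}' \
    | filter_bindings_for_cluster "$1"
}

# List Work objects in the cluster's execution namespace
# (assumed naming convention: karmada-es-<cluster-name>).
works_for_cluster() {
  kubectl get works -n "karmada-es-$1" --kubeconfig="${KARMADA_KUBECONFIG}"
}
```

Running bindings_for_cluster slav and works_for_cluster slav before removing the finalizer by hand shows whether anything still references the cluster. The filter compares whole names, so a cluster called slave does not match slav.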

Metadata

Labels

kind/bug: Categorizes issue or PR as related to a bug.