
All statuses reported as healthy, but connectivity does not work #11439

@ShadiestGoat

Description


Hi! I'm running a fresh cluster with 1 control node & 3 worker nodes. I'm seeing weird issues where my nodes just cannot communicate with each other. I've been banging my head against a wall for about two weeks now with barely any progress, so I'd really appreciate some help! My cluster is made up of a bunch of VPSs, but mostly without the usual cloud infra. The control node has IPv4 & IPv6; the worker nodes only have IPv6 plus a private (IPv4) network shared with the control node. For context, the control node is called powerful-2. For this issue I'll be using my vault deployment (which is not working) as the example.

Service that I'll be using:

apiVersion: v1
kind: Service
metadata:
  annotations:
    meta.helm.sh/release-name: vault
    meta.helm.sh/release-namespace: vault
  creationTimestamp: "2025-11-23T22:28:58Z"
  labels:
    app.kubernetes.io/instance: vault
    app.kubernetes.io/managed-by: Helm
    app.kubernetes.io/name: vault
    helm.sh/chart: vault-0.27.0
    vault-internal: "true"
  name: vault-internal
  namespace: vault
  resourceVersion: "3684"
  uid: 20382afc-a3a2-4e99-93db-5ba808eb2996
spec:
  clusterIP: None
  clusterIPs:
  - None
  internalTrafficPolicy: Cluster
  ipFamilies:
  - IPv6
  ipFamilyPolicy: SingleStack
  ports:
  - name: https
    port: 8200
    protocol: TCP
    targetPort: 8200
  - name: https-internal
    port: 8201
    protocol: TCP
    targetPort: 8201
  publishNotReadyAddresses: true
  selector:
    app.kubernetes.io/instance: vault
    app.kubernetes.io/name: vault
    component: server
  sessionAffinity: None
  type: ClusterIP
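
For reference: since this is a headless, single-stack IPv6 service with publishNotReadyAddresses: true, I'd expect each pod to get an AAAA record at vault-X.vault-internal.vault.svc.cluster.local even while unready. A rough sketch of how that can be checked (throwaway busybox pod; the image and pod name are just an illustration, not what I actually ran):

# query one of the per-pod DNS names from a temporary pod in the same namespace
kubectl run -n vault dnstest --rm -it --restart=Never --image=busybox -- \
  nslookup vault-0.vault-internal.vault.svc.cluster.local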

Pods

# kubectl get po -owide
NAME                                   READY   STATUS    RESTARTS        AGE     IP                                 NODE         NOMINATED NODE   READINESS GATES
vault-0                                1/2     Running   0               5h27m   10.244.105.136                     weak-1       <none>           <none>
vault-1                                1/2     Running   0               5h27m   10.244.217.199                     weak-2       <none>           <none>
vault-2                                1/2     Running   0               5h27m   fd40:10:200:0:4a36:85f5:61d4:897   powerful-2   <none>           <none>

Note that the pods ARE reachable even though they are not ready.
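
By "reachable" I mean a direct TCP connection to the pod IP on the vault port succeeds, at least from the node the pod is running on. Roughly along these lines (the nc invocation is just a sketch, vault-0's IPv4 address taken from the output above):

# from weak-1 (the node hosting vault-0), check the pod IP directly on port 8200
nc -zv -w 3 10.244.105.136 8200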

Also note that the pods seem to have both v4 & v6 assigned to them:

metadata:
  annotations:
    cni.projectcalico.org/containerID: 18a5ae1ba0417a0a69b1765d10e57e1bcd73903e0af3af56702933fa256b24cb
    cni.projectcalico.org/podIP: 10.244.105.136/32
    cni.projectcalico.org/podIPs: 10.244.105.136/32,fd40:10:200:0:a6f7:5e60:d54c:70c7/128
    kubectl.kubernetes.io/restartedAt: "2025-11-24T14:05:25Z"
  name: vault-0
status:
  hostIP: 2a01:--node-ip--
  hostIPs:
    - ip: 2a01:--node-ip--
  podIP: 10.244.105.136
  podIPs:
    - ip: 10.244.105.136
    - ip: fd40:10:200:0:a6f7:5e60:d54c:70c7
---
metadata:
  annotations:
    cni.projectcalico.org/containerID: 13ff4b5c2905b4ca67ee27ac55673dea75cd249b704d5841e5f88945697bbaa7
    cni.projectcalico.org/podIP: 10.244.217.199/32
    cni.projectcalico.org/podIPs: 10.244.217.199/32,fd40:10:200:0:9e5:9a66:f74e:d9c6/128
    kubectl.kubernetes.io/restartedAt: "2025-11-24T14:05:25Z"
  name: vault-1
status:
  hostIP: 2a01:--node-ip--
  hostIPs:
    - ip: 2a01:--node-ip--
  podIP: 10.244.217.199
  podIPs:
    - ip: 10.244.217.199
    - ip: fd40:10:200:0:9e5:9a66:f74e:d9c6
---
metadata:
  annotations:
    cni.projectcalico.org/containerID: 4a006f9d49295ea4c44c5ff718486cb27a36b171e5041e6013b4d59d1b70b33e
    cni.projectcalico.org/podIP: 10.244.8.152/32
    cni.projectcalico.org/podIPs: 10.244.8.152/32,fd40:10:200:0:4a36:85f5:61d4:897/128
    kubectl.kubernetes.io/restartedAt: "2025-11-24T14:05:25Z"
  name: vault-2
status:
  hostIP: 2a01:--node-ip--
  hostIPs:
    - ip: 2a01:--node-ip--
    - ip: 10.0.0.2
  podIP: fd40:10:200:0:4a36:85f5:61d4:897
  podIPs:
    - ip: fd40:10:200:0:4a36:85f5:61d4:897
    - ip: 10.244.8.152

The test is to run ping6 against vault-X.vault-internal from inside every pod; the three pods are each on a different node.
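
Concretely, the test looks roughly like this (the container name "vault" and the exact ping6 flags are assumptions; I run the equivalent from inside each of the three pods):

# from vault-0, try to reach every peer via the headless service DNS names
for target in vault-0 vault-1 vault-2; do
  kubectl exec -n vault vault-0 -c vault -- ping6 -c 3 "${target}.vault-internal"
done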

Expected Behavior

I expect all pods on all nodes to be able to communicate with each other equally.

Current Behavior

vault-2 (the one on the control node) is able to resolve all of the pods' addresses, but is only able to ping itself.
vault-0 & vault-1 are not even able to resolve the other pods' addresses.

Steps to Reproduce (for bugs)

n/a; see the cluster setup described at the top of the description.

Context

This is a really big issue for me: no cross-node connectivity works at all, and I can't really deploy anything until this is resolved.

Some additional context: I see these weird errors in the calico-node pods on all nodes other than the control node:

2025-11-24 20:02:11.340 [WARNING][63] felix/client.go 175: Failed to connect to flow server error=rpc error: code = Unavailable desc = connection error: desc = "transport: Error while dialing: dial tcp [fd40:10:100::43e6]:7443: i/o timeout" target="dns:///goldmane.calico-system.svc:7443"
2025-11-24 20:02:11.341 [INFO][63] felix/client.go 251: Waiting before next connection attempt duration=10s
2025-11-24 20:02:21.342 [WARNING][63] felix/client.go 175: Failed to connect to flow server error=rpc error: code = Unavailable desc = connection error: desc = "transport: Error while dialing: dial tcp [fd40:10:100::43e6]:7443: i/o timeout" target="dns:///goldmane.calico-system.svc:7443"
2025-11-24 20:02:21.342 [INFO][63] felix/client.go 251: Waiting before next connection attempt duration=10s
2025-11-24 20:02:22.899 [WARNING][63] felix/client.go 224: Flow client buffer full, dropping flow
2025-11-24 20:02:31.343 [WARNING][63] felix/client.go 175: Failed to connect to flow server error=rpc error: code = Unavailable desc = connection error: desc = "transport: Error while dialing: dial tcp [fd40:10:100::43e6]:7443: i/o timeout" target="dns:///goldmane.calico-system.svc:7443"
2025-11-24 20:02:31.343 [INFO][63] felix/client.go 251: Waiting before next connection attempt duration=10s
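
For what it's worth, goldmane.calico-system.svc is a regular ClusterIP service, so this looks like the same cross-node service connectivity problem rather than something vault-specific. A sketch of how I'd cross-check it (the service name comes from the log above; I'm assuming fd40:10:100::43e6 is its ClusterIP):

# confirm the service and its ClusterIP
kubectl get svc -n calico-system goldmane
# from one of the worker nodes (calico-node is host-networked), probe the VIP directly
nc -6 -zv -w 5 fd40:10:100::43e6 7443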

Your Environment

  • Calico version: v3.31.2
  • Calico dataplane: iptables
  • Orchestrator version (e.g. kubernetes, openshift, etc.): kubernetes
  • Operating System and version: 6.12.57+deb13-cloud-amd64

Installed via the Tigera operator.
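
In case the operator config matters here, these are the resources I can dump (resource names are the operator defaults, assuming nothing was renamed):

# tigera-operator managed config + the Calico IP pools
kubectl get installation default -o yaml
kubectl get ippools.crd.projectcalico.org -o yaml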

Thanks for y'all's time in advance & I hope this can get resolved!
