An operation with the given volume x is already in progress #2783

@seilerre

Description

/kind bug

What happened?

We deploy our cluster with kops, so the EBS CSI driver version is a bit old, sorry for that! Every time we create a new cluster, our Strimzi deployment fails with a few replicas (1 or 2 out of 7) reporting MountVolume.MountDevice failed.

Problem: MountVolume.MountDevice failed for volume "pvc-f4d9245b-8e05-45e9-a6d3-c67a36b2bde1" : rpc error: code = Aborted desc = An operation with the given volume="vol-0ec38943edd360ec2" is already in progress

The volume is attached to the correct node just fine:

[screenshot showing the volume attached to the node]

The Pod is assigned to `nodeName: i-04a771ff5fadd43fb`.
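
For completeness, the same attachment state can be checked from the EC2 side; a sketch, assuming the AWS CLI is pointed at the cluster's account and the eu-central-1 region (volume ID taken from the error above):

✦ ➜ aws ec2 describe-volumes --volume-ids vol-0ec38943edd360ec2 \
      --query 'Volumes[0].Attachments'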

It seems reasonable to look for the problem on the Kubernetes side:

✦ ➜ kubectl get volumeattachment
NAME                                                                   ATTACHER          PV                                         NODE                  ATTACHED   AGE
csi-0115e55fc9e565c8ea85605d078d4a1c94cd02878d8e825f4b7737d996d5f931   ebs.csi.aws.com   pvc-d79555a8-f769-4d76-92e9-04c194702753   i-0d8ddc76ef940a784   true       6h7m
csi-51039995feb12d3abde3eb78137b12bcb3bc617c9032f31ead98fafb7204c9a6   ebs.csi.aws.com   pvc-208b446b-ec1f-4487-ac05-c9f5d14b40b5   i-0d6198b0b546dac69   true       6h7m
csi-5cf2c1b890e66aca0b2e3f9e753478b1b35e6d0dc140022ab314c439da9180a4   ebs.csi.aws.com   pvc-cb84f7b0-b8fd-4749-8bbd-6440412d6ab9   i-06d68249d4a7d0cb8   true       6h7m
csi-954a71cfaa8a33354320408ab19f0c2f3b88ca6b742a8522222242dcae1fc6f2   ebs.csi.aws.com   pvc-1bbed81d-86e6-4f34-ac1c-66dfc739a72e   i-06d68249d4a7d0cb8   true       6h7m
csi-a87059f2e70cd2f489687b91b2c7b67e4741ca7e2a9e0f8a35f9cf92dfc0d96f   ebs.csi.aws.com   pvc-6dc054e8-a09a-4011-8be7-3fba24761fe6   i-08a1e2a74f11872f0   true       6h7m
csi-ad2817e61aa04c7824b18bff188c2b99e6f82b637946f5ced965fb78ddb1a707   ebs.csi.aws.com   pvc-f56fee11-e15c-4029-9292-e38208fcedbe   i-08a1e2a74f11872f0   true       6h7m
csi-c6ba5d163cd17699ec1d27b391374573e96011470eb79f08575ee1a0aae8fbcb   ebs.csi.aws.com   pvc-f4d9245b-8e05-45e9-a6d3-c67a36b2bde1   i-04a771ff5fadd43fb   true       6h7m
csi-d7c979ec26d1b26f63e5d69e12b2b45d1b6ab4ef19a98b22c363ee067a6dd145   ebs.csi.aws.com   pvc-27dba443-9377-4e19-a8d4-7440b847ca99   i-0d6198b0b546dac69   true       6h7m
csi-e31029884739e2e203a0b284337331b0cec184aa5987aefea4c8ec7dffe62814   ebs.csi.aws.com   pvc-2c8259af-48e3-421d-9fae-7015ad622d31   i-023eef5364eb2146e   true       6h7m
✦ ➜ kubectl get volumeattachment -o json | jq '.items[] | select(.spec.source.persistentVolumeName=="pvc-f4d9245b-8e05-45e9-a6d3-c67a36b2bde1")'
{
  "apiVersion": "storage.k8s.io/v1",
  "kind": "VolumeAttachment",
  "metadata": {
    "annotations": {
      "csi.alpha.kubernetes.io/node-id": "i-04a771ff5fadd43fb"
    },
    "creationTimestamp": "2025-11-10T08:09:33Z",
    "finalizers": [
      "external-attacher/ebs-csi-aws-com"
    ],
    "name": "csi-c6ba5d163cd17699ec1d27b391374573e96011470eb79f08575ee1a0aae8fbcb",
    "resourceVersion": "13495",
    "uid": "4af77cfb-675e-4ea3-a02d-8033757895c4"
  },
  "spec": {
    "attacher": "ebs.csi.aws.com",
    "nodeName": "i-04a771ff5fadd43fb",
    "source": {
      "persistentVolumeName": "pvc-f4d9245b-8e05-45e9-a6d3-c67a36b2bde1"
    }
  },
  "status": {
    "attached": true,
    "attachmentMetadata": {
      "devicePath": "/dev/xvdaa"
    }
  }
}
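
The devicePath /dev/xvdaa in attachmentMetadata is the name requested at attach time; on a Nitro instance the volume actually surfaces as an NVMe device (presumably the /dev/nvme1n1 in the node log below). If in doubt, the mapping can be confirmed on the node, since EBS NVMe devices carry the volume ID (minus the dash) as their serial; a sketch, assuming shell access to the node via SSM or SSH:

# run on node i-04a771ff5fadd43fb: SERIAL should read vol0ec38943edd360ec2
lsblk -o NAME,SERIAL,MOUNTPOINT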

The EBS CSI node Pod, though, is saying some not-too-happy things :'(

✦ ➜ k logs -n kube-system ebs-csi-node-qv926 -f
Defaulted container "ebs-plugin" out of: ebs-plugin, node-driver-registrar, liveness-probe
I1110 07:45:09.250327       1 main.go:147] "Region provided via AWS_REGION environment variable" region="eu-central-1"
I1110 07:45:09.250381       1 main.go:149] "Node service requires metadata even if AWS_REGION provided, initializing metadata"
E1110 07:45:14.251672       1 metadata.go:51] "Retrieving IMDS metadata failed, falling back to Kubernetes metadata" err="could not get EC2 instance identity metadata: operation error ec2imds: GetInstanceIdentityDocument, canceled, context deadline exceeded"
I1110 07:45:14.258415       1 metadata.go:55] "Retrieved metadata from Kubernetes"
I1110 07:45:14.258734       1 driver.go:69] "Driver Information" Driver="ebs.csi.aws.com" Version="v1.38.1"
I1110 07:45:15.264842       1 node.go:936] "CSINode Allocatable value is set" nodeName="i-04a771ff5fadd43fb" count=26
I1110 08:09:39.645772       1 mount_linux.go:618] Disk "/dev/nvme1n1" appears to be unformatted, attempting to format as type: "ext4" with options: [-F -m0 /dev/nvme1n1]
I1110 08:09:46.975266       1 mount_linux.go:629] Disk successfully formatted (mkfs): ext4 - /dev/nvme1n1 /var/lib/kubelet/plugins/kubernetes.io/csi/ebs.csi.aws.com/ffeb8e27681e00386ca34317ead7f6aa8189a2f1c4b8080e94779e3a4029071a/globalmount
I1110 08:09:46.975369       1 mount_linux.go:295] Detected OS without systemd
E1110 08:11:40.216138       1 driver.go:108] "GRPC error" err="rpc error: code = Aborted desc = An operation with the given volume=\"vol-0ec38943edd360ec2\" is already in progress"
E1110 08:11:41.321444       1 driver.go:108] "GRPC error" err="rpc error: code = Aborted desc = An operation with the given volume=\"vol-0ec38943edd360ec2\" is already in progress"
E1110 08:11:43.333803       1 driver.go:108] "GRPC error" err="rpc error: code = Aborted desc = An operation with the given volume=\"vol-0ec38943edd360ec2\" is already in progress"
E1110 08:11:47.355531       1 driver.go:108] "GRPC error" err="rpc error: code = Aborted desc = An operation with the given volume=\"vol-0ec38943edd360ec2\" is already in progress"
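
This Aborted error looks like the driver's per-volume in-flight guard on the node service: the NodeStage call that ran the mkfs above apparently never returned, so its in-flight entry for the volume was never released and every retry gets rejected. A sketch for hunting the stuck call in the node plugin log (pod and container names from above; how much of the Node*Volume traffic is logged depends on the driver's verbosity):

✦ ➜ kubectl logs -n kube-system ebs-csi-node-qv926 -c ebs-plugin --timestamps \
      | grep -iE 'nodestage|nodepublish|mount|vol-0ec38943edd360ec2'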

The controller, on the other hand, seems happy:

✦ ❯ kubectl logs -n kube-system deployment/ebs-csi-controller -c ebs-plugin | \
  grep "vol-0ec38943edd360ec2" | tail -20
Found 2 pods, using pod/ebs-csi-controller-5569459555-mqkcv
I1110 08:09:33.632537       1 controller.go:397] "ControllerPublishVolume: called" args="volume_id:\"vol-0ec38943edd360ec2\" node_id:\"i-04a771ff5fadd43fb\" volume_capability:{mount:{fs_type:\"ext4\"} access_mode:{mode:SINGLE_NODE_WRITER}} volume_context:{key:\"storage.kubernetes.io/csiProvisionerIdentity\" value:\"1762760519115-7247-ebs.csi.aws.com\"}"
I1110 08:09:33.632643       1 controller.go:410] "ControllerPublishVolume: attaching" volumeID="vol-0ec38943edd360ec2" nodeID="i-04a771ff5fadd43fb"
I1110 08:09:34.763429       1 cloud.go:960] "[Debug] AttachVolume" volumeID="vol-0ec38943edd360ec2" nodeID="i-04a771ff5fadd43fb" resp={"AssociatedResource":null,"AttachTime":"2025-11-10T08:09:34.564Z","DeleteOnTermination":null,"Device":"/dev/xvdaa","InstanceId":"i-04a771ff5fadd43fb","InstanceOwningService":null,"State":"attaching","VolumeId":"vol-0ec38943edd360ec2","ResultMetadata":{}}
I1110 08:09:35.251660       1 manager.go:190] "[Debug] Releasing in-process" attachment entry="/dev/xvdaa" volume="vol-0ec38943edd360ec2"
I1110 08:09:35.251674       1 controller.go:419] "ControllerPublishVolume: attached" volumeID="vol-0ec38943edd360ec2" nodeID="i-04a771ff5fadd43fb" devicePath="/dev/xvdaa"
I1110 08:09:35.251695       1 inflight.go:74] "Node Service: volume operation finished" key="vol-0ec38943edd360ec2i-04a771ff5fadd43fb"
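
So the controller finished its part (the attach succeeded and its in-flight entry was released); the stuck operation is on the node side. The same Aborted error should also surface as FailedMount events on the workload Pod; a sketch:

✦ ➜ kubectl get events -A --field-selector reason=FailedMount \
      | grep vol-0ec38943edd360ec2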

I tried deleting the relevant Pod ebs-csi-node-qv926, but it is stuck in the Terminating state. Before doing anything more drastic, I wanted to ask here for advice.
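
The more drastic option would be to force-delete that node Pod and let the DaemonSet recreate it; a sketch (force deletion only removes the Pod object from the API, it does not detach the EBS volume):

✦ ➜ kubectl delete pod -n kube-system ebs-csi-node-qv926 --grace-period=0 --force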

What you expected to happen?

The volume is attached and the Pod mounts the PVC.

How to reproduce it (as minimally and precisely as possible)?

Difficult to reproduce; I'm not sure how.

Anything else we need to know?:

Environment
Kubernetes version (kubectl version):
Client Version: v1.33.3
Kustomize Version: v5.6.0
Server Version: v1.33.5

Driver version: aws-ebs-csi-driver v1.38.1
