Open
Labels
bug: Something isn't working
Description of the bug
Thank you, Keep team! This looks like a really great, growing project!
So, the problem is:
With an IAM role that has all the needed permissions and a Kubernetes (EKS) ServiceAccount annotated with that role, Keep still cannot access the cluster.
Logs from keep-backend when using the Kubernetes ServiceAccount annotated with the IAM role:
{"worker_type": "uvicorn", "asctime": "2025-05-02 06:11:22,676", "message": "Error validating Kubernetes API scopes", "levelname": "ERROR", "name": "eks", "filename": "eks_provider.py", "otelTraceID": "c24a71dcef056bccdacae15d379d54f9", "otelSpanID": "3179b997e36e1734", "otelTraceSampled": true, "otelServiceName": "keep-api", "threadName": "MainThread", "process": 18, "module": "eks_provider", "exc_info": "Traceback (most recent call last):\n File \"/venv/lib/python3.11/site-packages/keep/providers/eks_provider/eks_provider.py\", line 600, in __generate_client\n cluster_info = eks_client.describe_cluster(\n ^^^^^^^^^^^^^^^^^^^^^^^^^^^^\n File \"/venv/lib/python3.11/site-packages/botocore/client.py\", line 569, in _api_call\n return self._make_api_call(operation_name, kwargs)\n ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^\n File \"/venv/lib/python3.11/site-packages/botocore/client.py\", line 1023, in _make_api_call\n raise error_class(parsed_response, operation_name)\nbotocore.exceptions.ClientError: An error occurred (UnrecognizedClientException) when calling the DescribeCluster operation: The security token included in the request is invalid.\n\nDuring handling of the above exception, another exception occurred:\n\nTraceback (most recent call last):\n File \"/venv/lib/python3.11/site-packages/keep/providers/eks_provider/eks_provider.py\", line 243, in validate_scopes\n k8s_client = self.client # This will initialize connection to cluster\n ^^^^^^^^^^^\n File \"/venv/lib/python3.11/site-packages/keep/providers/eks_provider/eks_provider.py\", line 329, in client\n self._client = self.__generate_client()\n ^^^^^^^^^^^^^^^^^^^^^^^^\n File \"/venv/lib/python3.11/site-packages/keep/providers/eks_provider/eks_provider.py\", line 634, in __generate_client\n raise ProviderException(f\"Failed to generate EKS client: {e}\")\nkeep.exceptions.provider_exception.ProviderException: Failed to generate EKS client: An error occurred (UnrecognizedClientException) when calling the 
DescribeCluster operation: The security token included in the request is invalid."}
{"worker_type": "uvicorn", "asctime": "2025-05-02 06:11:22,679", "message": "Completed scope validation", "levelname": "INFO", "name": "eks", "filename": "eks_provider.py", "otelTraceID": "c24a71dcef056bccdacae15d379d54f9", "otelSpanID": "3179b997e36e1734", "otelTraceSampled": true, "otelServiceName": "keep-api", "threadName": "MainThread", "process": 18, "module": "eks_provider"}
{"worker_type": "uvicorn", "asctime": "2025-05-02 06:11:22,682", "message": "Failed to validate mandatory provider scopes", "levelname": "WARNING", "name": "keep.providers.providers_service", "filename": "providers_service.py", "otelTraceID": "c24a71dcef056bccdacae15d379d54f9", "otelSpanID": "3179b997e36e1734", "otelTraceSampled": true, "otelServiceName": "keep-api", "threadName": "MainThread", "process": 18, "module": "providers_service", "validated_scopes": {"eks:DescribeCluster": "An error occurred (UnrecognizedClientException) when calling the DescribeCluster operation: The security token included in the request is invalid.", "eks:ListClusters": "An error occurred (UnrecognizedClientException) when calling the ListClusters operation: The security token included in the request is invalid.", "pods:delete": "Failed to generate EKS client: An error occurred (UnrecognizedClientException) when calling the DescribeCluster operation: The security token included in the request is invalid.", "deployments:scale": "Failed to generate EKS client: An error occurred (UnrecognizedClientException) when calling the DescribeCluster operation: The security token included in the request is invalid.", "pods:list": "Failed to generate EKS client: An error occurred (UnrecognizedClientException) when calling the DescribeCluster operation: The security token included in the request is invalid.", "pods:get": "Failed to generate EKS client: An error occurred (UnrecognizedClientException) when calling the DescribeCluster operation: The security token included in the request is invalid.", "pods:logs": "Failed to generate EKS client: An error occurred (UnrecognizedClientException) when calling the DescribeCluster operation: The security token included in the request is invalid."}}
{"worker_type": "uvicorn", "asctime": "2025-05-02 06:11:22,683", "message": "Failed to validate mandatory provider scopes, returning 412", "levelname": "ERROR", "name": "keep.api.routes.providers", "filename": "providers.py", "otelTraceID": "c24a71dcef056bccdacae15d379d54f9", "otelSpanID": "3179b997e36e1734", "otelTraceSampled": true, "otelServiceName": "keep-api", "threadName": "MainThread", "process": 18, "module": "providers", "provider_id": "eks", "provider_type": "eks", "tenant_id": "keep"}
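One possible reading of the failure above (an assumption, not confirmed against the Keep source) is that explicit or empty static keys reach boto3 and bypass the web identity credentials that EKS injects for the annotated ServiceAccount. A minimal sketch of how to check for that is below. `irsa_env_status` and `build_client_kwargs` are hypothetical helpers, not part of Keep; `AWS_ROLE_ARN` and `AWS_WEB_IDENTITY_TOKEN_FILE` are the variables EKS sets in pods that use an annotated ServiceAccount.

```python
import os
from typing import Mapping, Optional


def irsa_env_status(env: Mapping[str, str] = os.environ) -> dict:
    """Report which credential-related variables are visible to the process."""
    return {
        "role_arn_set": bool(env.get("AWS_ROLE_ARN")),
        "token_file_set": bool(env.get("AWS_WEB_IDENTITY_TOKEN_FILE")),
        # Static keys in the environment are consulted before web identity
        # in boto3's default credential chain, so they would mask the role.
        "static_keys_set": bool(env.get("AWS_ACCESS_KEY_ID")),
    }


def build_client_kwargs(
    region: str,
    access_key: Optional[str] = None,
    secret_key: Optional[str] = None,
) -> dict:
    """Build kwargs for boto3.client, passing keys only when both are non-empty.

    Leaving the keys out lets boto3 fall back to its default credential
    chain, which includes the web identity token used by IRSA.
    """
    kwargs = {"region_name": region}
    if access_key and secret_key:
        kwargs["aws_access_key_id"] = access_key
        kwargs["aws_secret_access_key"] = secret_key
    return kwargs
```

Inside the pod, `irsa_env_status()` shows whether the role variables were injected, and `boto3.client("eks", **build_client_kwargs("us-east-1"))` would then rely on the default chain rather than on static keys.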
And here are the logs I get when connecting with the same permissions but using an IAM user key pair:
{"worker_type": "uvicorn", "asctime": "2025-05-02 06:30:33,350", "message": "Installing provider", "levelname": "INFO", "name": "keep.providers.providers_service", "filename": "providers_service.py", "otelTraceID": "7466f0274a16c46dea7013b0be8a584f", "otelSpanID": "7bbeada4b8f1709f", "otelTraceSampled": true, "otelServiceName": "keep-api", "threadName": "MainThread", "process": 27, "module": "providers_service", "provider_id": "eks", "provider_type": "eks", "tenant_id": "keep"}
{"worker_type": "uvicorn", "asctime": "2025-05-02 06:30:33,351", "message": "Starting EKS API permissions validation", "levelname": "INFO", "name": "eks", "filename": "eks_provider.py", "otelTraceID": "7466f0274a16c46dea7013b0be8a584f", "otelSpanID": "7bbeada4b8f1709f", "otelTraceSampled": true, "otelServiceName": "keep-api", "threadName": "MainThread", "process": 27, "module": "eks_provider"}
{"worker_type": "uvicorn", "asctime": "2025-05-02 06:30:33,356", "message": "Validating eks:ListClusters permission", "levelname": "INFO", "name": "eks", "filename": "eks_provider.py", "otelTraceID": "7466f0274a16c46dea7013b0be8a584f", "otelSpanID": "7bbeada4b8f1709f", "otelTraceSampled": true, "otelServiceName": "keep-api", "threadName": "MainThread", "process": 27, "module": "eks_provider"}
{"worker_type": "uvicorn", "asctime": "2025-05-02 06:30:33,425", "message": "eks:ListClusters permission validated successfully", "levelname": "INFO", "name": "eks", "filename": "eks_provider.py", "otelTraceID": "7466f0274a16c46dea7013b0be8a584f", "otelSpanID": "7bbeada4b8f1709f", "otelTraceSampled": true, "otelServiceName": "keep-api", "threadName": "MainThread", "process": 27, "module": "eks_provider"}
{"worker_type": "uvicorn", "asctime": "2025-05-02 06:30:33,425", "message": "Validating eks:DescribeCluster permission", "levelname": "INFO", "name": "eks", "filename": "eks_provider.py", "otelTraceID": "7466f0274a16c46dea7013b0be8a584f", "otelSpanID": "7bbeada4b8f1709f", "otelTraceSampled": true, "otelServiceName": "keep-api", "threadName": "MainThread", "process": 27, "module": "eks_provider"}
{"worker_type": "uvicorn", "asctime": "2025-05-02 06:30:33,561", "message": "eks:DescribeCluster permission validated successfully", "levelname": "INFO", "name": "eks", "filename": "eks_provider.py", "otelTraceID": "7466f0274a16c46dea7013b0be8a584f", "otelSpanID": "7bbeada4b8f1709f", "otelTraceSampled": true, "otelServiceName": "keep-api", "threadName": "MainThread", "process": 27, "module": "eks_provider"}
{"worker_type": "uvicorn", "asctime": "2025-05-02 06:30:33,561", "message": "Starting Kubernetes API permissions validation", "levelname": "INFO", "name": "eks", "filename": "eks_provider.py", "otelTraceID": "7466f0274a16c46dea7013b0be8a584f", "otelSpanID": "7bbeada4b8f1709f", "otelTraceSampled": true, "otelServiceName": "keep-api", "threadName": "MainThread", "process": 27, "module": "eks_provider"}
{"worker_type": "uvicorn", "asctime": "2025-05-02 06:30:33,746", "message": "Error validating Kubernetes API scopes", "levelname": "ERROR", "name": "eks", "filename": "eks_provider.py", "otelTraceID": "7466f0274a16c46dea7013b0be8a584f", "otelSpanID": "7bbeada4b8f1709f", "otelTraceSampled": true, "otelServiceName": "keep-api", "threadName": "MainThread", "process": 27, "module": "eks_provider", "exc_info": "Traceback (most recent call last):\n File \"/venv/lib/python3.11/site-packages/keep/providers/eks_provider/eks_provider.py\", line 626, in __generate_client\n \"users\": [{\"name\": \"aws_user\", \"user\": {\"token\": self.__get_token()}}],\n ^^^^^^^^^^^^^^^^^^\n File \"/venv/lib/python3.11/site-packages/keep/providers/eks_provider/eks_provider.py\", line 639, in __get_token\n from awscli.customizations.eks.get_token import STSClientFactory, TokenGenerator\nModuleNotFoundError: No module named 'awscli'\n\nDuring handling of the above exception, another exception occurred:\n\nTraceback (most recent call last):\n File \"/venv/lib/python3.11/site-packages/keep/providers/eks_provider/eks_provider.py\", line 243, in validate_scopes\n k8s_client = self.client # This will initialize connection to cluster\n ^^^^^^^^^^^\n File \"/venv/lib/python3.11/site-packages/keep/providers/eks_provider/eks_provider.py\", line 329, in client\n self._client = self.__generate_client()\n ^^^^^^^^^^^^^^^^^^^^^^^^\n File \"/venv/lib/python3.11/site-packages/keep/providers/eks_provider/eks_provider.py\", line 634, in __generate_client\n raise ProviderException(f\"Failed to generate EKS client: {e}\")\nkeep.exceptions.provider_exception.ProviderException: Failed to generate EKS client: No module named 'awscli'"}
{"worker_type": "uvicorn", "asctime": "2025-05-02 06:30:33,748", "message": "Completed scope validation", "levelname": "INFO", "name": "eks", "filename": "eks_provider.py", "otelTraceID": "7466f0274a16c46dea7013b0be8a584f", "otelSpanID": "7bbeada4b8f1709f", "otelTraceSampled": true, "otelServiceName": "keep-api", "threadName": "MainThread", "process": 27, "module": "eks_provider"}
{"worker_type": "uvicorn", "asctime": "2025-05-02 06:30:33,750", "message": "Validated provider scopes", "levelname": "INFO", "name": "keep.providers.providers_service", "filename": "providers_service.py", "otelTraceID": "7466f0274a16c46dea7013b0be8a584f", "otelSpanID": "7bbeada4b8f1709f", "otelTraceSampled": true, "otelServiceName": "keep-api", "threadName": "MainThread", "process": 27, "module": "providers_service", "validated_scopes": {"eks:DescribeCluster": true, "eks:ListClusters": true, "pods:delete": "Failed to generate EKS client: No module named 'awscli'", "deployments:scale": "Failed to generate EKS client: No module named 'awscli'", "pods:list": "Failed to generate EKS client: No module named 'awscli'", "pods:get": "Failed to generate EKS client: No module named 'awscli'", "pods:logs": "Failed to generate EKS client: No module named 'awscli'"}}
{"worker_type": "uvicorn", "asctime": "2025-05-02 06:30:33,752", "message": "Writing secret", "levelname": "INFO", "name": "keep.secretmanager.secretmanager", "filename": "kubernetessecretmanager.py", "otelTraceID": "7466f0274a16c46dea7013b0be8a584f", "otelSpanID": "7bbeada4b8f1709f", "otelTraceSampled": true, "otelServiceName": "keep-api", "threadName": "MainThread", "process": 27, "module": "kubernetessecretmanager", "secret_name": "keep-eks-8c7069ea88c94b51bbe1759b6b170485"}
{"worker_type": "uvicorn", "asctime": "2025-05-02 06:30:33,789", "message": "Secret created/updated successfully", "levelname": "INFO", "name": "keep.secretmanager.secretmanager", "filename": "kubernetessecretmanager.py", "otelTraceID": "7466f0274a16c46dea7013b0be8a584f", "otelSpanID": "7bbeada4b8f1709f", "otelTraceSampled": true, "otelServiceName": "keep-api", "threadName": "MainThread", "process": 27, "module": "kubernetessecretmanager", "secret_name": "keep-eks-8c7069ea88c94b51bbe1759b6b170485"}
{"worker_type": "uvicorn", "asctime": "2025-05-02 06:30:33,974", "message": "Getting secret", "levelname": "INFO", "name": "keep.secretmanager.secretmanager", "filename": "kubernetessecretmanager.py", "otelTraceID": "746de70d5dd86c5f2b8775301337d14e", "otelSpanID": "b1b65c9d1aaecd56", "otelTraceSampled": true, "otelServiceName": "keep-api", "threadName": "AnyIO worker thread", "process": 27, "module": "kubernetessecretmanager", "secret_name": "keep-eks-8c7069ea88c94b51bbe1759b6b170485"}
{"worker_type": "uvicorn", "asctime": "2025-05-02 06:30:33,992", "message": "Got secret successfully", "levelname": "INFO", "name": "keep.secretmanager.secretmanager", "filename": "kubernetessecretmanager.py", "otelTraceID": "746de70d5dd86c5f2b8775301337d14e", "otelSpanID": "b1b65c9d1aaecd56", "otelTraceSampled": true, "otelServiceName": "keep-api", "threadName": "AnyIO worker thread", "process": 27, "module": "kubernetessecretmanager", "secret_name": "keep-eks-8c7069ea88c94b51bbe1759b6b170485"}
{"worker_type": "uvicorn", "asctime": "2025-05-02 06:30:42,835", "message": "Getting secret", "levelname": "INFO", "name": "keep.secretmanager.secretmanager", "filename": "kubernetessecretmanager.py", "otelTraceID": "f3ea5035001949cabeb4d713e59d2531", "otelSpanID": "847c41bf2285f9c4", "otelTraceSampled": true, "otelServiceName": "keep-api", "threadName": "AnyIO worker thread", "process": 27, "module": "kubernetessecretmanager", "secret_name": "keep-eks-8c7069ea88c94b51bbe1759b6b170485"}
{"worker_type": "uvicorn", "asctime": "2025-05-02 06:30:42,846", "message": "Got secret successfully", "levelname": "INFO", "name": "keep.secretmanager.secretmanager", "filename": "kubernetessecretmanager.py", "otelTraceID": "f3ea5035001949cabeb4d713e59d2531", "otelSpanID": "847c41bf2285f9c4", "otelTraceSampled": true, "otelServiceName": "keep-api", "threadName": "AnyIO worker thread", "process": 27, "module": "kubernetessecretmanager", "secret_name": "keep-eks-8c7069ea88c94b51bbe1759b6b170485"}
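The second trace fails because `__get_token` imports `awscli.customizations.eks.get_token`, and `awscli` is not installed in the image. As an illustration only, an EKS bearer token can also be produced with boto3 alone by presigning an STS `GetCallerIdentity` request. The sketch below is not Keep's implementation; the function names and the 60-second expiry are assumptions, and `boto3` is imported lazily so the encoding helper can be used without it.

```python
import base64

TOKEN_PREFIX = "k8s-aws-v1."


def encode_eks_token(presigned_url: str) -> str:
    """Wrap a presigned STS URL in the EKS bearer-token format."""
    encoded = base64.urlsafe_b64encode(presigned_url.encode("utf-8")).decode("utf-8")
    # EKS tokens are base64url without trailing padding.
    return TOKEN_PREFIX + encoded.rstrip("=")


def get_eks_token(cluster_name: str, region: str, expires_in: int = 60) -> str:
    """Presign sts:GetCallerIdentity for the cluster and encode it as a token."""
    # Imported here so the module loads even where boto3 is unavailable.
    import boto3
    from botocore.signers import RequestSigner

    session = boto3.session.Session()
    sts = session.client("sts", region_name=region)
    signer = RequestSigner(
        sts.meta.service_model.service_id,
        region,
        "sts",
        "v4",
        session.get_credentials(),  # default chain, including web identity
        session.events,
    )
    request = {
        "method": "GET",
        "url": f"https://sts.{region}.amazonaws.com/"
               "?Action=GetCallerIdentity&Version=2011-06-15",
        "body": {},
        # The cluster name is bound into the signature through this header.
        "headers": {"x-k8s-aws-id": cluster_name},
        "context": {},
    }
    url = signer.generate_presigned_url(
        request, region_name=region, expires_in=expires_in, operation_name=""
    )
    return encode_eks_token(url)
```

The returned string would go where the trace shows `{"token": self.__get_token()}` in the generated kubeconfig user entry. Whether Keep should drop the `awscli` dependency or add it to the image is a decision for the maintainers.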
Steps To Reproduce
- Create an IAM role (with a permission policy and a trust policy) that has the policies arn:aws:iam::aws:policy/AmazonEKSServicePolicy and arn:aws:iam::aws:policy/AmazonEKSClusterPolicy attached, plus the specific policy below:
{
  "Statement": [
    {
      "Action": [
        "eks:DescribeCluster",
        "eks:ListClusters"
      ],
      "Effect": "Allow",
      "Resource": "arn:aws:eks:us-east-1:<your-acc-id>:cluster/*",
      "Sid": "EksClusterReadOnly"
    },
    {
      "Action": [
        "eks:AccessKubernetesApi"
      ],
      "Effect": "Allow",
      "Resource": "arn:aws:eks:us-east-1:<your-acc-id>:cluster/*",
      "Sid": "EksK8sApiReadOnly"
    },
    {
      "Action": [
        "autoscaling:DescribeAutoScalingGroups",
        "ec2:Describe*",
        "ec2:GetSecurityGroupsForVpc",
        "elasticloadbalancing:Describe*",
        "iam:ListAttachedRolePolicies",
        "kms:DescribeKey",
        "logs:DescribeLogStreams"
      ],
      "Effect": "Allow",
      "Resource": "*",
      "Sid": "AwsEKSReadOnlyResources"
    }
  ],
  "Version": "2012-10-17"
}
- Create a Kubernetes ServiceAccount annotated with this role:
apiVersion: v1
kind: ServiceAccount
metadata:
  name: keep
  namespace: monitoring
  annotations:
    eks.amazonaws.com/role-arn: arn:aws:iam::<your-acc-id>:role/keep
- Set it as the ServiceAccount for the Keep services in the OSS Helm chart:
serviceAccount:
  create: false
  annotations: {}
  name: "keep"
<...>
frontend:
  <...>
  serviceAccount:
    create: false
    annotations: {}
    name: "keep"
<etc>
- Run the release
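The steps above mention a trust policy but do not show it. For reference, a typical trust policy for IAM roles for service accounts has the shape sketched below. The OIDC provider value is a placeholder (the real one depends on the cluster), and `irsa_trust_policy` is a hypothetical helper written for this illustration; the namespace and ServiceAccount name are taken from the manifest in the steps.

```python
import json


def irsa_trust_policy(
    account_id: str, oidc_provider: str, namespace: str, service_account: str
) -> dict:
    """Build a trust policy that lets one ServiceAccount assume the role."""
    return {
        "Version": "2012-10-17",
        "Statement": [
            {
                "Effect": "Allow",
                "Principal": {
                    "Federated": f"arn:aws:iam::{account_id}:oidc-provider/{oidc_provider}"
                },
                "Action": "sts:AssumeRoleWithWebIdentity",
                "Condition": {
                    "StringEquals": {
                        f"{oidc_provider}:aud": "sts.amazonaws.com",
                        # Must match the namespace and name of the ServiceAccount.
                        f"{oidc_provider}:sub": (
                            f"system:serviceaccount:{namespace}:{service_account}"
                        ),
                    }
                },
            }
        ],
    }


if __name__ == "__main__":
    # Placeholder OIDC provider; substitute the cluster's own issuer host/path.
    policy = irsa_trust_policy(
        "<your-acc-id>",
        "oidc.eks.us-east-1.amazonaws.com/id/EXAMPLE",
        "monitoring",
        "keep",
    )
    print(json.dumps(policy, indent=2))
```

If the `sub` condition does not match `system:serviceaccount:monitoring:keep`, the role cannot be assumed from the pod, so this is worth comparing against the role actually used when reproducing the issue.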
Additional Information
Working with an IAM user key pair is not secure enough (for GDPR, ISO 27001, etc.) because the credentials are never rotated.
More context in slack: URL
talboren