forked from kubernetes-sigs/cluster-api-provider-aws
🐛fix: retrieve controlplane host from Cluster object #7
Merged
supershal merged 1 commit into nutanix-cloud-native:main from supershal:shalin/eks-cp-host on Oct 22, 2025
dkoshkin
approved these changes
Oct 22, 2025
Collaborator
dkoshkin
left a comment
Please create a ticket to find the proper fix, but this works for now.
Thanks for the quick turnaround!
dlipovetsky
approved these changes
Oct 22, 2025
Collaborator
Author
Jira ticket created in our internal system to investigate the root cause.
faiq
pushed a commit
that referenced
this pull request
Nov 26, 2025
* deps: upgrade Kubernetes dependencies to v0.33.4 - Update core Kubernetes dependencies from v0.32.3 to v0.33.4: - k8s.io/api, k8s.io/apimachinery, k8s.io/client-go - k8s.io/apiserver, k8s.io/cli-runtime, k8s.io/kubectl - k8s.io/apiextensions-apiserver, k8s.io/component-base - Upgrade prometheus/client_golang from v1.19.1 to v1.22.0 - Update cel.dev/expr from v0.18.0 to v0.19.1 - Upgrade google/cel-go from v0.22.0 to v0.23.2 - Update golang.org/x/time from v0.8.0 to v0.9.0 - Upgrade gRPC from v1.67.3 to v1.68.1 - Update OpenTelemetry packages to v1.33.0 - Refresh k8s.io/utils and other indirect dependencies - Update kube-openapi and structured-merge-diff versions * deps: update cluster-api to v1.11.1 and controller-runtime to v0.21.0 - Upgrade cluster-api from v1.10.2 to v1.11.1 - Upgrade controller-runtime from v0.20.4 to v0.21.0 - Update various golang.org/x/* packages - Update testing dependencies (ginkgo, gomega) - Update OpenTelemetry and other indirect dependencies * WIP no IDE errors * WIP IDE Errors * Fix go dependencies Signed-off-by: Borja Clemente <[email protected]> * Update imports, code and generations to CAPI 1.11 - Update all imports to v1beta2 types except for conditions staying in v1beta1. - Adapt source code to work with v1beta2 and deprecated conditions. - Manually update conversions. Signed-off-by: Borja Clemente <[email protected]> * Update linting pkg alias and fix broken imports blocks Signed-off-by: Borja Clemente <[email protected]> * Remove unnecessary Paused constants Signed-off-by: Borja Clemente <[email protected]> * Fix import aliases Signed-off-by: Borja Clemente <[email protected]> * Fix broken imports Signed-off-by: Borja Clemente <[email protected]> * Revert public APIS back to v1beta1 while internally using v1beta2 Introducing v1beta2 on public types is a breaking change so they have to stay in v1beta1. Internally though, migration to v1beta2 is happening (except for conditions). 
Signed-off-by: Borja Clemente <[email protected]> * Revert infrav1 conditions to v1beta1 and consolidate imports Signed-off-by: Borja Clemente <[email protected]> * Consolidate conditions imports and fix linting Signed-off-by: Borja Clemente <[email protected]> * Fix regression in machine deployments without failure domain set Signed-off-by: Borja Clemente <[email protected]> * Revert missing public APIs to v1beta1 Signed-off-by: Borja Clemente <[email protected]> * Consolidate infrav1beta1 imports into infrav1 Signed-off-by: Borja Clemente <[email protected]> * Remove unused conditions constants Signed-off-by: Borja Clemente <[email protected]> * Fix setting wrong condition type Signed-off-by: Borja Clemente <[email protected]> * Cast v1beta1 conditions instead of creating a new constant Signed-off-by: Borja Clemente <[email protected]> * Revert changed public APIs and adapt internally to v1beta2 Signed-off-by: Borja Clemente <[email protected]> * Resolve conflicts with main Signed-off-by: Borja Clemente <[email protected]> * Add deprecated CAPI imports linter rule Add rule to allow using deprecated v1beta1 CAPI APIs and removed linter comments everywhere. 
Signed-off-by: Borja Clemente <[email protected]> * Apply review corrections Signed-off-by: Borja Clemente <[email protected]> * Adjust e2e and metadata versions Signed-off-by: Borja Clemente <[email protected]> * Apply review feedback on awscluster_webhook Signed-off-by: Borja Clemente <[email protected]> * FIx unit tests Signed-off-by: Borja Clemente <[email protected]> * Review feedback Signed-off-by: Borja Clemente <[email protected]> * Apply review feedback Signed-off-by: Borja Clemente <[email protected]> * Add CRD RBAC to the awsmachine controller Signed-off-by: Borja Clemente <[email protected]> * e2e: add v1beta1 CAPI scheme to clients and adjust modifyFunc test to use the new field name * Fix linting issues Signed-off-by: Borja Clemente <[email protected]> * Fix nodeDrainTimeoutSeconds field in clusterclass test Signed-off-by: Borja Clemente <[email protected]> * e2e: fix contract for CAPI * fix path again * e2e: fix contract for capa 9.99.99 (#3) * e2e: use correct type for setting field (#4) * rosa: deflake unit test (#5) * rosa: deflake unit test * fixup * e2e: fix config metadata and contract version pinning (#6) * e2e: fix config metadata file path Signed-off-by: Borja Clemente <[email protected]> * Bump KCP Template for clusterclass changes (#7) --------- Signed-off-by: Borja Clemente <[email protected]> Co-authored-by: Bryan Cox <[email protected]> Co-authored-by: Christian Schlotter <[email protected]>
What type of PR is this?
/kind bug
What this PR does / why we need it:
The nodeadmconfig template sets ControlPlaneEndpoint.Host from the AWSManagedCluster object. However, something intermittently resets the value of ControlPlaneEndpoint.Host in the AWSManagedCluster object. The value is present when a nodepool is created for the first time, but it is gone when we try to add another nodepool later. (I have not investigated what removes it.)
As a result, nodeadmconfig creation fails intermittently. We get the following error from CAPA
The following are the logs from the code in our CAPA fork that fetches ControlPlaneEndpoint.Host from AWSManagedCluster objects. Notice how the value gets reset at some point.
This PR fetches the ControlPlaneEndpoint from the Cluster object, where it is always present.
Which issue(s) this PR fixes (optional, in fixes #<issue number>(, fixes #<issue_number>, ...) format, will close the issue(s) when PR gets merged):
Fixes #
Special notes for your reviewer:
We still need to investigate what resets ControlPlaneEndpoint.Host in AWSManagedCluster objects. That can be fixed in a separate PR.
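For reviewers, here is a rough sketch of the idea behind the change, not the actual diff. It assumes the host is read from the Cluster object's Spec.ControlPlaneEndpoint instead of from the AWSManagedCluster. The types below are simplified local stand-ins so the snippet compiles on its own, and the helper name controlPlaneHost and the error value are hypothetical.

```go
package main

import (
	"errors"
	"fmt"
)

// APIEndpoint is a local stand-in for the Cluster API endpoint type (host and port).
type APIEndpoint struct {
	Host string
	Port int32
}

// Cluster is a simplified stand-in for the CAPI Cluster object.
type Cluster struct {
	Name string
	Spec struct {
		ControlPlaneEndpoint APIEndpoint
	}
}

// AWSManagedCluster is a simplified stand-in for the CAPA infrastructure object.
type AWSManagedCluster struct {
	Name string
	Spec struct {
		ControlPlaneEndpoint APIEndpoint
	}
}

// errNoControlPlaneHost is a hypothetical error used only in this sketch.
var errNoControlPlaneHost = errors.New("control plane endpoint host is not set on the Cluster")

// controlPlaneHost (hypothetical helper) returns the API server host from the
// Cluster object. It fails explicitly rather than rendering a template with an
// empty host.
func controlPlaneHost(cluster *Cluster) (string, error) {
	if cluster == nil || cluster.Spec.ControlPlaneEndpoint.Host == "" {
		return "", errNoControlPlaneHost
	}
	return cluster.Spec.ControlPlaneEndpoint.Host, nil
}

func main() {
	cluster := &Cluster{Name: "demo"}
	cluster.Spec.ControlPlaneEndpoint = APIEndpoint{Host: "demo.example.invalid", Port: 443}

	// Simulates the reported state: the host on the infrastructure object is empty.
	infra := &AWSManagedCluster{Name: "demo"}

	host, err := controlPlaneHost(cluster)
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	fmt.Printf("AWSManagedCluster host=%q, Cluster host=%q\n", infra.Spec.ControlPlaneEndpoint.Host, host)
}
```

Returning an error when the host is empty is a design choice of this sketch. It makes a missing endpoint visible at reconcile time instead of producing a nodeadmconfig with a blank API server address.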
Checklist:
Release note: