When using a Flux Kustomization with postBuild.substituteFrom, variables in spec.components are intermittently left unsubstituted when reconciliation starts. The build then fails on the literal, unresolved path.
Example error:
kustomization/aws-load-balancer-controller master@sha1:eb8e9d54 False False
kustomize build failed: accumulating components: loader.New "must build at directory: not a valid directory:
evalsymlink failure on '/tmp/kustomization-2899191055/infrastructure/addons/aws-load-balancer-controller/overlays/environments/${environment_name_full}'
: lstat /tmp/kustomization-2899191055/infrastructure/addons/aws-load-balancer-controller/overlays/environments/${environment_name_full}: no such file or directory"
After some retries the reconciliation succeeds and the variables are substituted correctly, which may point to a race condition in kustomize-controller (I have not confirmed this).
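For illustration only: the path in the error above is exactly what a spec.components entry looks like while `${environment_name_full}` is still unresolved. Below is a minimal Python sketch of that substitution step. It uses `string.Template` as a stand-in and is not the controller's implementation (kustomize-controller is written in Go). The variable values are copied from the cell-metadata ConfigMap in this report, and the helper names `substitute` and `unresolved` are hypothetical.

```python
from string import Template

# Values copied from the cell-metadata ConfigMap in this report.
cell_metadata = {
    "environment_name_full": "development",
    "aws_region": "eu-north-1",
    "aws_cluster_name": "example",
}

# spec.components entries as written in flux-kustomization.yaml.
components = [
    "../environments/${environment_name_full}",
    "../regions/${aws_region}",
]


def substitute(paths, variables):
    """Replace ${name} placeholders; unknown names are left untouched."""
    return [Template(p).safe_substitute(variables) for p in paths]


def unresolved(paths):
    """Return the entries that still contain a ${...} placeholder."""
    return [p for p in paths if "${" in p]


if __name__ == "__main__":
    print("before:", unresolved(components))
    print("after: ", unresolved(substitute(components, cell_metadata)))
```

If the object applied to the cluster still contains an entry reported by `unresolved`, the build will look for a directory literally named `${environment_name_full}`, which matches the lstat failure in the error message.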
We have the following setup:
flowchart TB
  subgraph clusters
    a2["infra-addons-flux-kustomization.yaml"]
    a3["extra/kustomization.yaml (active addons)"]
  end
  subgraph infra["infrastructure/addons/*"]
    b["addon folders (one per addon)<br>flux-kustomization → overlays/base → env/region components"]
  end
  subgraph subst["cell-metadata (configmap)"]
    d["postBuild.substituteFrom<br>(${env}, ${region}, ${cluster}, etc.)"]
  end
  subgraph flux
    e["kustomize controller"]
    f["helm controller"]
  end
  g["rendered workloads in cluster"]
  %% flow
  a2 --> a3
  a3 --> infra
  infra --> e
  d --> e
  e -->|"helmrelease crs"| f
  f --> g
0. In-Cluster ConfigMap
apiVersion: v1
kind: ConfigMap
metadata:
  name: cell-metadata
  namespace: flux-system
data:
  environment_name_full: development
  aws_region: eu-north-1
  aws_cluster_name: example
1. Cluster-Level Entrypoint
./clusters/eu-north-1/example/infra-addons-flux-kustomization.yaml:
apiVersion: kustomize.toolkit.fluxcd.io/v1
kind: Kustomization
metadata:
  name: infrastructure
  namespace: flux-system
spec:
  postBuild:
    substituteFrom:
      - kind: ConfigMap
        name: cell-metadata
        optional: false
  interval: 1m0s
  sourceRef:
    kind: GitRepository
    name: flux-system
  path: ./clusters/eu-north-1/example/extra
  prune: true
./clusters/eu-north-1/example/extra/kustomization.yaml:
apiVersion: kustomize.config.k8s.io/v1beta1
kind: Kustomization
resources:
  - ../../../../infrastructure/sources
  - ../../../../infrastructure/addons/aws-load-balancer-controller
patches:
  - target:
      kind: Kustomization
      group: kustomize.toolkit.fluxcd.io
    patch: |-
      - op: add
        path: /spec/postBuild
        value:
          substituteFrom:
            - kind: ConfigMap
              name: cell-metadata
              optional: false
2. Example addon (aws-load-balancer-controller)
infrastructure/addons/aws-load-balancer-controller:
├── flux-kustomization.yaml
├── kustomization.yaml
├── namespace.yaml
└── overlays
    ├── base/
    │   ├── helmrelease.yaml
    │   └── kustomization.yaml
    ├── environments/
    │   ├── development/kustomization.yaml
    │   └── staging/kustomization.yaml
    └── regions/
        └── eu-north-1/kustomization.yaml
infrastructure/addons/aws-load-balancer-controller/flux-kustomization.yaml
apiVersion: kustomize.toolkit.fluxcd.io/v1
kind: Kustomization
metadata:
  name: aws-load-balancer-controller
  namespace: flux-system
spec:
  interval: 5m0s
  path: ./infrastructure/addons/aws-load-balancer-controller/overlays/base
  prune: true
  components:
    - ../environments/${environment_name_full}
    - ../regions/${aws_region}
  sourceRef:
    kind: GitRepository
    name: flux-system
  deletionPolicy: WaitForTermination
I'm running:
- macOS 15.6.1 (the local OS should not matter, since the failure occurs in the controller).
- flux: v2.4.0