Description
Related: PRs #4183, #4185 | Issue #4180
Status: Investigation Complete | Target: 1.36+
Summary
kubernetes/release pins docker-ce-cli to 24.0.x (API 1.43) because the test-infra DinD images run older Docker daemons. The latest docker-ce-cli, 29.0.0 (API 1.52), fails against those daemons with: "client version 1.52 is too new. Maximum supported API version is 1.43"
Root Cause: test-infra Dockerfiles don't pin Docker versions, so production images built months ago still ship Docker 24.x (API 1.43), while the 29.0.0 client speaks API 1.52 and no longer works with API versions below 1.44
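The failure mode above can be illustrated with a minimal sketch of version negotiation. This is a simplified model, not the Docker client's actual logic: it assumes the client steps down to the daemon's maximum API version unless that falls below the client's own minimum (1.44 for 29.0.0, per the findings in this issue). The function names are hypothetical.

```python
def as_tuple(version: str) -> tuple[int, int]:
    """Parse "1.43" into (1, 43) so versions compare numerically, not as strings."""
    major, minor = version.split(".")
    return int(major), int(minor)


def negotiate(client_max: str, client_min: str, daemon_max: str) -> str:
    """Return the API version both sides can speak, or raise if there is none.

    Simplified model (assumption): the client steps down to the daemon's
    maximum unless that falls below the client's own minimum.
    """
    if as_tuple(client_max) <= as_tuple(daemon_max):
        return client_max
    if as_tuple(daemon_max) >= as_tuple(client_min):
        return daemon_max
    raise RuntimeError(
        f"client version {client_max} is too new. "
        f"Maximum supported API version is {daemon_max}"
    )


# Versions taken from this issue: client 29.0.0 (API 1.52, minimum 1.44)
# against a 24.0.x daemon (API 1.43) and a 27.x daemon (API 1.46).
for daemon_max in ("1.43", "1.46"):
    try:
        print(daemon_max, "->", negotiate("1.52", "1.44", daemon_max))
    except RuntimeError as err:
        print(daemon_max, "->", err)
```

With these inputs, the 1.43 daemon is rejected and the 1.46 daemon negotiates to 1.46, which is why pinning the daemon to 27.x would let the client pin be dropped.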
Goals
- Pin Docker daemon version in test-infra images (recommend: 27.x for API 1.46)
- Remove docker-ce-cli pinning from kubernetes/release
- Validate 357 DinD-enabled job configs across multiple SIGs
- Minimal disruption to releases
Investigation Findings
DinD Images & Scale
- Images: bootstrap and kubekins-e2e-v2 in test-infra install docker-ce without version pinning
- Impact: 1,737 DinD-enabled jobs across 357 config files (release, CSI, cloud providers, networking, storage, node, Cluster API, KIND)
- Latest available: Docker 29.0.0 (API 1.52, breaks API < 1.44)
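A quick way to spot the unpinned installs is to scan Dockerfile text for package names that appear without an `=version` suffix. The sketch below is a rough check under assumptions: the sample Dockerfile content and the `DOCKER_VERSION` build argument are hypothetical, not copied from test-infra, and the scan only understands apt-style `package=version` pins.

```python
def unpinned_packages(dockerfile_text: str,
                      packages=("docker-ce", "docker-ce-cli")) -> list[str]:
    """Return the watched packages that appear as a bare name (no "=version")."""
    found = []
    for token in dockerfile_text.split():
        # Drop line-continuation backslashes or semicolons stuck to the token.
        name = token.rstrip("\\;")
        # A pinned install looks like "docker-ce=<version>", which never
        # equals the bare package name, so it is skipped here.
        if name in packages and name not in found:
            found.append(name)
    return found


# Hypothetical Dockerfile fragment: docker-ce is unpinned, docker-ce-cli is pinned.
SAMPLE = """\
RUN apt-get update && apt-get install -y \\
    docker-ce \\
    docker-ce-cli=${DOCKER_VERSION}
"""

print(unpinned_packages(SAMPLE))
```

A check like this could run in presubmit so that a later edit does not silently drop the pin again.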
Docker Version Mapping
- Docker 24.0.x → API 1.43 (current in kubernetes/release)
- Docker 27.x → API 1.46 ⭐ Recommended
- Docker 29.0.0 → API 1.52 (breaks older clients)
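The mapping above can be turned into a small lookup to see which daemon pins would work with an unpinned (29.0.0) client. This is a sketch only: the table holds just the three entries listed in this issue, and the 1.44 floor is the figure quoted in the findings, not something the code verifies.

```python
# Docker release -> API version, copied from the mapping in this issue.
DOCKER_TO_API = {
    "24.0.x": "1.43",
    "27.x": "1.46",
    "29.0.0": "1.52",
}

# Lowest daemon API version the 29.0.0 client still works with (per this issue).
MIN_API_FOR_NEW_CLIENT = "1.44"


def as_tuple(version: str) -> tuple[int, int]:
    """Parse "1.46" into (1, 46) so that 1.9 < 1.43 compares correctly."""
    major, minor = version.split(".")
    return int(major), int(minor)


def usable_daemon_pins(mapping: dict[str, str], min_api: str) -> list[str]:
    """List Docker releases whose API version is at or above the client's floor."""
    floor = as_tuple(min_api)
    return [docker for docker, api in mapping.items() if as_tuple(api) >= floor]


print(usable_daemon_pins(DOCKER_TO_API, MIN_API_FOR_NEW_CLIENT))
```

Of the listed releases, only 24.0.x falls below the floor, so either 27.x or 29.0.0 would unblock the client.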
Key Files
- images/bootstrap/Dockerfile & images/kubekins-e2e-v2/Dockerfile - need version pinning
- images/bootstrap/runner.sh - DinD initialization
- 357 job configs with preset-dind-enabled: "true"
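To re-derive the list of affected configs, a plain-text scan for the label is usually enough. The sketch below assumes the configs are `*.yaml` files under a single root and counts label occurrences per file. That count only approximates the number of jobs, since it does not parse YAML or resolve presets. The demo runs against a temporary directory with made-up files.

```python
import re
import tempfile
from pathlib import Path

# Matches the label as written in this issue; whitespace after the colon may vary.
DIND_LABEL = re.compile(r'preset-dind-enabled:\s*"true"')


def count_dind_labels(root: Path) -> dict[str, int]:
    """Map each YAML file under root to the number of DinD label occurrences in it."""
    counts = {}
    for path in sorted(root.rglob("*.yaml")):
        hits = len(DIND_LABEL.findall(path.read_text(encoding="utf-8")))
        if hits:
            counts[path.relative_to(root).as_posix()] = hits
    return counts


if __name__ == "__main__":
    # Demo on throwaway files; point `root` at a real config tree instead.
    with tempfile.TemporaryDirectory() as tmp:
        root = Path(tmp)
        (root / "with-dind.yaml").write_text(
            'labels:\n  preset-dind-enabled: "true"\n', encoding="utf-8")
        (root / "without-dind.yaml").write_text(
            'labels:\n  preset-service-account: "true"\n', encoding="utf-8")
        result = count_dind_labels(root)
        print(len(result), "files,", sum(result.values()), "label occurrences")
```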
Implementation Plan
Phase 1: Research ✅ - Investigation complete
Phase 2: Development - Pin Docker version in Dockerfiles, build staging images
Phase 3: Testing - Validate critical jobs (release, CSI, KIND)
Phase 4: Rollout - Non-critical → critical jobs, unpin docker-ce-cli in kubernetes/release
Phase 5: Cleanup - Remove PR #4183 workaround, update docs
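The Phase 4 ordering (non-critical first, critical last) can be sketched as a simple partition of job names. Everything here is illustrative: the job names are invented, and matching on the substrings "release", "csi", and "kind" is an assumption about naming that would need checking against the real configs.

```python
# Assumed markers for critical jobs, based on the critical paths named in this issue.
CRITICAL_MARKERS = ("release", "csi", "kind")


def is_critical(job_name: str) -> bool:
    """Treat a job as critical if its name contains any assumed marker."""
    lowered = job_name.lower()
    return any(marker in lowered for marker in CRITICAL_MARKERS)


def rollout_waves(job_names: list[str]) -> list[list[str]]:
    """Split jobs into two waves: non-critical first, critical second."""
    non_critical = [job for job in job_names if not is_critical(job)]
    critical = [job for job in job_names if is_critical(job)]
    return [non_critical, critical]


# Hypothetical job names, for illustration only.
jobs = ["ci-example-node-e2e", "ci-example-release-build", "pull-example-csi-driver"]
for number, wave in enumerate(rollout_waves(jobs), start=1):
    print(f"wave {number}: {wave}")
```

Substring matching is deliberately crude. An explicit allowlist of critical jobs would be safer once the list is agreed with SIG Testing and SIG Release.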
Risks & Mitigations
- Docker 29.x breaks API < 1.44 → Use Docker 27.x instead
- 357 job configs (1,737 jobs) to validate → Automate testing, focus on critical paths (release, CSI)
- Job failures during rollout → Phased rollout, maintain rollback capability
- KIND compatibility → Validate in testing phase
Next Steps
- Decide Docker version: 27.x (recommended) vs 29.x
- Pin version in bootstrap and kubekins-e2e-v2 Dockerfiles
- Build staging images, test with critical jobs
- Phased rollout to production
- Remove docker-ce-cli pin from kubernetes/release
Coordination: SIG Testing, SIG Release (discuss in mailing list/meetings)