fix: restore post-connect fallback for multi-network stacks on API < 1.44#13629
fix: restore post-connect fallback for multi-network stacks on API < 1.44#13629jotka wants to merge 2 commits intodocker:mainfrom
Conversation
af22a2b to
7895a5a
Compare
|
Good catch to flag — I had the same instinct initially. But after scanning the codebase, this comparison is safe due to a structural guarantee in compose-go:
The only case where the loop runs is the standard That said, the codebase itself identifies primary by position ( for i, networkKey := range serviceNetworks {
if i == 0 {
continue // primary, already in ContainerCreate
}Functionally equivalent here, but arguably more explicit about intent. Let me know. |
There was a problem hiding this comment.
Pull request overview
Restores Docker Compose’s backward-compatible multi-network container creation behavior for Docker Engine API versions older than 1.44, preventing ContainerCreate failures on legacy daemons (e.g., Docker 20.10 / Synology DSM).
Changes:
- Reintroduced API-version gating so
ContainerCreateonly includes multipleEndpointsConfigentries whenapiVersion >= 1.44. - Restored the legacy post-create network attachment path (
NetworkConnect) forapiVersion < 1.44, and ensuredContainerInspecthappens after legacy network connects. - Added/updated unit tests covering the
< 1.44behavior in both network config generation and container creation.
Reviewed changes
Copilot reviewed 5 out of 5 changed files in this pull request and generated no comments.
Show a summary per file
| File | Description |
|---|---|
| pkg/compose/create.go | Adds APIVersion to create configs and gates extra EndpointsConfig entries behind apiVersion >= 1.44. |
| pkg/compose/convergence.go | Restores legacy NetworkConnect loop for < 1.44 before inspecting the container. |
| pkg/compose/api_versions.go | Documents/defines the apiVersion144 threshold and its behavioral implications. |
| pkg/compose/create_test.go | Adds a subtest asserting only the primary network is included for API 1.43. |
| pkg/compose/convergence_test.go | Adds a legacy API test asserting NetworkConnect is used and ContainerInspect occurs after connects. |
💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.
glours
left a comment
There was a problem hiding this comment.
Hey @jotka
Thanks for this fix, unfortunately it could introduced creation of unwanted orphan containers when the process can't create endpoint or attach to networks.
There is also a separate issue where the fallback decision is based on the daemon’s advertised max API version instead of the negotiated client API version; that part has been handled separately in PR #13690.
| epSettings, err := createEndpointSettings(project, service, number, networkKey, cfgs.Links, opts.UseNetworkAliases) | ||
| if err != nil { | ||
| return created, err | ||
| } | ||
| if _, err := s.apiClient().NetworkConnect(ctx, mobyNetworkName, client.NetworkConnectOptions{ | ||
| Container: response.ID, | ||
| EndpointConfig: epSettings, | ||
| }); err != nil { | ||
| return created, err |
There was a problem hiding this comment.
For API < 1.44, this PR now creates the container first and then attaches secondary networks one by one here. If one of these NetworkConnect() calls fails, the function returns immediately and leaves the just-created container behind on the primary network. That means the legacy fallback is no longer atomic on failure, and a later up / recreate can hit a name conflict against the orphaned container.
I think we should clean up the created container on NetworkConnect failure before returning.
There was a problem hiding this comment.
Good catch, thanks! Added ContainerRemove with Force:true on both error paths (createEndpointSettings and NetworkConnect). Also added a test for the failure case. Re the API version negotiation — noted, will rebase on top of #13690 once it lands.
There was a problem hiding this comment.
FYI PR #13690 has been merged
ps: don't forgot to sign-off your commits 😉
8e26cb7 to
1e13ed6
Compare
|
Rebased on main (post go1.25.8, moby/client v1.54.0, etc.) and fixed lint issues (line length in test closures). All tests pass locally. |
Before API 1.44 (Docker Engine 25.0), ContainerCreate only accepted a single EndpointsConfig entry. Passing multiple entries silently drops all but one network, leaving containers under-connected on older daemons such as Docker 20.10 or Synology DSM 7.1/7.2. Add apiVersion144 constant and wrap the secondary-networks loop in defaultNetworkSettings so that only the primary network is included in the ContainerCreate call when the negotiated API version is below 1.44. Signed-off-by: jarek <jkrochmalski@gmail.com>
For Docker daemons older than API 1.44, the extra networks omitted from ContainerCreate must be connected individually after creation via NetworkConnect. If any NetworkConnect call fails, remove the freshly created container to prevent orphans. Add two tests: - TestCreateMobyContainerLegacyAPI: happy path verifying NetworkConnect is called for the secondary network on API 1.43 - TestCreateMobyContainerLegacyAPI_NetworkConnectFailure: verifies ContainerRemove is called with Force when NetworkConnect fails Signed-off-by: jarek <jkrochmalski@gmail.com>
1e13ed6 to
4e64b82
Compare
|
Rebased on main (post #13690 merge). Changes:
All tests pass, linter clean. |
glours
left a comment
There was a problem hiding this comment.
Sounds good to me 👍
Thanks for the contribution
I'll do a release later this week so you can update Dockhand on your side
Codecov Report❌ Patch coverage is
📢 Thoughts on this report? Let us know! |
Problem
Docker Compose 5.1.0 broke multi-network stacks on Docker daemons older than API 1.44 (Docker Engine < 23.0), including Synology DSM 7.1 (API 1.41) and DSM 7.2 (API 1.43). The error is:
This is a regression from 5.0.2 which worked correctly on these daemons.
Closes #13628
Context
I maintain Dockhand, a Docker management application. Many of our users run Dockhand on Synology NAS devices (DSM 7.1/7.2), which ship with Docker 20.10 (API 1.41). This regression broke multi-network stack deployments for all of them when we updated docker-compose to 5.1.0. We had to cap docker-compose at 5.0.2 as a workaround while investigating the root cause.
Root Cause
Both regressions were introduced in commit bfb5511 (
go.mod: bump github.com/moby/moby/api v1.53.0, moby/client v0.2.2):1.
defaultNetworkSettings()increate.goThe
versions.GreaterThanOrEqualTo(version, APIVersion144)guard was removed during the moby API bump, makingdefaultNetworkSettingsunconditionally include ALL networks inEndpointsConfigforContainerCreate. Old daemons (< API 1.44) only accept a single entry and reject the request with the error above.2. The post-connect fallback was removed from
convergence.goThe same commit removed the
NetworkConnectfallback loop fromcreateMobyContainer(). Previously (since aaa7ef6), afterContainerCreate, the code checkedversions.LessThan(apiVersion, "1.44")and connected extra networks one-by-one viaNetworkConnect(). This entire block was dropped during the API migration.Fix
First commit (
4f14ab8): Restore backward compatRestores the two behaviors from 5.0.2:
create.go—defaultNetworkSettings(): Only include extra networks inEndpointsConfigwhenapiVersion >= 1.44(guarded by!versions.LessThan(version, apiVersion144)).convergence.go—createMobyContainer(): Restore the post-connect fallback. After container creation, whenapiVersion < 1.44, iterate the service's extra networks and callNetworkConnect()for each, skipping the primary network (already configured viaNetworkModeinContainerCreate).Second commit (
7895a5a)Pass
APIVersionthroughcreateConfigs: AddedAPIVersionfield to thecreateConfigsstruct, populated from the existingapiVersionlocal ingetCreateConfigs(). This eliminates the redundants.RuntimeVersion(ctx)call increateMobyContainer.Reorder
ContainerInspectafterNetworkConnect: Moved the inspect call after theNetworkConnectloop so the returnedcontainer.Summaryincludes all attached networks.Unit tests for the < 1.44 code path:
TestDefaultNetworkSettingssub-test with version"1.43"— assertsEndpointsConfigcontains only the primary network.TestCreateMobyContainerLegacyAPI— assertsContainerCreategets only the primary network,NetworkConnectis called for the secondary, andContainerInspectruns afterNetworkConnect.Testing
TestRunHook_ConsoleSizefailure is a pre-existing PTY crash on macOS/arm64, unrelated to this change and present onmainbefore this fix)ContainerCreatepath