Skip to content

fix: allow containers to start using a large numbers of ports #4290

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Open
wants to merge 2 commits into
base: main
Choose a base branch
from

Conversation

haytok
Copy link
Contributor

@haytok haytok commented May 25, 2025

Suppose we have a compose.yaml that allocates a large numbers of ports as follows.

> cat compose.yaml
services:
  svc0:
    image: alpine
    command: "sleep infinity"
    ports:
      - '32000-32060:32000-32060'

When we run nerdctl compose up -d using this compose.yaml, we will get the following error.

FATA[0000] create container failed validation: containers.Labels: label key and value length (4711 bytes) greater than maximum size (4096 bytes), key: nerdctl/ports: invalid argument
FATA[0000] error while creating container haytok-svc0-1: error while creating container haytok-svc0-1: exit status 1

This issue is reported in the following issue.

This issue is considered to be the same as the one with errors when trying to perform many port mappings, such as nerdctl run -p 80:80 -p 81:81 ~ -p 1000:1000 ...

The current implementation is processing to create a container with the information specified in -p to the label.
And as can be seen from the error message, as the number of ports to be port mapped increases, the creation of the container fails because it violates the limit of the maximum number of bytes on the containerd side that can be allocated for a label.

Therefore, this PR modifies the container creation process so that containers can be launched without having to assign the information specified in the -p option to the labels.

Specifically, port mapping information is stored in the following path, and when port mapping information is required, it is retrieved from this file.

<DATAROOT>/<ADDRHASH>/containers/<NAMESPACE>/<CID>/port-mappings.json

@haytok
Copy link
Contributor Author

haytok commented May 25, 2025

Because of my modifications, CI fails.
So, I'll check the root cause of CI failures.

@apostasie
Copy link
Contributor

Why are we not storing the range itself in the label?

@haytok
Copy link
Contributor Author

haytok commented May 27, 2025

Hi, @apostasie Thanks for commnets!

Why are we not storing the range itself in the label?

I also think of a way to keep the range of ports in the label, but the following edge case would violate the upper limit of the number of labels.
Specifically, when host, port and protocol are random, it is difficult to combine ranges.

nerdctl run -p 127.0.0.1:32000:32000 -p 127.0.0.2:32001:32001 ~ -p 127.0.0.254:32253:32253 nginx

Considering these cases, I think that when the port mapping information is given to the Label, it would violate the upper limit, so I implemented the way of this PR so that the port mapping information is not retained in the Label.


When nerdctl compose up is executed, the ports specified in compose.yaml are expanded and executed by the following function:

The ports are expanded and executed as nerdctl run -p 32000:32000 ~ -p 32060:32060 ....

@haytok haytok force-pushed the issue_4027 branch 2 times, most recently from 02db289 to c6e55d1 Compare May 28, 2025 01:01
@AkihiroSuda AkihiroSuda added this to the v2.1.3 milestone May 30, 2025
@apostasie
Copy link
Contributor

@AkihiroSuda so, this PR does move some of the network configuration information from labels (or annotations) to the filesystem.

Future work to support per-network IP assignment (and options) will require significant re-jiggle of other annotations (nerdctl/network, nerdctl/ip, etc).

That overall does beg the question: should we stop using labels/annotations to store network configuration altogether, and just use a filesystem store for that instead?
Was there a specific reason we wanted network config to be stored on the container in the first place?

Answer to this question should inform whether we should consider expanding the scope of this PR a little bit to come up with a generic net-config-store that would in the future accommodate for other changes and additions ^.

@AkihiroSuda
Copy link
Member

That overall does beg the question: should we stop using labels/annotations to store network configuration altogether, and just use a filesystem store for that instead?

👍

We have to keep existing containers functional, though, if we are going to merge this in v2.1.x.

@haytok
Copy link
Contributor Author

haytok commented Jun 1, 2025

Thanks for comments !!!

I understand that it is acceptable to implement a policy of storing port mapping information in a local file.
Therefore, I'll proceed with the implementation in this PR and make corrections so that file writing can be performed atomically.

@haytok haytok force-pushed the issue_4027 branch 2 times, most recently from 854dce9 to 16ddd16 Compare June 2, 2025 14:28
Suppose we have a compose.yaml that allocates a large numbers of ports as
follows.

```
> cat compose.yaml
services:
  svc0:
    image: alpine
    command: "sleep infinity"
    ports:
      - '32000-32060:32000-32060'
```

When we run `nerdctl compose up -d` using this compose.yaml, we will get
the following error.

```
FATA[0000] create container failed validation: containers.Labels: label key and value length (4711 bytes) greater than maximum size (4096 bytes), key: nerdctl/ports: invalid argument
FATA[0000] error while creating container haytok-svc0-1: error while creating container haytok-svc0-1: exit status 1
```

This issue is reported in the following issue.

- containerd#4027

This issue is considered to be the same as the one with errors when
trying to perform many port mappings, such as `nerdctl run -p 80:80 -p 81:81 ~ -p 1000:1000 ...`

The current implementation is processing to create a container with the
information specified in -p to the label.
And as can be seen from the error message, as the number of ports to be
port mapped increases, the creation of the container fails because it
violates the limit of the maximum number of bytes on the containerd side
that can be allocated for a label.

Therefore, this PR modifies the container creation process so that
containers can be launched without having to assign the information
specified in the -p option to the labels.

Specifically, port mapping information is stored in the following path,
and when port mapping information is required, it is retrieved from this
file.

```
<DATAROOT>/<ADDRHASH>/containers/<NAMESPACE>/<CID>/network-config.json
```

Signed-off-by: Hayato Kiwata <[email protected]>
@haytok
Copy link
Contributor Author

haytok commented Jun 6, 2025

Hi @AkihiroSuda @apostasie

Could you review when you have time ? 🙏

@@ -57,9 +57,6 @@ const (
// Currently, the length of the slice must be 1.
Networks = Prefix + "networks"

// Ports is a JSON-marshalled string of []cni.PortMapping .
Ports = Prefix + "ports"
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This change seems to break ports of containers created with an existing version of nerdctl ?

Then probably we should postpone this to v2.2.0

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for comments!!!

This change is not backward compatible, so these backward incompatible changes should be postponed until the update from 2.1 to 2.2.
If so, is it okay to leave this PR open for now until the release of 2.2 approaches?
I'll rebase as necessary and ask you to review this PR when the release of 2.2 approaches.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, please.

There should be also a human-friendly warning message when an incompatible container exists.

Copy link
Contributor Author

@haytok haytok Jun 7, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'll add handling for cases where there are containers that are not backward compatible (e.g., when containers created using the -p option are already exist).

@AkihiroSuda AkihiroSuda modified the milestones: v2.1.3, v2.2.0 (?) Jun 6, 2025
@@ -892,12 +892,6 @@ func NetworkOptionsFromSpec(spec *specs.Spec) (types.NetworkOptions, error) {
}
opts.NetworkSlice = networks

if portsJSON := spec.Annotations[labels.Ports]; portsJSON != "" {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

PortMappings no longer set here?

@apostasie
Copy link
Contributor

@haytok

What if we make this change backward compatible?

eg:

func LoadPortMappings(dataStore, namespace, id string) ([]cni.PortMapping, error) {
      // if there is no port information from NetworkStore, look into the labels

@apostasie
Copy link
Contributor

apostasie commented Jun 6, 2025

@haytok a few comments above - also I think we should make this backward compatible (at least for reading the ports from the labels if they are there).

@haytok
Copy link
Contributor Author

haytok commented Jun 7, 2025

Hi @apostasie

Thanks for commens!!!

I hadn't considered backward compatibility at all, so I'll update based on your advices to make backward compatible.

@haytok
Copy link
Contributor Author

haytok commented Jun 17, 2025

I've added the logic to maintain backward compatibility.

However, maybe all failures are flaky ...

canary / go canary failures are related to #4318.

  • [flaky, see #3988] / EL / almalinux-9 rootful (pull_request) Failing after 16m
Details

2025-06-17T14:53:34.7593504Z     container_logs_test.go:313: 
2025-06-17T14:53:34.7593560Z         
2025-06-17T14:53:34.7593715Z         +------------------------------------------------------------------------------------------------------------+
2025-06-17T14:53:34.7594063Z         | ➡️      | ⚙️ /usr/local/bin/nerdctl logs -f testlogswithoutnewlineoreof-f625832d                           |
2025-06-17T14:53:34.7594222Z         +------------------------------------------------------------------------------------------------------------+
2025-06-17T14:53:34.7594417Z         | 🌱      | HOME=/root                                                                                       |
2025-06-17T14:53:34.7594667Z         |         | PATH=/usr/local/go/bin:/usr/local/go/bin:/usr/local/sbin:/usr/local/bin:/usr/sbin:/usr/bin:/sbin |
2025-06-17T14:53:34.7594770Z         |         | :/bin                                                                                            |
2025-06-17T14:53:34.7595027Z         |         | DOCKER_CONFIG=/tmp/TestLogsWithoutNewlineOrEOF3385429920/003                                     |
2025-06-17T14:53:34.7595290Z         |         | NERDCTL_TOML=/tmp/TestLogsWithoutNewlineOrEOF3385429920/003/nerdctl.toml                         |
2025-06-17T14:53:34.7595526Z         +------------------------------------------------------------------------------------------------------------+
2025-06-17T14:53:34.7595727Z         || <1s (limit: 3m0s)                                                                                |
2025-06-17T14:53:34.7595888Z         +------------------------------------------------------------------------------------------------------------+
2025-06-17T14:53:34.7596214Z         | 📁      | /tmp/TestLogsWithoutNewlineOrEOF3385429920/002                                                   |
2025-06-17T14:53:34.7596372Z         +------------------------------------------------------------------------------------------------------------+
2025-06-17T14:53:34.7596434Z         
2025-06-17T14:53:34.7596977Z     container_logs_test.go:313: 	🔗 �]8;;file:///go/src/github.com/containerd/nerdctl/cmd/nerdctl/container/container_logs_test.go#313:1�testCase.Run(t)�]8;;��[0m
2025-06-17T14:53:34.7597042Z         
2025-06-17T14:53:34.7597110Z         <<<<<<<<<<<<<<<<<<<<
2025-06-17T14:53:34.7597242Z         	🖊️ Inspecting output (equals)
2025-06-17T14:53:34.7597339Z         	👀 testing:		``
2025-06-17T14:53:34.7597450Z         	❌ FAILED!		= `'Hello World!
2025-06-17T14:53:34.7597532Z         There is no newline'`
2025-06-17T14:53:34.7597595Z         >>>>>>>>>>>>>>>>>>>>

  • lint / go / linux (go canary) (pull_request) Failing after 1m
Details

Error: ../../../go/pkg/mod/golang.org/x/[email protected]/internal/tokeninternal/tokeninternal.go:78:9: invalid array length -delta * delta (constant -256 of type int64)
make: *** [Makefile:221: install-dev-tools] Error 1
Error: Process completed with exit code 2.

  • test / in-container / rootful (canary) (pull_request) Failing after 5m
Details

Dockerfile:321
--------------------
 319 |     ENV PATH=/usr/local/go/bin:$PATH
 320 |     ARG GOTESTSUM_VERSION
 321 | >>> RUN GOBIN=/usr/local/bin go install gotest.tools/gotestsum@${GOTESTSUM_VERSION}
 322 |     COPY . /go/src/github.com/containerd/nerdctl
 323 |     WORKDIR /go/src/github.com/containerd/nerdctl
--------------------
ERROR: failed to solve: process "/bin/sh -c GOBIN=/usr/local/bin go install gotest.tools/gotestsum@${GOTESTSUM_VERSION}" did not complete successfully: exit code: 1
Error: Process completed with exit code 1.

  • test / in-host / windows (canary) (pull_request) Failing after 3m
Details

Error: C:\Users\runneradmin\go\pkg\mod\golang.org\x\[email protected]\internal\tokeninternal\tokeninternal.go:78:9: invalid array length -delta * delta (constant -256 of type int64)
make: *** [Makefile:221: install-dev-tools] Error 1
Error: Process completed with exit code 2.

@apostasie
Copy link
Contributor

Yes, all invalid array length -delta * delta (constant -256 of type int64) you can ignore.

TestLogsWithoutNewlineOrEOF is likely #4147

}
log.L.Warnf("container %s is using legacy port mapping configuration. To ensure compatibility with the new port mapping logic, please recreate this container. For more details, see: https://github.com/containerd/nerdctl/pull/4290", id[:12])
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Maybe we actually do not need this warning?
We are compatible with legacy port mapping, so, it is transparent for users - eg: there is no real reason to urge them to recreate containers until we break compatibility / remove the compat layer in 2.X.

Let see what others think about this though ^

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Based on the advice I received below, this warning message is displayed.

If at some stage (release) the logic for obtaining port mapping information from the label is to be deprecated, I think this warning message should be displayed.

However, I would like to hear the opinions of other maintainers as well.


return ns.safeStore.WithLock(func() error {
doesExist, err := ns.safeStore.Exists(networkConfigName)
if err != nil {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nit:

if err != nil || !doesExist {
  return err
}

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks !!! fixed.

}

data, err := ns.safeStore.Get(networkConfigName)
if err == nil {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nit:

Personally, I find it more readable to first deal with error conditions (and return).
Then the rest does not have to be inside an if/then test.

Eg:

if err != nil {
  if errors.Is(err, store.ErrNotFound) {
    err = nil
  }
  return err
}

var ports []cni.PortMapping
if err := json.Unmarshal(data, &ports); err != nil {
  return fmt.Errorf("failed to parse port mappings %v: %w", ports, err)
}
ns.PortMappings = ports

return nil

This is really personal preference, so, if you don't like it, fine :)

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The code you suggested is easier to read, so I've fixed based on your advice! Thank you!

@apostasie
Copy link
Contributor

Thanks @haytok - this is good progress on solving this.

  1. I have a couple of (minor) nits, but otherwise looks good
  2. I am still annoyed at the fact we are unable to store internally port ranges
    eg:
nerdctl ps
CONTAINER ID    IMAGE                              COMMAND             CREATED          STATUS    PORTS                                                                                                                                                                                                                                                                                           NAMES
93a2736dd462    docker.io/library/alpine:latest    "sleep infinity"    3 seconds ago    Up        0.0.0.0:22200->22200/tcp, 0.0.0.0:22201->22201/tcp, 0.0.0.0:22202->22202/tcp, 0.0.0.0:22203->22203/tcp, 0.0.0.0:22204->22204/tcp, 0.0.0.0:22205->22205/tcp, 0.0.0.0:22206->22206/tcp, 0.0.0.0:22207->22207/tcp, 0.0.0.0:22208->22208/tcp, 0.0.0.0:22209->22209/tcp, 0.0.0.0:22210->22210/tcp    alpine

This is really ugly ^ - docker does a much better job.

I appreciate this aspect would be more complicated and possibly involve additional changes in compose, but yeah, I would much prefer that:
nerddctl -p 22200-22210:22200-22210 to be stored as ranges and not expanded.

@apostasie
Copy link
Contributor

Note, I am not suggesting that we try to reconstitute ranges from discrete ports specs, but that we do store the ranges when ranges are provided

This commit adds logic to maintain backward compatibility.

Signed-off-by: Hayato Kiwata <[email protected]>
@haytok
Copy link
Contributor Author

haytok commented Jun 20, 2025

Thanks for comments!!! @apostasie

I also think the PORTS display in nerdctl ps and nerdctl compose ps results is hard to read... :(

First, I believe this is caused by the current implementation where the logic for displaying PORTS cannot aggregate ranges.

func FormatPorts(labelMap map[string]string) string {
ports, err := portutil.ParsePortsLabel(labelMap)
if err != nil {
log.L.Error(err.Error())
}
if len(ports) == 0 {
return ""
}
strs := make([]string, len(ports))
for i, p := range ports {
strs[i] = fmt.Sprintf("%s:%d->%d/%s", p.HostIP, p.HostPort, p.ContainerPort, p.Protocol)
}
return strings.Join(strs, ", ")
}

I think this behavior occurs even in the latest version, but to fundamentally fix this readability issue, rather than saving ranges when port ranges are specified in -p, I think rewriting FormatPorts() would lead to a more fundamental solution to the problem.

For example, when we specify port ranges in compose.yaml as shown below, each port mapping gets expanded to -p and nerdctl run is executed.
Therefore, whether ranges are specified in -p or multiple -p options are specified, it would be better if PORTS could display them as ranges.

Details

> cat compose.yaml
services:
  nginx:
    image: nginx:latest
    container_name: nginx
    ports:
      - '80-81:80-81'
      - '80-81:80-81/udp'

> sudo nerdctl compose up
...
INFO[0000] Running [/usr/local/bin/nerdctl run --cidfile=/tmp/compose-2285964703/cid -l=com.docker.compose.project=ec2-user -l=com.docker.compose.service=nginx -d --name=nginx --pull=never --net=ec2-user_default --hostname=nginx -p=80:80/tcp -p=81:81/tcp -p=80:80/udp -p=81:81/udp --restart=no nginx:latest]
...

However, I think fixing this PORTS display is beyond the scope of this PR, so how about creating a separate issue and addressing it there?
I will continue to work on it.
If it would be better to add the logic for merging PORTS display to this PR's commit, I can handle it here!

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants