-
-
Notifications
You must be signed in to change notification settings - Fork 1.8k
Description
Module
Core
Testcontainers version
1.17.5
Using the latest Testcontainers version?
Yes
Host OS
Linux
Host Arch
amd64
Docker version
Client:
Version: 20.10.19+dfsg1
API version: 1.41
Go version: go1.19.2
Git commit: d85ef84
Built: Sat Oct 15 15:47:37 2022
OS/Arch: linux/amd64
Context: default
Experimental: true
Server:
Engine:
Version: 20.10.19+dfsg1
API version: 1.41 (minimum version 1.12)
Go version: go1.19.2
Git commit: c964641
Built: Sat Oct 15 15:47:37 2022
OS/Arch: linux/amd64
Experimental: false
containerd:
Version: 1.6.8~ds1
GitCommit: 1.6.8~ds1-1+b1
runc:
Version: 1.1.4+ds1
GitCommit: 1.1.4+ds1-1
docker-init:
Version: 0.19.0
GitCommit:What happened?
The problem we are seeing is that when the container startup has failed the configured number of times (1 in our case), the container startup isn't immediately aborted; wait strategies are (unnecessarily) given time to succeed.
We are using a number of WaitStrategy instances for waiting on certain messages on startup, like this:
WaitAllStrategy waitAllStrategy = new WaitAllStrategy()
.withStrategy( new LogMessageContainsWaitStrategy( "org.apache.catalina.startup.Catalina.start Server startup in" ) )
.withStrategy( new ClientConnectorWaitStrategy( IntegrationTestConstants.CLIENT_CONNECTOR_PORT, timeout ) )
.withStartupTimeout( timeout );
for ( String webappPath : webappsToPaths.values() ) {
waitAllStrategy.withStrategy( new LogMessageContainsWaitStrategy( "Deployment of web application directory [" + webappPath + "] has finished" ) );
}
waitingFor( waitAllStrategy );I thought the problem could be related to the above since the timeout was exactly timeout seconds in this case (configured to be 240 seconds). Changing this 240 setting to e.g. 10 also made it abort faster, after 10 seconds.
But then I removed the waitingFor call altogether, and this made the startup wait for 60 seconds - the default. This was on Testcontainers 1.15.1 (we are still on an older versions because of some bugs related to the new version). I tested upgrading to 1.17.5 but unfortunately dropping the waitingFor call there uses the default wait strategy which seems to error out in our case, so it's hard to verify if the original problem is still present in 1.17.5. Here's the log for that failure:
[null] WARN 2022-10-27 15:09:19.187 [o.t.c.w.i.InternalCommandPortListeningCheck] :: An exception while executing the internal check: Container.ExecResult(exitCode=137, stdout=/proc/net/tcp: 1: 00000000:1F90 00000000:0000 0A 00000000:00000001 00:00000000 00000000 1000 0 2734309 2 00000000ca212241 100 0 0 10 0
/proc/net/tcp: 2: 0100007F:B210 0100007F:1F90 05 00000000:00000000 03:00001760 00000000 0 0 0 3 00000000793cfa31
/proc/net/tcp: 3: 0100007F:1F90 0100007F:B210 08 00000000:00000001 00:00000000 00000000 0 0 0 1 0000000065057682 20 4 0 10 -1
/proc/net/tcp: 1: 00000000:1F90 00000000:0000 0A 00000000:00000001 00:00000000 00000000 1000 0 2734309 2 00000000ca212241 100 0 0 10 0
/proc/net/tcp: 2: 0100007F:B210 0100007F:1F90 05 00000000:00000000 03:00001747 00000000 0 0 0 3 00000000793cfa31
/proc/net/tcp: 3: 0100007F:1F90 0100007F:B210 08 00000000:00000001 00:00000000 00000000 0 0 0 1 0000000065057682 20 4 0 10 -1
/proc/net/tcp: 1: 00000000:1F90 00000000:0000 0A 00000000:00000001 00:00000000 00000000 1000 0 2734309 2 00000000ca212241 100 0 0 10 0
/proc/net/tcp: 2: 0100007F:B210 0100007F:1F90 05 00000000:00000000 03:0000173A 00000000 0 0 0 3 00000000793cfa31
/proc/net/tcp: 3: 0100007F:1F90 0100007F:B210 08 00000000:00000001 00:00000000 00000000 0 0 0 1 0000000065057682 20 4 0 10 -1
/proc/net/tcp: 1: 00000000:1F90 00000000:0000 0A 00000000:00000001 00:00000000 00000000 1000 0 2734309 2 00000000ca212241 100 0 0 10 0
/proc/net/tcp: 2: 0100007F:B210 0100007F:1F90 05 00000000:00000000 03:0000172E 00000000 0 0 0 3 00000000793cfa31
/proc/net/tcp: 3: 0100007F:1F90 0100007F:B210 08 00000000:00000001 00:00000000 00000000 0 0 0 1 0000000065057682 20 4 0 10 -1
/proc/net/tcp: 1: 00000000:1F90 00000000:0000 0A 00000000:00000001 00:00000000 00000000 1000 0 2734309 2 00000000ca212241 100 0 0 10 0
/proc/net/tcp: 2: 0100007F:B210 0100007F:1F90 05 00000000:00000000 03:00001722 00000000 0 0 0 3 00000000793cfa31
/proc/net/tcp: 3: 0100007F:1F90 0100007F:B210 08 00000000:00000001 00:00000000 00000000 0 0 0 1 0000000065057682 20 4 0 10 -1
/proc/net/tcp: 1: 00000000:1F90 00000000:0000 0A 00000000:00000001 00:00000000 00000000 1000 0 2734309 2 00000000ca212241 100 0 0 10 0
/proc/net/tcp: 2: 0100007F:B210 0100007F:1F90 05 00000000:00000000 03:00001716 00000000 0 0 0 3 00000000793cfa31
/proc/net/tcp: 3: 0100007F:1F90 0100007F:B210 08 00000000:00000001 00:00000000 00000000 0 0 0 1 0000000065057682 20 4 0 10 -1
/proc/net/tcp: 1: 00000000:1F90 00000000:0000 0A 00000000:00000001 00:00000000 00000000 1000 0 2734309 2 00000000ca212241 100 0 0 10 0
/proc/net/tcp: 2: 0100007F:B210 0100007F:1F90 05 00000000:00000000 03:0000170B 00000000 0 0 0 3 00000000793cfa31
/proc/net/tcp: 3: 0100007F:1F90 0100007F:B210 08 00000000:00000001 00:00000000 00000000 0 0 0 1 0000000065057682 20 4 0 10 -1
/proc/net/tcp: 1: 00000000:1F90 00000000:0000 0A 00000000:00000001 00:00000000 00000000 1000 0 2734309 2 00000000ca212241 100 0 0 10 0
/proc/net/tcp: 2: 0100007F:B210 0100007F:1F90 05 00000000:00000000 03:00001700 00000000 0 0 0 3 00000000793cfa31
/proc/net/tcp: 3: 0100007F:1F90 0100007F:B210 08 00000000:00000001 00:00000000 00000000 0 0 0 1 0000000065057682 20 4 0 10 -1
/proc/net/tcp: 1: 00000000:1F90 00000000:0000 0A 00000000:00000001 00:00000000 00000000 1000 0 2734309 2 00000000ca212241 100 0 0 10 0
/proc/net/tcp: 2: 0100007F:B210 0100007F:1F90 05 00000000:00000000 03:000016F4 00000000 0 0 0 3 00000000793cfa31
/proc/net/tcp: 3: 0100007F:1F90 0100007F:B210 08 00000000:00000001 00:00000000 00000000 0 0 0 1 0000000065057682 20 4 0 10 -1
/proc/net/tcp: 1: 00000000:1F90 00000000:0000 0A 00000000:00000001 00:00000000 00000000 1000 0 2734309 2 00000000ca212241 100 0 0 10 0
/proc/net/tcp: 2: 0100007F:B210 0100007F:1F90 05 00000000:00000000 03:000016E9 00000000 0 0 0 3 00000000793cfa31
/proc/net/tcp: 3: 0100007F:1F90 0100007F:B210 08 00000000:00000001 00:00000000 00000000 0 0 0 1 0000000065057682 20 4 0 10 -1
/proc/net/tcp: 1: 00000000:1F90 00000000:0000 0A 00000000:00000001 00:00000000 00000000 1000 0 2734309 2 00000000ca212241 100 0 0 10 0
/proc/net/tcp: 2: 0100007F:B210 0100007F:1F90 05 00000000:00000000 03:000016DD 00000000 0 0 0 3 00000000793cfa31
/proc/net/tcp: 4: 0100007F:1F90 0100007F:B210 08 00000000:00000001 00:00000000 00000000 0 0 0 1 0000000065057682 20 4 0 10 -1
/proc/net/tcp: 1: 00000000:1F90 00000000:0000 0A 00000000:00000001 00:00000000 00000000 1000 0 2734309 2 00000000ca212241 100 0 0 10 0
/proc/net/tcp: 2: 0100007F:B210 0100007F:1F90 05 00000000:00000000 03:000016D2 00000000 0 0 0 3 00000000793cfa31
/proc/net/tcp: 11: 0100007F:1F90 0100007F:B210 08 00000000:00000001 00:00000000 00000000 0 0 0 1 0000000065057682 20 4 0 10 -1
/proc/net/tcp: 1: 00000000:1F90 00000000:0000 0A 00000000:00000001 00:00000000 00000000 1000 0 2734309 2 00000000ca212241 100 0 0 10 0
/proc/net/tcp: 2: 0100007F:B210 0100007F:1F90 05 00000000:00000000 03:000016C7 00000000 0 0 0 3 00000000793cfa31
/proc/net/tcp: 12: 0100007F:1F90 0100007F:B210 08 00000000:00000001 00:00000000 00000000 0 0 0 1 0000000065057682 20 4 0 10 -1
/proc/net/tcp: 1: 00000000:1F90 00000000:0000 0A 00000000:00000001 00:00000000 00000000 1000 0 2734309 2 00000000ca212241 100 0 0 10 0
/proc/net/tcp: 2: 0100007F:B210 0100007F:1F90 05 00000000:00000000 03:000016BA 00000000 0 0 0 3 00000000793cfa31
/proc/net/tcp: 12: 0100007F:1F90 0100007F:B210 08 00000000:00000001 00:00000000 00000000 0 0 0 1 0000000065057682 20 4 0 10 -1
/proc/net/tcp: 1: 00000000:1F90 00000000:0000 0A 00000000:00000001 00:00000000 00000000 1000 0 2734309 2 00000000ca212241 100 0 0 10 0
/proc/net/tcp: 2: 0100007F:B210 0100007F:1F90 05 00000000:00000000 03:000016A2 00000000 0 0 0 3 00000000793cfa31
/proc/net/tcp: 12: 0100007F:1F90 0100007F:B210 08 00000000:00000001 00:00000000 00000000 0 0 0 1 0000000065057682 20 4 0 10 -1
/proc/net/tcp: 1: 00000000:1F90 00000000:0000 0A 00000000:00000001 00:00000000 00000000 1000 0 2734309 2 00000000ca212241 100 0 0 10 0
/proc/net/tcp: 2: 0100007F:B210 0100007F:1F90 05 00000000:00000000 03:0000167C 00000000 0 0 0 3 00000000793cfa31
/proc/net/tcp: 12: 0100007F:1F90 0100007F:B210 08 00000000:00000001 00:00000000 00000000 0 0 0 1 0000000065057682 20 4 0 10 -1
, stderr=/bin/sh: 1: nc: not found
/bin/sh: 1: nc: not found
/bin/bash: connect: Cannot assign requested address
/bin/bash: line 1: /dev/tcp/localhost/9360: Cannot assign requested address
/bin/sh: 1: nc: not found
/bin/bash: connect: Cannot assign requested address
/bin/bash: line 1: /dev/tcp/localhost/9360: Cannot assign requested address
/bin/sh: 1: nc: not found
/bin/bash: connect: Cannot assign requested address
/bin/bash: line 1: /dev/tcp/localhost/9360: Cannot assign requested address
/bin/sh: 1: nc: not found
/bin/bash: connect: Cannot assign requested address
/bin/bash: line 1: /dev/tcp/localhost/9360: Cannot assign requested address
/bin/sh: 1: nc: not found
/bin/bash: connect: Cannot assign requested address
/bin/bash: line 1: /dev/tcp/localhost/9360: Cannot assign requested address
/bin/sh: 1: nc: not found
/bin/bash: connect: Cannot assign requested address
/bin/bash: line 1: /dev/tcp/localhost/9360: Cannot assign requested address
/bin/sh: 1: nc: not found
/bin/bash: connect: Cannot assign requested address
/bin/bash: line 1: /dev/tcp/localhost/9360: Cannot assign requested address
/bin/sh: 1: nc: not found
/bin/bash: connect: Cannot assign requested address
/bin/bash: line 1: /dev/tcp/localhost/9360: Cannot assign requested address
/bin/sh: 1: nc: not found
/bin/bash: connect: Cannot assign requested address
/bin/bash: line 1: /dev/tcp/localhost/9360: Cannot assign requested address
/bin/sh: 1: nc: not found
/bin/bash: connect: Cannot assign requested address
/bin/bash: line 1: /dev/tcp/localhost/9360: Cannot assign requested address
/bin/sh: 1: nc: not found
/bin/bash: connect: Cannot assign requested address
/bin/bash: line 1: /dev/tcp/localhost/9360: Cannot assign requested address
/bin/sh: 1: nc: not found
/bin/bash: connect: Cannot assign requested address
/bin/bash: line 1: /dev/tcp/localhost/9360: Cannot assign requested address
/bin/sh: 1: nc: not found
/bin/bash: connect: Cannot assign requested address
/bin/bash: line 1: /dev/tcp/localhost/9360: Cannot assign requested address
/bin/sh: 1: nc: not found
/bin/bash: connect: Cannot assign requested address
/bin/bash: line 1: /dev/tcp/localhost/9360: Cannot assign requested address
/bin/sh: 1: nc: not found
/bin/bash: connect: Cannot assign requested address
/bin/bash: line 1: /dev/tcp/localhost/9360: Cannot assign requested address
/bin/sh: 1: nc: not found
/bin/bash: connect: Cannot assign requested address
/bin/bash: line 1: /dev/tcp/localhost/9360: Cannot assign requested address
/bin/sh: 1: nc: not found
/bin/bash: connect: Cannot assign requested address
/bin/bash: line 1: /dev/tcp/localhost/9360: Cannot assign requested address
)
This makes the InternalCommandPortListeningCheck incorrectly assume that the container is listening to this port (9360), so it presumes the container is working correctly.
TL;DR
I get the feeling here that the logic is somewhat flawed (or should we say "limited") in Testcontainers currently. If the container fails to startup, this doesn't cancel any waiting WaitStrategy asynchronous operations. Instead, Testcontainers will happily wait until the WaitStrategies are done and only then will the original exception (java.lang.IllegalStateException: Container exited with code 1) be propagated to the caller. This causes unnecessary waiting time; it would be much better if the WaitStrategies could be immediately cancelled and the exception be re-thrown on the calling thread.
Relevant log output
[null] INFO 2022-10-27 15:00:18.022 [?.h.f.5.0.1-java17]] :: Container docker.example.com/example/example-tomcat-8.5:2208.0.1-java17 is starting: 47600b9ebd950dba0b26078b83e9fc1efcc3a2fc18761b48fc4a4b0fcc326d8d
[null] ERROR 2022-10-27 15:00:28.536 [?.h.f.5.0.1-java17]] :: Could not start container
java.lang.IllegalStateException: Container exited with code 1
at org.testcontainers.containers.GenericContainer.tryStart(GenericContainer.java:514)
at org.testcontainers.containers.GenericContainer.lambda$doStart$0(GenericContainer.java:344)
at org.rnorth.ducttape.unreliables.Unreliables.retryUntilSuccess(Unreliables.java:81)
at org.testcontainers.containers.GenericContainer.doStart(GenericContainer.java:334)
at org.testcontainers.containers.GenericContainer.start(GenericContainer.java:322)
at com.example.test.integration.shared.ExampleWebappsTomcatContainer.start(ExampleWebappsTomcatContainer.java:196)
at com.example.test.integration.shared.ExampleWebapps.startTomcat(ExampleWebapps.java:383)
at com.example.test.integration.shared.ExampleWebapps.lambda$start$3(ExampleWebapps.java:132)
at java.base/java.util.concurrent.CompletableFuture$UniAccept.tryFire(CompletableFuture.java:718)
at java.base/java.util.concurrent.CompletableFuture$Completion.run(CompletableFuture.java:482)
at java.base/java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1136)
at java.base/java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:635)
at java.base/java.lang.Thread.run(Thread.java:833)
Caused by: org.testcontainers.containers.ContainerLaunchException: Container startup failed
at org.testcontainers.containers.GenericContainer.doStart(GenericContainer.java:349)
at org.testcontainers.containers.GenericContainer.start(GenericContainer.java:322)
at com.example.test.integration.shared.ExampleWebappsTomcatContainer.start(ExampleWebappsTomcatContainer.java:196)
at com.example.test.integration.shared.ExampleWebapps.startTomcat(ExampleWebapps.java:383)
at com.example.test.integration.shared.ExampleWebapps.lambda$start$3(ExampleWebapps.java:132)
at java.base/java.util.concurrent.CompletableFuture$UniAccept.tryFire(CompletableFuture.java:718)
at java.base/java.util.concurrent.CompletableFuture$Completion.run(CompletableFuture.java:482)
at java.base/java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1136)
at java.base/java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:635)
at java.base/java.lang.Thread.run(Thread.java:833)
Caused by: org.rnorth.ducttape.RetryCountExceededException: Retry limit hit with exception
at org.rnorth.ducttape.unreliables.Unreliables.retryUntilSuccess(Unreliables.java:88)
at org.testcontainers.containers.GenericContainer.doStart(GenericContainer.java:334)
... 9 more
Caused by: org.testcontainers.containers.ContainerLaunchException: Could not create/start container
at org.testcontainers.containers.GenericContainer.tryStart(GenericContainer.java:542)
at org.testcontainers.containers.GenericContainer.lambda$doStart$0(GenericContainer.java:344)
at org.rnorth.ducttape.unreliables.Unreliables.retryUntilSuccess(Unreliables.java:81)
... 10 more
Caused by: java.lang.IllegalStateException: Container exited with code 1
at org.testcontainers.containers.GenericContainer.tryStart(GenericContainer.java:514)
... 12 moreAdditional Information
No response