Skip to content

7116990: (spec) Socket.connect(addr,timeout) not clear if IOException because of TCP timeout #25690

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Open
wants to merge 3 commits into
base: master
Choose a base branch
from

Conversation

jaikiran
Copy link
Member

@jaikiran jaikiran commented Jun 9, 2025

Can I please get a review of this doc-only change which proposes to add a @apiNote to the Socket.connect(SocketAddress endpoint, int timeout) method? This addresses https://bugs.openjdk.org/browse/JDK-7116990.

As noted in that issue, users can find it surprising that when the Socket.connect(...) method is called with a timeout value, then if that timeout value happens to be greater than the connect timeout that operating systems typically impose, then a IOException gets thrown instead of the SocketTimeoutException. The change in this PR proposes to add a @apiNote which explains this current behaviour.

If this requires a CSR, I'll open one once we settle on the proposed text.


Progress

  • Change must be properly reviewed (1 review required, with at least 1 Reviewer)
  • Change must not contain extraneous whitespace
  • Commit message must refer to an issue
  • Change requires CSR request JDK-8359249 to be approved

Issues

  • JDK-7116990: (spec) Socket.connect(addr,timeout) not clear if IOException because of TCP timeout (Bug - P3)
  • JDK-8359249: (spec) Socket.connect(addr,timeout) not clear if IOException because of TCP timeout (CSR)

Reviewers

Reviewing

Using git

Checkout this PR locally:
$ git fetch https://git.openjdk.org/jdk.git pull/25690/head:pull/25690
$ git checkout pull/25690

Update a local copy of the PR:
$ git checkout pull/25690
$ git pull https://git.openjdk.org/jdk.git pull/25690/head

Using Skara CLI tools

Checkout this PR locally:
$ git pr checkout 25690

View PR using the GUI difftool:
$ git pr show -t 25690

Using diff file

Download this PR as a diff file:
https://git.openjdk.org/jdk/pull/25690.diff

Using Webrev

Link to Webrev Comment

@bridgekeeper
Copy link

bridgekeeper bot commented Jun 9, 2025

👋 Welcome back jpai! A progress list of the required criteria for merging this PR into master will be added to the body of your pull request. There are additional pull request commands available for use with this pull request.

@openjdk
Copy link

openjdk bot commented Jun 9, 2025

@jaikiran This change is no longer ready for integration - check the PR body for details.

@openjdk openjdk bot added the rfr Pull request is ready for review label Jun 9, 2025
@openjdk
Copy link

openjdk bot commented Jun 9, 2025

@jaikiran The following label will be automatically applied to this pull request:

  • net

When this pull request is ready to be reviewed, an "RFR" email will be sent to the corresponding mailing list. If you would like to change these labels, use the /label pull request command.

@mlbridge
Copy link

mlbridge bot commented Jun 9, 2025

Webrevs

@@ -621,6 +621,12 @@ public void connect(SocketAddress endpoint) throws IOException {
* {@code SocketException} with the interrupt status set.
* </ol>
*
* @apiNote Establishing a TCP/IP connection is subject to connection timeout settings
* in the operating system. The typical timeout is 60 seconds. If the operating system
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
* in the operating system. The typical timeout is 60 seconds. If the operating system
* in the operating system. The typical operating system timeout is 60 seconds. If the operating system

I would suggest repeating "operating system timeout" here too, to remove confusion with the simple API timeout which also appears later in this paragraph.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done

Copy link

@msheppar msheppar Jun 9, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

FWIW:
Stating a typical value of 60 seconds timeout can lead to a misconception or set an expectation ... From from TCP standards and depending on which literature you read (OS docs or unix networking socket programming) then 75 secs should be a more typical default

I think the 60 seconds comes from a perceived setting on linux. For example if a linux config of
net.ipv4.tcp_syn_retries = 6 is set and the RTO == 1 sec, with a backoff policy of doubling the timeout each retry, then the connect timeout would expect to be 63 secs

It would be better to say that, the value is OS dependent, influenced by OS network setting relating to syn receive timeouts and the number of syn retries, and governed by the TCP retransmission timer implementation, rather than stating a particular value.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hello Mark,
Alan's thought was that it might be OK to have that sentence about the typical 60 second timeout. The primary guidance to developers here is that "The {@code timeout} specified to this method is typically a timeout value that is shorter than the operating system timeout." so that they set a lower value when appropriate.

Alan @AlanBateman, do you suggest we continue with this text or would any update be necessary?

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think it is an unnecessary quantification, is somewhat inaccurate, and set an expectation of a developer that this is gospel or axiomatic. Indicating that it is OS dependent should be sufficient.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Alan @AlanBateman, do you suggest we continue with this text or would any update be necessary?

I think it's helpful here to give some indication in this API note as what the timeout might be. It doesn't really matter if it says 60s or 75s, the point is that establishing a TCP connection is subject to a timeout imposed by the operating system. It helps for cases where someone calls connect with a timeout of say 300_000 (5 minutes) and is surprised to get ConnectException "Operation timed out" after a minute or so. This is exactly what prompted JDK-7116990, someone called connect with a timeout that is larger than the OS configured timeout.

Copy link

@msheppar msheppar Jun 11, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looking at this on the 3 main OS platforms (Windows, OL and macOS) and running a simple test to trigger a connect timeout then finding are

macOS the connect timeout is 75 seconds as this is derived from 4.3 BSD

If OL linux has a syn retries set to 6 and an RTO == 1 sec then it’s connect timeout is 130 seconds approx
That’s initial syn timeout of 1 seconds and then 6 retries totalling 127 seconds

Windows by default connect timeout is 21 seconds
RTT == 3 seconds and two connect retries
Initial syn timeout 3 seconds + 2 retries at 6 and 12 seconds == 21 seconds

Copy link

@msheppar msheppar Jun 12, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

FWIW an possible alternative wording

Establishing a TCP/IP connection is subject to a connect timeout, determined by configuration settings
of the operating system, for example the number of connect retries together with a TCP/IP implementation’s back off strategy. Thus, TCP/IP Connect timeout values can vary across operating system, for example 75 seconds on macOS, with a default value of 21 seconds on Windows system. As such, the operating system TCP/IP Connect timeout may expire before that specified to the connect method. If the operating system Connect timeout expires
before the {@code timeout} specified to this method then an {@code IOException} is thrown. The {@code timeout}
specified to this method is typically a timeout value that is shorter than the operating system timeout.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hello Mark, I discussed this with Alan and based on those discussions I have now reworded that sentence to make it clear that 60 second isn't the only typical timeout and at the same time keep the text concise.

Copy link
Contributor

@AlanBateman AlanBateman left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Good

@openjdk openjdk bot added the ready Pull request is ready to be integrated label Jun 9, 2025
Copy link
Member

@dfuch dfuch left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Not sure this requires a CSR. It might - if only for the sake of clarifying expectations for JCK too.

@jaikiran
Copy link
Member Author

/csr

@openjdk openjdk bot added the csr Pull request needs approved CSR before integration label Jun 11, 2025
@openjdk
Copy link

openjdk bot commented Jun 11, 2025

@jaikiran has indicated that a compatibility and specification (CSR) request is needed for this pull request.

@jaikiran please create a CSR request for issue JDK-7116990 with the correct fix version. This pull request cannot be integrated until the CSR request is approved.

@openjdk openjdk bot removed the ready Pull request is ready to be integrated label Jun 11, 2025
@jaikiran
Copy link
Member Author

The CSR is now ready for review https://bugs.openjdk.org/browse/JDK-8359249

@AlanBateman
Copy link
Contributor

The CSR is now ready for review https://bugs.openjdk.org/browse/JDK-8359249

It's okay to have a CSR but no usually needed for API notes.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
csr Pull request needs approved CSR before integration net [email protected] rfr Pull request is ready for review
Development

Successfully merging this pull request may close these issues.

4 participants