Avoid socket telemetry errors for pending nonblocking connects#129259
Avoid socket telemetry errors for pending nonblocking connects#129259tulior wants to merge 5 commits into
Conversation
There was a problem hiding this comment.
Pull request overview
Note
Copilot was unable to run its full agentic suite in this review.
This PR updates sockets telemetry to avoid treating non-blocking Connect “pending” results (WouldBlock/InProgress) as failures, and adds a regression test ensuring the emitted connect Activity is not marked as an error in that scenario.
Changes:
- Adjust
SocketsTelemetry.AfterConnectto skip error status/tagging and failure counting for non-blocking connect pending errors. - Add a functional test covering non-blocking connects that initially return pending but ultimately succeed.
Reviewed changes
Copilot reviewed 2 out of 2 changed files in this pull request and generated 2 comments.
| File | Description |
|---|---|
| src/libraries/System.Net.Sockets/tests/FunctionalTests/TelemetryTest.cs | Adds a new test validating telemetry Activity is not marked error for pending non-blocking connect. |
| src/libraries/System.Net.Sockets/src/System/Net/Sockets/SocketsTelemetry.cs | Avoids reporting WouldBlock/InProgress as connect failures in telemetry. |
13d0f09 to
c06bcd5
Compare
c06bcd5 to
c92a337
Compare
Non-blocking Socket.Connect can return WouldBlock on Windows or InProgress on Unix while the OS connect attempt remains pending. Treat those results as pending telemetry states instead of failures. Leave the emitted connect Activity status unset, omit error.type, and avoid emitting ConnectFailed for these pending results. Add a regression test for the real synchronous Connect path so the pending result is verified through the public socket API.
9bea8f0 to
52e7103
Compare
|
Could someone from @dotnet/ncl please take a look when you have a chance? @karelz, tagging based on |
| await RemoteExecutor.Invoke(static () => | ||
| { | ||
| using Socket server = new Socket(AddressFamily.InterNetwork, SocketType.Stream, ProtocolType.Tcp); | ||
| server.BindToAnonymousPort(IPAddress.Loopback); | ||
| server.Listen(); | ||
|
|
||
| using Socket client = new Socket(AddressFamily.InterNetwork, SocketType.Stream, ProtocolType.Tcp) | ||
| { | ||
| Blocking = false | ||
| }; | ||
|
|
||
| using ActivityRecorder recorder = new ActivityRecorder(ActivitySourceName, ActivityName); | ||
|
|
||
| // A non-blocking connect reports WouldBlock (Windows) or InProgress (Unix) by design while | ||
| // the attempt continues in the background. The "socket connect" activity is started and | ||
| // stopped synchronously inside Connect, so its final state can be asserted immediately. | ||
| SocketException ex = Assert.Throws<SocketException>(() => client.Connect(server.LocalEndPoint)); | ||
| Assert.True(ex.SocketErrorCode is SocketError.WouldBlock or SocketError.InProgress, | ||
| $"Unexpected SocketError: {ex.SocketErrorCode}"); | ||
|
|
||
| recorder.VerifyActivityRecorded(1); | ||
| Activity activity = recorder.LastFinishedActivity; | ||
| VerifyTcpConnectActivity(activity, (IPEndPoint)server.LocalEndPoint, ipv6: false); | ||
| Assert.Equal(ActivityStatusCode.Unset, activity.Status); | ||
| Assert.Null(activity.GetTagItem("error.type")); | ||
| }).DisposeAsync(); | ||
| } |
| SocketException ex = Assert.Throws<SocketException>(() => client.Connect(server.LocalEndPoint)); | ||
| Assert.True(ex.SocketErrorCode is SocketError.WouldBlock or SocketError.InProgress, | ||
| $"Unexpected SocketError: {ex.SocketErrorCode}"); |
| [ConditionalFact(typeof(RemoteExecutor), nameof(RemoteExecutor.IsSupported))] | ||
| public async Task Connect_NonBlockingPending_ActivityNotMarkedAsError() | ||
| { | ||
| await RemoteExecutor.Invoke(static () => | ||
| { | ||
| using Socket server = new Socket(AddressFamily.InterNetwork, SocketType.Stream, ProtocolType.Tcp); | ||
| server.BindToAnonymousPort(IPAddress.Loopback); | ||
| server.Listen(); | ||
|
|
||
| using Socket client = new Socket(AddressFamily.InterNetwork, SocketType.Stream, ProtocolType.Tcp) | ||
| { | ||
| Blocking = false | ||
| }; | ||
|
|
||
| using ActivityRecorder recorder = new ActivityRecorder(ActivitySourceName, ActivityName); | ||
|
|
||
| // A non-blocking connect reports WouldBlock (Windows) or InProgress (Unix) by design while | ||
| // the attempt continues in the background. The "socket connect" activity is started and | ||
| // stopped synchronously inside Connect, so its final state can be asserted immediately. | ||
| SocketException ex = Assert.Throws<SocketException>(() => client.Connect(server.LocalEndPoint)); | ||
| Assert.True(ex.SocketErrorCode is SocketError.WouldBlock or SocketError.InProgress, | ||
| $"Unexpected SocketError: {ex.SocketErrorCode}"); | ||
|
|
||
| recorder.VerifyActivityRecorded(1); | ||
| Activity activity = recorder.LastFinishedActivity; | ||
| VerifyTcpConnectActivity(activity, (IPEndPoint)server.LocalEndPoint, ipv6: false); | ||
| Assert.Equal(ActivityStatusCode.Unset, activity.Status); | ||
| Assert.Null(activity.GetTagItem("error.type")); | ||
| }).DisposeAsync(); |
|
Thanks for your contribution @tulior! I'm inclined to merge this as-is, but CLA (Contributor License Agreement see: #129259 (comment)) is required, can you take a look on it? |
Absolutely. Happy to be a part of it. @dotnet-policy-service agree |
|
@dotnet-policy-service agree |
|
/azp run runtime-libraries-coreclr outerloop |
|
Azure Pipelines successfully started running 1 pipeline(s). |
Fixes #129252.
Non-blocking Socket.Connect can report WouldBlock on Windows or InProgress on Unix while the OS connect attempt remains pending. Sockets telemetry treated that result as a failure even though Connect has only reported a pending state.
This keeps the connect Activity, but leaves the status unset and omits error.type for pending results. It also suppresses the corresponding ConnectFailed EventSource event for the same pending states. Other SocketError values keep the existing failure path.
The EventSource part is included because it is the same pending-result case at the event layer. It can be separated if reviewers prefer to keep this PR scoped only to the experimental Activity surface.
This does not change the existing counter limitation: a pending connect that later succeeds is still not counted in outgoing-connections-established. Before this change it was not counted either; the pending result was reported as a failure instead.
The regression test covers the real synchronous Socket.Connect path for a non-blocking loopback connect. It asserts immediately after WouldBlock or InProgress because the Activity starts and stops inside Connect, so no later network completion is needed to verify the telemetry state.
Validation run locally on Windows:
dotnet.cmd build /t:Test /p:XunitMethodName=System.Net.Sockets.Tests.TelemetryTest.Connect_NonBlockingPending_ActivityNotMarkedAsErrordotnet.cmd build /t:Test /p:XUnitClassName=System.Net.Sockets.Tests.TelemetryTest