[BUG]: Agent stuck in the Post-Jobs phase when running on a Windows container in AKS #5218

Open
@cata008

What happened?

The agent gets stuck in the post-job phase. The worker logs show intermittent socket timeouts when it tries to connect to Azure DevOps (see the log output).

Running Test-NetConnection confirms the intermittent network issues.
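
For anyone who wants to repeat the check over several attempts, here is a minimal sketch of a repeated TCP connect probe. It is written in Python instead of the Test-NetConnection cmdlet used above, and the function name `probe` and its default values are assumptions made for illustration, not part of the agent or of the original test.

```python
import socket
import time


def probe(host, port=443, attempts=5, timeout=3.0, delay=1.0):
    """Try a TCP connect `attempts` times and return one bool per attempt."""
    results = []
    for i in range(attempts):
        try:
            # create_connection resolves the name and performs the TCP handshake
            with socket.create_connection((host, port), timeout=timeout):
                results.append(True)
        except OSError:
            # covers timeouts, refused connections, and DNS resolution failures
            results.append(False)
        if i < attempts - 1:
            # pause between attempts so intermittent failures have time to show up
            time.sleep(delay)
    return results
```

Calling `probe("vsblobprodsu6weu.vsblob.visualstudio.com")` from inside the agent container and from the test container would give a per-attempt success list to compare. A mix of `True` and `False` values would match the intermittent behaviour described here.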

A test container that uses the same base image, deployed in the same namespace, does not reproduce the issue, so it does not appear to be namespace-related.

Killing the Agent.Listener process unblocks the connection to vsblobprodsu6weu.vsblob.visualstudio.com.

We also run agent version 4.252.0 on VMSS and do not experience any issues there. Any idea what could cause this?

Versions

Agent version: 4.255.0 & 4.252.0
Container base image: mcr.microsoft.com/windows/servercore:ltsc2022

Environment type (Please select at least one environment where you face this issue)

  • Self-Hosted
  • Microsoft Hosted
  • VMSS Pool
  • Container

Azure DevOps Server type

dev.azure.com (formerly visualstudio.com)

Azure DevOps Server Version (if applicable)

No response

Operating system

Windows Server 2022 Datacenter

Version control system

Azure Git

Relevant log output

Attempt 1 of GET request to https://spsprodweu4.vssps.visualstudio.com/../_apis/connectionData failed (Socket Error: TimedOut). The operation will be retried in 11.0747296 seconds.
Attempt 1 of OPTIONS request to https://vsblobprodsu6weu.vsblob.visualstudio.com/../_apis/ failed (Socket Error: TimedOut). The operation will be retried in 11.126801 seconds.
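
To see which hosts fail most often across a longer worker log, the retry lines can be tallied. The following is a rough sketch only: the regular expression is modeled on the two lines above and may not match other agent log formats, and `count_failures` is a hypothetical helper name.

```python
import re
from collections import Counter

# Pattern modeled on the two retry lines above; other log formats are not covered.
RETRY_LINE = re.compile(
    r"Attempt (?P<attempt>\d+) of (?P<method>[A-Z]+) request to "
    r"https?://(?P<host>[^/\s]+)\S* failed \((?P<error>[^)]*)\)"
)


def count_failures(lines):
    """Count failed requests per (host, error) pair in an iterable of log lines."""
    counts = Counter()
    for line in lines:
        match = RETRY_LINE.search(line)
        if match:
            counts[(match.group("host"), match.group("error"))] += 1
    return counts
```

Passing the lines of a worker log file to `count_failures` returns a `Counter` keyed by host and error text, which makes it easier to tell whether the timeouts are limited to a few endpoints.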
