Open
Description
If there's a GraphQL outage, the bot can end up in a state where it is disconnected and won't reconnect. We need to figure out which error to look for and how to reset the connection.
Some hints from the logs:
urllib3.exceptions.MaxRetryError: HTTPSConnectionPool(host='graphql.dev.grid.tf', port=443): Max retries exceeded with url: /graphql (Caused by ResponseError('too many 503 error responses'))
requests.exceptions.ConnectTimeout: HTTPSConnectionPool(host='graphql.grid.tf', port=443): Max retries exceeded with url: /graphql (Caused by ConnectTimeoutError(<urllib3.connection.HTTPSConnection object at 0x7f2b752d8b50>, 'Connection to graphql.grid.tf timed out. (connect timeout=None)'))
gql.transport.exceptions.TransportAlreadyConnected: Transport is already connected
The catch is that all of these errors can also occur in the course of normal operation, and the bot normally recovers from them. Further investigation of the logs may help identify the root cause.
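One possible way to reset the connection, sketched here as a starting point rather than a confirmed fix: throw away the client after any failed call and build a fresh one before retrying, with a growing delay between attempts. Everything in this sketch is hypothetical and not taken from the bot's code: the `ReconnectingExecutor` class, the `make_client` factory, and the assumption that the client exposes `execute()` and optionally `close()`. For the bot, `make_client` would presumably construct a new gql client with a new transport, so that a stale transport is never reused.

```python
import time
from typing import Any, Callable, Tuple, Type


class ReconnectingExecutor:
    """Hypothetical helper: rebuilds the client after every failed call."""

    def __init__(
        self,
        make_client: Callable[[], Any],  # assumed factory returning a fresh client
        retry_on: Tuple[Type[BaseException], ...] = (Exception,),
        max_attempts: int = 5,
        base_delay: float = 1.0,
        max_delay: float = 60.0,
        sleep: Callable[[float], None] = time.sleep,
    ) -> None:
        self.make_client = make_client
        self.retry_on = retry_on
        self.max_attempts = max_attempts
        self.base_delay = base_delay
        self.max_delay = max_delay
        self.sleep = sleep
        self._client = None

    def _reset(self) -> None:
        # Drop the current client and close it if it offers close().
        client, self._client = self._client, None
        close = getattr(client, "close", None)
        if callable(close):
            try:
                close()
            except Exception:
                pass  # a failing close must not mask the original error

    def execute(self, *args: Any, **kwargs: Any) -> Any:
        delay = self.base_delay
        for attempt in range(1, self.max_attempts + 1):
            try:
                if self._client is None:
                    self._client = self.make_client()
                return self._client.execute(*args, **kwargs)
            except self.retry_on:
                # Never reuse a client that just failed.
                self._reset()
                if attempt == self.max_attempts:
                    raise
                self.sleep(delay)
                delay = min(delay * 2, self.max_delay)
```

Because the errors listed above also show up during normal operation, `retry_on` is left configurable: it could be narrowed to the specific exception types once the logs show which one precedes the stuck state.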