fix: sidecar heartbeat spams repeated errors without backoff #38

@tumberger

Problem

The sidecar heartbeat loop runs on a fixed 30s ticker and logs every failure with log.Printf:

if err := s.client.Heartbeat(ctx, s.sessionID); err != nil {
    log.Printf("sidecar: heartbeat: %v", err)
}

When the network goes down (DNS failure, WiFi switch, laptop sleep/wake), this produces the same error message every 30 seconds for the rest of the session:

2026/04/09 10:41:17 sidecar: heartbeat: unavailable: token refresh: token expired and refresh failed: oauth discovery: discovery request: Get "https://api.kontext.security/.well-known/oauth-authorization-server": dial tcp: lookup api.kontext.security: no such host
2026/04/09 10:41:47 sidecar: heartbeat: unavailable: ...
2026/04/09 10:42:17 sidecar: heartbeat: unavailable: ...

The same issue applies to ingestEvent failures at lines 122-124.

Expected behavior

  1. Exponential backoff on consecutive failures (30s → 60s → 120s → cap at 5min)
  2. Deduplicate logs — print the error on first occurrence and when it changes, not every tick
  3. Log recovery — print a success message when connectivity returns after a failure streak
  4. Reset backoff on success — return to the normal 30s interval once a heartbeat succeeds

Example output

sidecar: heartbeat failed: dial tcp: lookup api.kontext.security: no such host (retrying with backoff)
sidecar: heartbeat recovered after 3m20s
