This repository was archived by the owner on Feb 7, 2024. It is now read-only.

Spurious numbers in metrics #430

@mgrabovsky

In the 48 hours since the Prometheus metrics endpoint was deployed, the Grafana dashboard has revealed at least two bugs:

  1. Failed tasks often (but not always) seem to be counted twice in retrace_tasks_finished{result="fail"}.
  2. The number of running tasks (retrace_tasks_running) sporadically jumps up to wild numbers, such as 70, 18 or 39, for a few minutes at a time. The maximum allowed number of running tasks (MaxParallelTasks) is 12 on retrace.fp.org, so these numbers make no sense.
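The second symptom can be confirmed from exported samples with a simple bounds check. The following is only a minimal sketch: the function name `find_impossible_samples` and the timestamps are hypothetical, and only the limit of 12 and the values 70, 18 and 39 come from this report.

```python
# Limit taken from this report: MaxParallelTasks is 12 on the affected server.
MAX_PARALLEL_TASKS = 12


def find_impossible_samples(samples, limit=MAX_PARALLEL_TASKS):
    """Return the (timestamp, value) pairs whose value exceeds the limit.

    `samples` is an iterable of (timestamp, value) pairs for
    retrace_tasks_running. The helper itself is hypothetical and is not
    part of the project's code.
    """
    return [(ts, value) for ts, value in samples if value > limit]


# Hypothetical scrape results: the timestamps are made up for illustration;
# the values 70, 18 and 39 are the ones observed on the dashboard.
samples = [(0, 3), (60, 70), (120, 12), (180, 18), (240, 5), (300, 39)]
violations = find_impossible_samples(samples)
print(violations)
```

Any sample this check returns exceeds the configured maximum, so it presumably reflects a bookkeeping error in the metric rather than real load.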

Labels: bug, feature:metrics (Features and bugs related to the metrics and monitoring subsystem.)
