
Use label values API for efficient wildcard resource discovery #110

Open
cnewkirk wants to merge 4 commits into OpenNMS:master from cnewkirk:feature/label-values-discovery

@cnewkirk
Contributor

Summary

  • Label values discovery: a two-phase approach that first enumerates resourceIds via /api/v1/label/resourceId/values, then retrieves metrics with batched /series queries that use exact-match alternation. Configurable via useLabelValuesForDiscovery (default false) and discoveryBatchSize (default 50).
  • E2E test harness (e2e/): 45 smoke tests across 12 sections, supporting both Prometheus and Thanos backends via docker-compose profiles. Single-command orchestration via run-e2e.sh.
  • GitHub Actions CI: runs E2E against both backends on every PR, with failure log collection.
  • CLAUDE.md: project conventions and E2E testing guardrails.
  • AGPL v3 license headers added to new Java files.
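
For reviewers who want a concrete picture of the second phase, here is a minimal sketch of how enumerated resourceIds might be split into batches and turned into exact-match alternation selectors. The class and method names (`DiscoveryBatches`, `partition`, `escapeRegex`, `buildSelector`) are hypothetical and not taken from this PR's diff. The sketch only assumes that Prometheus regex matchers are fully anchored, so an alternation of escaped literals matches exactly the listed values.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical helper for illustration only; not the PR's actual code. */
final class DiscoveryBatches {

    private DiscoveryBatches() {}

    /** Splits the resourceIds into consecutive batches of at most batchSize. */
    static List<List<String>> partition(List<String> ids, int batchSize) {
        if (batchSize <= 0) {
            throw new IllegalArgumentException("batchSize must be positive");
        }
        List<List<String>> batches = new ArrayList<>();
        for (int i = 0; i < ids.size(); i += batchSize) {
            int end = Math.min(i + batchSize, ids.size());
            batches.add(new ArrayList<>(ids.subList(i, end)));
        }
        return batches;
    }

    /** Prefixes regex metacharacters with a backslash so an id matches only itself. */
    static String escapeRegex(String id) {
        StringBuilder sb = new StringBuilder();
        for (char c : id.toCharArray()) {
            if ("\\.+*?()|[]{}^$".indexOf(c) >= 0) {
                sb.append('\\');
            }
            sb.append(c);
        }
        return sb.toString();
    }

    /** Builds a selector such as {resourceId=~"a|b"} for one batch. */
    static String buildSelector(List<String> batch) {
        List<String> escaped = new ArrayList<>();
        for (String id : batch) {
            escaped.add(escapeRegex(id));
        }
        String regex = String.join("|", escaped);
        // Escape backslashes and quotes for the double-quoted label matcher string.
        String literal = regex.replace("\\", "\\\\").replace("\"", "\\\"");
        return "{resourceId=~\"" + literal + "\"}";
    }
}
```

With discoveryBatchSize at its default of 50, 120 enumerated resourceIds would produce three selectors (50, 50 and 20 ids). Each selector would presumably be sent to /series as a match[] parameter, though the exact request shape used by this PR is not shown here.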

Test plan

  • Unit tests pass (14/14): mvn test
  • E2E passes against Thanos: ./e2e/run-e2e.sh --backend thanos
  • E2E passes against Prometheus: ./e2e/run-e2e.sh --backend prometheus
  • GitHub Actions CI runs on this PR (both backends)

Chance Newkirk added 4 commits March 15, 2026 18:03
Two-phase discovery: /api/v1/label/resourceId/values for resourceId
enumeration, then batched /series queries with exact-match alternation
for metric retrieval. Configurable via useLabelValuesForDiscovery
(default false) and discoveryBatchSize (default 50).

Includes E2E test harness (45 tests, Prometheus + Thanos profiles),
architecture docs, and AGPL license headers on new files.
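
As a rough illustration of the two options and their stated defaults, the sketch below parses them from a plain string map. `DiscoverySettings` and `fromProperties` are hypothetical names. How the plugin actually binds these properties (for example through its OSGi blueprint) is an assumption not confirmed by this commit message.

```java
import java.util.Map;

/** Hypothetical settings holder for illustration; not the PR's actual class. */
final class DiscoverySettings {

    // Defaults as stated in the commit message.
    static final boolean DEFAULT_USE_LABEL_VALUES = false;
    static final int DEFAULT_BATCH_SIZE = 50;

    final boolean useLabelValuesForDiscovery;
    final int discoveryBatchSize;

    private DiscoverySettings(boolean useLabelValuesForDiscovery, int discoveryBatchSize) {
        this.useLabelValuesForDiscovery = useLabelValuesForDiscovery;
        this.discoveryBatchSize = discoveryBatchSize;
    }

    /** Reads both options from a property map, falling back to the defaults when a key is absent. */
    static DiscoverySettings fromProperties(Map<String, String> props) {
        String use = props.get("useLabelValuesForDiscovery");
        String batch = props.get("discoveryBatchSize");

        boolean useLabelValues =
                use == null ? DEFAULT_USE_LABEL_VALUES : Boolean.parseBoolean(use.trim());
        int batchSize =
                batch == null ? DEFAULT_BATCH_SIZE : Integer.parseInt(batch.trim());

        // Rejecting non-positive sizes is a choice made for this sketch.
        if (batchSize < 1) {
            throw new IllegalArgumentException("discoveryBatchSize must be at least 1");
        }
        return new DiscoverySettings(useLabelValues, batchSize);
    }
}
```

Because the feature defaults to off, an empty property map leaves the existing discovery path in place. Setting useLabelValuesForDiscovery to true opts in to the two-phase approach.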

Codifies guardrails against recurring regressions: pinned image
versions, no SNAPSHOT testing, observable-behavior assertions,
30s collection intervals, and scope discipline.

- run-e2e.sh: single command for build, deploy, start, test, teardown
- e2e.yml: runs 45 smoke tests on PR against both Prometheus and Thanos
- CLAUDE.md updated with automation docs

- Correct writeTimeoutInMs/readTimeoutInMs examples: 1000 -> 5000 (actual defaults)
- Rename bulkheadMaxWaitDurationInMs -> bulkheadMaxWaitDuration to match OSGi property name in blueprint.xml
- Remove stale integration test count (~15 -> varies)