
Refactor: replace config overlays with CLI flags, align defaults.conf #447

Merged
spreston8 merged 12 commits into rust/dev from chore/docker-config-cli-flags
Mar 26, 2026

spreston8 commented Mar 22, 2026

Summary

  • Config consolidation: Replace role-specific config files with 2 unified configs (default.conf, standalone-dev.conf). Per-role behavior controlled via CLI flags (--ceremony-master-mode, --heartbeat-disabled, --genesis-validator).
  • Compose updates: All compose files use ${VAR:-default} fallbacks, unified f1r3fly network name (name: f1r3fly), F1R3_* runtime tuning env vars in x-rnode anchor.
  • Monitoring stack alignment: Added shard-monitoring.yml as separate overlay (cAdvisor + Prometheus + Grafana). Prometheus uses DNS-based service discovery — no false DOWN targets for standalone mode. Recording rules loaded via rule_files. block-transfer.json moved to provisioning directory. f1r3node.json synced with system-integration (24KB with cAdvisor panels).
  • Defaults: synchrony-constraint-threshold changed to 0 across docker configs, defaults.conf, and documentation.
  • Docs: Root README and docker/README rewritten with monitoring section, port mapping table, compose file reference, aligned section order with Scala.
  • CI: Clones system-integration at ref: main, now that system-integration PR #35 ("Add observers node deployment and fix bods.txt/wallets.txt in Helm") has merged.
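
The `${VAR:-default}` fallback syntax used in the compose files follows the same rule as POSIX shell parameter expansion: the default applies when the variable is unset or empty. A minimal sketch of that behavior, using a hypothetical variable name `F1R3_EXAMPLE_SETTING` (not one of the variables introduced by this PR):

```shell
#!/bin/sh
# F1R3_EXAMPLE_SETTING is a hypothetical name used only for illustration.

resolve_setting() {
    # Expands to "fallback" when the variable is unset OR empty,
    # otherwise to the variable's value.
    printf '%s\n' "${F1R3_EXAMPLE_SETTING:-fallback}"
}

unset F1R3_EXAMPLE_SETTING
resolve_setting                 # prints "fallback"

F1R3_EXAMPLE_SETTING=""
resolve_setting                 # prints "fallback" (empty counts as missing)

F1R3_EXAMPLE_SETTING="custom"
resolve_setting                 # prints "custom"
```

In compose terms, exporting the variable (or setting it in `.env`) before `docker compose up` overrides the inlined default; leaving it out keeps the default.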

Related PRs

Test plan

  • docker compose -f docker/shard.yml up -d starts shard with all nodes reaching Running
  • docker compose -f docker/shard-monitoring.yml up -d starts monitoring stack
  • Prometheus targets: all running nodes UP, 0 DOWN
  • Grafana: 2 dashboards auto-provisioned (F1R3FLY Node + Block Transfer)
  • docker compose -f docker/standalone.yml up -d + monitoring: 1 UP, 0 DOWN
  • Integration tests pass on self-hosted runners (amd64 + arm64)
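
The "all running nodes UP, 0 DOWN" checks could be scripted against the Prometheus HTTP API instead of being read off the UI. A rough sketch, assuming Prometheus is reachable on its default port 9090 (the port mapping in this repo may differ) and that the response is the compact JSON Prometheus normally emits. `count_health` is a hypothetical helper, not part of this PR:

```shell
#!/bin/sh
# count_health STATE: read a Prometheus /api/v1/targets response on stdin
# and print how many targets report "health":"STATE" (up, down, unknown).
# This is a plain text match, so it relies on compact JSON with no spaces
# around the colon.
count_health() {
    grep -o "\"health\":\"$1\"" | wc -l | tr -d ' '
}

# Canned response standing in for a live stack (illustrative data only).
sample='{"status":"success","data":{"activeTargets":[{"health":"up"},{"health":"up"},{"health":"down"}]}}'

printf '%s' "$sample" | count_health up      # prints 2
printf '%s' "$sample" | count_health down    # prints 1

# Against a running monitoring stack (assumed URL):
#   curl -s http://localhost:9090/api/v1/targets | count_health down
```

A result of 0 from the `down` count would correspond to the "0 DOWN" expectation for both the shard and standalone cases.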

Co-Authored-By: Claude <noreply@anthropic.com>

Follow-up PRs

  • Validator bonding & additional validators: Support for bonding new validators and adding additional validator nodes to a running shard (validator4, etc.) is planned for a follow-up PR.
  • F1R3_* env var handling: The current approach of inlining 40+ F1R3_* env vars in compose YAML environment: sections with ${VAR:-default} fallbacks is fragile. A follow-up will evaluate whether .env should be the single source, or whether the split should instead be documented clearly.
  • Rust metric dashboard queries (f1r3node#405): Phase 1 observability gauges add 16 new Prometheus metrics that need new Grafana dashboard panels.
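
One way to size up the `.env` question is to list every inlined fallback first. The sketch below is an assumption about how that audit might be done, not something this PR ships: `list_inlined_defaults` is a hypothetical helper that scans a compose file as plain text and prints one `NAME=default` line per distinct `${F1R3_...:-default}` occurrence.

```shell
#!/bin/sh
# list_inlined_defaults FILE: print NAME=default for every
# ${F1R3_NAME:-default} occurrence in FILE, de-duplicated and sorted.
# This is a text scan, not a YAML parser; defaults containing "}" are
# not handled.
list_inlined_defaults() {
    grep -o '\${F1R3_[A-Z0-9_]*:-[^}]*}' "$1" |
        sed 's/^\${\(F1R3_[A-Z0-9_]*\):-\(.*\)}$/\1=\2/' |
        sort -u
}

# Example invocation (compose path as used in the test plan):
#   list_inlined_defaults docker/shard.yml > .env.candidate
```

The output is already in `.env` syntax, so it could serve as a starting point if `.env` becomes the single source, or as a checklist if the split is kept and documented.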

Co-Authored-By: Claude <noreply@anthropic.com>
spreston8 and others added 5 commits March 23, 2026 14:15
…ate Scala

Co-Authored-By: Claude <noreply@anthropic.com>
TODO: Revert to ref: main after system-integration PR #35 merges

Co-Authored-By: Claude <noreply@anthropic.com>
…, and docs

Co-Authored-By: Claude <noreply@anthropic.com>
- Add shard-monitoring.yml with cAdvisor + Prometheus + Grafana as a
  separate overlay compose file (monitoring was previously embedded in
  shard.yml).

- Switch Prometheus from static_configs to dns_sd_configs for node
  discovery. Only running nodes get scraped — no false DOWN targets
  for standalone mode.

- Add rule_files and cadvisor scrape job to prometheus.yml.

- Move block-transfer.json into Grafana provisioning directory so it
  is auto-discovered alongside f1r3node.json.

- Sync f1r3node.json dashboard with system-integration (24KB version
  with cAdvisor memory/CPU panels).

- Update README and docker/README with monitoring section, fix
  duplicate monitoring heading, fix compose files table.

Co-Authored-By: Claude <noreply@anthropic.com>
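
For readers unfamiliar with `dns_sd_configs`, the sketch below writes an illustrative Prometheus config fragment to a scratch file. The job name, DNS name, port, and rules path are placeholders, not the values in this PR's `prometheus.yml`. The idea is that targets come from whatever addresses the DNS name currently resolves to, so a node that is not running contributes no target, whereas a `static_configs` entry for it would be scraped and reported DOWN.

```shell
#!/bin/sh
# Write a hypothetical Prometheus config fragment to a scratch file.
# Every name, path, and port below is a placeholder for illustration.
out="${TMPDIR:-/tmp}/prometheus-dns-sd-sketch.yml"

cat > "$out" <<'EOF'
rule_files:
  - /etc/prometheus/rules/*.yml     # placeholder path for recording rules

scrape_configs:
  - job_name: rnode                 # placeholder job name
    dns_sd_configs:
      - names: ['rnode.example']    # placeholder DNS name shared by the nodes
        type: A                     # one target per resolved A record
        port: 40403                 # placeholder port; required for type A
EOF

echo "wrote $out"
```

With `type: A`, Prometheus cannot learn the port from DNS, which is why `port` has to be given explicitly.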
spreston8 and others added 3 commits March 24, 2026 19:38
Co-Authored-By: Claude <noreply@anthropic.com>
Co-Authored-By: Claude <noreply@anthropic.com>
Stop commands now include shard-monitoring.yml down before shard down,
so monitoring containers (prometheus, grafana, cadvisor) are cleaned up.

Co-Authored-By: Claude <noreply@anthropic.com>
spreston8 marked this pull request as ready for review March 25, 2026 06:01
Comment thread scripts/run-standalone-dev.sh
spreston8 and others added 3 commits March 25, 2026 21:40
…tion

- Remove monitoring teardown from shard commands block and Quick Start
  Stop (monitoring not introduced yet at those points)
- Add stop command to Monitoring section (self-contained start + stop)

Co-Authored-By: Claude <noreply@anthropic.com>
PR #35 merged — no longer need to clone the feature branch.

Co-Authored-By: Claude <noreply@anthropic.com>
Script runs the hybrid Scala+Rust node via JAR. The pure Rust node
uses `just run-standalone` instead.

Co-Authored-By: Claude <noreply@anthropic.com>
@spreston8

Merging. All review items will be addressed in follow-up PR.

spreston8 merged commit 577749d into rust/dev Mar 26, 2026
17 checks passed
spreston8 deleted the chore/docker-config-cli-flags branch March 26, 2026 06:05