elohmeier (Contributor) commented Sep 17, 2025

Summary

This PR adds the ability to expose Vector's component topology graph as metrics through the internal_metrics source. Each connection between components is exposed as a component_connections gauge metric with labels indicating the source and target components.

Using this data, users can plot the topology together with the existing metrics.

I've tested this with a Prometheus instance scraping a prometheus_exporter sink (see the Vector configuration below) and visualized the result in Grafana's Node Graph panel. The screenshots below show the example configuration in Grafana.

[Three screenshots of the example configuration in Grafana, taken 2025-09-17]

Vector configuration

expire_metrics_secs = 60

[api]
enabled = true
address = "0.0.0.0:8686"

# Sources
[sources.demo]
type = "demo_logs"
format = "json"
interval = 1.0

[sources.pushgateway]
type = "prometheus_pushgateway"
address = "0.0.0.0:9091"
aggregate_metrics = true

[sources.internal_metrics]
type = "internal_metrics"

# Transforms
[transforms.parse_logs]
type = "remap"
inputs = ["demo"]
source = '''
.parsed = parse_json!(.message)
.level = .parsed.level
.service = .parsed.service || "unknown"
'''

[transforms.filter_errors]
type = "filter"
inputs = ["parse_logs"]
condition = '.level == "error" || .level == "ERROR"'

[transforms.filter_warnings]
type = "filter"
inputs = ["parse_logs"]
condition = '.level == "warn" || .level == "WARN" || .level == "warning" || .level == "WARNING"'

[transforms.route_by_level]
type = "route"
inputs = ["parse_logs"]

[transforms.route_by_level.route]
error = '.level == "error" || .level == "ERROR"'
warning = '.level == "warn" || .level == "WARN"'
info = '.level == "info" || .level == "INFO"'
debug = '.level == "debug" || .level == "DEBUG"'

[transforms.log_to_metric]
type = "log_to_metric"
inputs = ["parse_logs"]

[[transforms.log_to_metric.metrics]]
type = "counter"
field = "message"
name = "log_count"
tags = { level = "{{level}}", service = "{{service}}" }

[[transforms.log_to_metric.metrics]]
type = "histogram"
field = "parsed.response_time"
name = "response_time_ms"
tags = { service = "{{service}}" }

[transforms.aggregate_metrics]
type = "aggregate"
inputs = ["log_to_metric"]
interval_ms = 10000

# Sinks
[sinks.console]
type = "console"
inputs = ["parse_logs"]
encoding.codec = "text"

[sinks.console_errors]
type = "console"
inputs = ["route_by_level.error", "filter_errors"]
encoding.codec = "json"
encoding.json.pretty = true

[sinks.console_warnings]
type = "console"
inputs = ["route_by_level.warning", "filter_warnings"]
encoding.codec = "json"
encoding.json.pretty = true

[sinks.prometheus]
type = "prometheus_exporter"
address = "0.0.0.0:9102"
inputs = ["aggregate_metrics", "pushgateway", "internal_metrics"]

This is the new data returned by the prometheus_exporter for the above config:

# HELP vector_component_connections component_connections
# TYPE vector_component_connections gauge
vector_component_connections{from_component_id="parse_logs",from_component_kind="transform",from_component_type="remap",to_component_id="route_by_level",to_component_kind="transform",to_component_type="route"} 1 1758086163548
vector_component_connections{from_component_id="parse_logs",from_component_kind="transform",from_component_type="remap",to_component_id="log_to_metric",to_component_kind="transform",to_component_type="log_to_metric"} 1 1758086163548
vector_component_connections{from_component_id="parse_logs",from_component_kind="transform",from_component_type="remap",to_component_id="console",to_component_kind="sink",to_component_type="console"} 1 1758086163548
vector_component_connections{from_component_id="aggregate_metrics",from_component_kind="transform",from_component_type="aggregate",to_component_id="prometheus",to_component_kind="sink",to_component_type="prometheus_exporter"} 1 1758086163548
vector_component_connections{from_component_id="pushgateway",from_component_kind="source",from_component_type="prometheus_pushgateway",to_component_id="prometheus",to_component_kind="sink",to_component_type="prometheus_exporter"} 1 1758086163548
vector_component_connections{from_component_id="internal_metrics",from_component_kind="source",from_component_type="internal_metrics",to_component_id="prometheus",to_component_kind="sink",to_component_type="prometheus_exporter"} 1 1758086163548
vector_component_connections{from_component_id="route_by_level",from_component_kind="transform",from_component_type="route",from_output="warning",to_component_id="console_warnings",to_component_kind="sink",to_component_type="console"} 1 1758086163548
vector_component_connections{from_component_id="filter_warnings",from_component_kind="transform",from_component_type="filter",to_component_id="console_warnings",to_component_kind="sink",to_component_type="console"} 1 1758086163548
vector_component_connections{from_component_id="parse_logs",from_component_kind="transform",from_component_type="remap",to_component_id="filter_warnings",to_component_kind="transform",to_component_type="filter"} 1 1758086163548
vector_component_connections{from_component_id="log_to_metric",from_component_kind="transform",from_component_type="log_to_metric",to_component_id="aggregate_metrics",to_component_kind="transform",to_component_type="aggregate"} 1 1758086163548
vector_component_connections{from_component_id="demo",from_component_kind="source",from_component_type="demo_logs",to_component_id="parse_logs",to_component_kind="transform",to_component_type="remap"} 1 1758086163548
vector_component_connections{from_component_id="parse_logs",from_component_kind="transform",from_component_type="remap",to_component_id="filter_errors",to_component_kind="transform",to_component_type="filter"} 1 1758086163548
vector_component_connections{from_component_id="route_by_level",from_component_kind="transform",from_component_type="route",from_output="error",to_component_id="console_errors",to_component_kind="sink",to_component_type="console"} 1 1758086163548
vector_component_connections{from_component_id="filter_errors",from_component_kind="transform",from_component_type="filter",to_component_id="console_errors",to_component_kind="sink",to_component_type="console"} 1 1758086163548
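
As a rough illustration of how this output could be consumed outside Grafana, the sketch below parses `vector_component_connections` sample lines into label dictionaries and builds an adjacency map. The function names `parse_connections` and `adjacency` are hypothetical helpers invented for this example, not part of Vector or this PR, and the parser is deliberately simple: it leaves escape sequences inside label values as they are.

```python
import re

# Matches one label pair such as from_component_id="parse_logs".
# Escaped characters inside values are kept as-is (not unescaped).
LABEL_RE = re.compile(r'(\w+)="((?:[^"\\]|\\.)*)"')


def parse_connections(text, metric="vector_component_connections"):
    """Return one dict of labels per sample line of the given metric."""
    edges = []
    for line in text.splitlines():
        line = line.strip()
        if not line.startswith(metric + "{"):
            continue  # skips blank lines, # HELP / # TYPE comments, other metrics
        label_block = line[len(metric) + 1 : line.rindex("}")]
        edges.append(dict(LABEL_RE.findall(label_block)))
    return edges


def adjacency(edges):
    """Map each upstream component ID to a sorted list of downstream IDs."""
    graph = {}
    for e in edges:
        graph.setdefault(e["from_component_id"], set()).add(e["to_component_id"])
    return {src: sorted(dsts) for src, dsts in graph.items()}


# Two sample lines copied from the exporter output above.
SAMPLE = '''\
# HELP vector_component_connections component_connections
# TYPE vector_component_connections gauge
vector_component_connections{from_component_id="demo",from_component_kind="source",from_component_type="demo_logs",to_component_id="parse_logs",to_component_kind="transform",to_component_type="remap"} 1 1758086163548
vector_component_connections{from_component_id="route_by_level",from_component_kind="transform",from_component_type="route",from_output="error",to_component_id="console_errors",to_component_kind="sink",to_component_type="console"} 1 1758086163548
'''

if __name__ == "__main__":
    for src, dsts in adjacency(parse_connections(SAMPLE)).items():
        print(src, "->", ", ".join(dsts))
```

Note that the `from_output` label only appears on connections that originate from a named output, such as the routes of `route_by_level`, so consumers should treat it as optional.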

How did you test this PR?

I've tested this using the above setup with Grafana and Prometheus. I've also added unit tests and ran them. In addition, I manually verified that --watch-config works with this change: the prometheus_exporter output updated the component_connections metrics after a config reload.

Change Type

  • Bug fix
  • New feature
  • Non-functional (chore, refactoring, docs)
  • Performance

Is this a breaking change?

  • Yes
  • No

Does this PR include user facing changes?

  • Yes. Please add a changelog fragment based on our guidelines.
  • No. A maintainer will apply the no-changelog label to this PR.

References

Notes

  • Please read our Vector contributor resources.
  • Do not hesitate to use @vectordotdev/vector to reach out to us regarding this PR.
  • Some CI checks run only after we manually approve them.
    • We recommend adding a pre-push hook, please see this template.
    • Alternatively, we recommend running the following locally before pushing to the remote branch:
      • make fmt
      • make check-clippy (if there are failures it's possible some of them can be fixed with make clippy-fix)
      • make test
  • After a review is requested, please avoid force pushes to help us review incrementally.
    • Feel free to push as many commits as you want. They will be squashed into one before merging.
    • For example, you can run git merge origin master and git push.
  • If this PR changes Vector's dependencies (modifies Cargo.lock), please run make build-licenses to regenerate the license inventory and commit the changes (if any). More details here.

elohmeier requested a review from a team as a code owner September 17, 2025 05:12
github-actions bot added the labels domain: topology (anything related to Vector's topology code) and domain: sources (anything related to Vector's sources) Sep 17, 2025
Adds the ability to expose Vector's component topology graph as
metrics through the internal_metrics source. Each connection
between components is represented as a component_connections gauge
metric with labels for source and target components.
elohmeier (Contributor, Author)

Fixed the clippy issues.

pront (Member) commented Sep 17, 2025

Hi @elohmeier, thank you for taking the time to prepare this PR. I don't see an issue associated with it, and I think the idea needs a bit more discussion. Also, have you used Vector's graph command? It can output the topology in well-known formats.

elohmeier (Contributor, Author)

Hi @pront, thanks for reviewing! The graph command provides static exports, but my use case requires real-time topology monitoring alongside metrics. In production, I need to correlate topology structure with throughput/bottlenecks in Grafana dashboards (see screenshots).

Converting graph output to a line-based format (like CSV) and ingesting it separately from metrics adds complexity. This PR makes topology data available directly in the metrics store where it:

  • Updates automatically with config reloads
  • Displays in Grafana Node Graph with metrics overlay
  • Requires zero additional infrastructure

Would you prefer I open an issue to discuss the approach? The implementation is minimal - just exposing the topology data Vector already maintains internally.
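
To make the Node Graph use case more concrete, here is a minimal sketch of how the connection labels could be reshaped into the two tables the Grafana Node Graph panel works with: a nodes table keyed by `id`, and an edges table with `id`, `source` and `target`. The function `to_node_graph` and the choice of what goes into `title`, `subtitle` and `mainstat` are assumptions made for this illustration. It is not the dashboard setup shown in the screenshots.

```python
def to_node_graph(edges):
    """Build (nodes, edge_rows) from component_connections label dicts.

    Each input dict carries the labels of one component_connections sample.
    """
    nodes = {}
    edge_rows = []
    for e in edges:
        # Register both endpoints once, keyed by component ID.
        for side in ("from", "to"):
            cid = e[side + "_component_id"]
            kind = e[side + "_component_kind"]
            ctype = e[side + "_component_type"]
            nodes.setdefault(cid, {"id": cid, "title": cid, "subtitle": kind + "/" + ctype})

        src = e["from_component_id"]
        dst = e["to_component_id"]
        output = e.get("from_output", "")  # only set for named outputs
        edge_id = f"{src}.{output}->{dst}" if output else f"{src}->{dst}"
        edge_rows.append({"id": edge_id, "source": src, "target": dst, "mainstat": output})
    return list(nodes.values()), edge_rows


# Label sets taken from two of the connections in the example output.
EDGES = [
    {
        "from_component_id": "demo", "from_component_kind": "source",
        "from_component_type": "demo_logs",
        "to_component_id": "parse_logs", "to_component_kind": "transform",
        "to_component_type": "remap",
    },
    {
        "from_component_id": "parse_logs", "from_component_kind": "transform",
        "from_component_type": "remap",
        "to_component_id": "console", "to_component_kind": "sink",
        "to_component_type": "console",
    },
]

if __name__ == "__main__":
    nodes, rows = to_node_graph(EDGES)
    for n in nodes:
        print("node", n["id"], n["subtitle"])
    for r in rows:
        print("edge", r["id"])
```

Including the output name in the edge ID keeps edges distinct when a component such as a route transform feeds the same target through more than one output.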

johnhtodd

This is quite interesting, especially because of the dynamic nature of the flows ("route" and other non-static methods change the path based on events). It almost seems like some basic dashboards should be included in the distribution in a contrib directory as well. These are complicated concepts, and having an example included with the code would be useful. It would be really interesting to see path coloration or path arrow size based on the number of events as another demonstration dashboard, but I'm not familiar enough with the Node Graph output in Grafana to know if that's possible.
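
One way the "arrow size by number of events" idea might be prototyped is to join each connection with an event count for its upstream output and derive a relative width from it. The sketch below assumes such a per-output count is available from somewhere (for example from Vector's existing component metrics) and that every consumer of an output sees all of that output's events. Both are assumptions of this example, the `edge_stats` helper is hypothetical, and the counts are invented purely for illustration. Whether Grafana can render the resulting width is not addressed here.

```python
def edge_stats(edges, sent_events):
    """Attach an event count and a relative width (0.0 to 1.0) to every edge.

    `sent_events` maps (component_id, output_name) to an event count. The
    unnamed default output uses "" as its output_name, which is a convention
    of this sketch only.
    """
    rows = []
    for e in edges:
        key = (e["from_component_id"], e.get("from_output", ""))
        rows.append({
            "source": e["from_component_id"],
            "target": e["to_component_id"],
            "events": sent_events.get(key, 0),  # unknown outputs count as 0
        })
    peak = max((r["events"] for r in rows), default=0)
    for r in rows:
        # Scale against the busiest edge so widths are comparable.
        r["width"] = r["events"] / peak if peak else 0.0
    return rows


# Connections from the example topology (labels reduced to what is needed).
EDGES = [
    {"from_component_id": "parse_logs", "to_component_id": "console"},
    {"from_component_id": "route_by_level", "from_output": "error",
     "to_component_id": "console_errors"},
]

# Invented counts, for illustration only.
SENT = {("parse_logs", ""): 1000, ("route_by_level", "error"): 250}

if __name__ == "__main__":
    for row in edge_stats(EDGES, SENT):
        print(row["source"], "->", row["target"], row["events"], row["width"])
```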
