Overview
Extend the Instana plugins architecture to support custom metrics collection beyond the default process monitoring metrics, with TOML-based configuration and multiple OpenTelemetry metric types.
Current Limitations
Fixed Metric Set: All services currently collect the same predefined process metrics:
- CPU/memory usage, process count, disk I/O, file descriptors, thread counts, context switches
Single Metric Type: All metrics are implemented as OpenTelemetry Observable Gauges, regardless of their semantic meaning
No Extensibility: Services cannot register application-specific or business metrics
No Configuration: The metric_types field in plugin.toml is not actively used for customization
Proposed Solution
1. TOML-Based Custom Metrics Configuration
Extend plugin.toml to support custom metric definitions:
[custom_metrics]
enabled = true
[[custom_metrics.sources]]
name = "jmx_collector"
type = "jmx"
endpoint = "localhost:9999"
metrics = [
{name = "heap_memory", otel_type = "gauge", unit = "bytes"},
{name = "thread_pool_size", otel_type = "updowncounter", unit = "1"}
]
[[custom_metrics.sources]]
name = "log_parser"
type = "log_file"
path = "/var/log/app.log"
metrics = [
{name = "error_count", otel_type = "counter", pattern = "ERROR"},
{name = "response_time", otel_type = "histogram", pattern = "Transaction.*completed in (\\d+)ms", buckets = [10, 50, 100, 500, 1000]}
]
[[custom_metrics.sources]]
name = "http_endpoint"
type = "http"
url = "http://localhost:8080/metrics"
format = "json"
interval = 30
metrics = [
{name = "active_sessions", otel_type = "gauge", json_path = "$.sessions.active"},
{name = "requests_total", otel_type = "counter", json_path = "$.requests.total"}
]
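As a rough illustration of how the configuration above might be loaded and checked, here is a minimal sketch. The function name `load_custom_metrics`, the `ALLOWED_OTEL_TYPES` set, and the rule that a missing or disabled `[custom_metrics]` section yields no sources are assumptions made for this example, not part of the existing toml_utils.py.

```python
try:
    import tomllib  # standard library on Python 3.11+
except ModuleNotFoundError:  # older interpreters: assumes the tomli backport is installed
    import tomli as tomllib

# otel_type values that appear in the proposed configuration (assumed to be the full set)
ALLOWED_OTEL_TYPES = {"counter", "updowncounter", "histogram", "gauge"}


def load_custom_metrics(toml_text: str) -> list:
    """Return validated source definitions, or [] when custom metrics are off."""
    config = tomllib.loads(toml_text).get("custom_metrics", {})
    if not config.get("enabled", False):
        return []  # section absent or disabled: existing behavior is unchanged

    sources = config.get("sources", [])
    for source in sources:
        for key in ("name", "type"):
            if key not in source:
                raise ValueError(f"source is missing required key {key!r}")
        for metric in source.get("metrics", []):
            otel_type = metric.get("otel_type")
            if otel_type not in ALLOWED_OTEL_TYPES:
                raise ValueError(
                    f"{source['name']}.{metric.get('name')}: "
                    f"unsupported otel_type {otel_type!r}"
                )
    return sources


EXAMPLE = '''
[custom_metrics]
enabled = true

[[custom_metrics.sources]]
name = "log_parser"
type = "log_file"
path = "/var/log/app.log"
metrics = [
  {name = "error_count", otel_type = "counter", pattern = "ERROR"}
]
'''

sources = load_custom_metrics(EXAMPLE)
```

Returning an empty list for a plugin.toml without the new section is one way to keep existing services unaffected by the feature.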
2. Multiple OpenTelemetry Metric Types
Support the OpenTelemetry instrument type that matches each metric's semantics, instead of reporting everything as an Observable Gauge:
Counter (monotonically increasing):
counter = self.meter.create_counter("requests_total")
counter.add(1, {"method": "GET"})
UpDownCounter (can increase/decrease):
updown = self.meter.create_up_down_counter("active_connections")
updown.add(1)   # connection opened
updown.add(-1)  # connection closed
Histogram (distributions/latencies):
histogram = self.meter.create_histogram("request_duration_ms")
histogram.record(125.3, {"endpoint": "/api/users"})
Observable Gauge (current implementation):
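# _observe_cpu is a hypothetical callback that returns Observation values
gauge = self.meter.create_observable_gauge("cpu_usage", callbacks=[self._observe_cpu])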
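One possible way to connect the `otel_type` strings from plugin.toml to these instruments is a small lookup table, sketched below. The function `create_instrument` and the `RecordingMeter` stand-in are hypothetical and exist only so the example runs without the OpenTelemetry SDK installed; the four `create_*` method names are the ones used in the snippets above.

```python
from typing import Any, Callable, Iterable, Optional

# Maps otel_type strings from the proposed TOML to Meter method names.
_FACTORY_METHODS = {
    "counter": "create_counter",
    "updowncounter": "create_up_down_counter",
    "histogram": "create_histogram",
    "gauge": "create_observable_gauge",
}


def create_instrument(
    meter: Any,
    metric: dict,
    callbacks: Optional[Iterable[Callable]] = None,
) -> Any:
    """Create the instrument named by metric["otel_type"] on the given meter."""
    otel_type = metric["otel_type"]
    method_name = _FACTORY_METHODS.get(otel_type)
    if method_name is None:
        raise ValueError(f"unsupported otel_type {otel_type!r}")
    kwargs = {"unit": metric.get("unit", "")}
    if otel_type == "gauge":
        # Observable instruments report through callbacks instead of add()/record().
        kwargs["callbacks"] = list(callbacks or [])
    return getattr(meter, method_name)(metric["name"], **kwargs)


class RecordingMeter:
    """Hypothetical stand-in for a real Meter; it only records what was requested."""

    def __init__(self) -> None:
        self.created = []

    def _record(self, kind: str, name: str, **kwargs: Any) -> str:
        self.created.append((kind, name, kwargs))
        return f"{kind}:{name}"

    def create_counter(self, name, unit="", description=""):
        return self._record("counter", name, unit=unit)

    def create_up_down_counter(self, name, unit="", description=""):
        return self._record("updowncounter", name, unit=unit)

    def create_histogram(self, name, unit="", description=""):
        return self._record("histogram", name, unit=unit)

    def create_observable_gauge(self, name, callbacks=None, unit="", description=""):
        return self._record("gauge", name, unit=unit, callbacks=callbacks)


meter = RecordingMeter()
heap = create_instrument(
    meter, {"name": "heap_memory", "otel_type": "gauge", "unit": "bytes"}
)
pool = create_instrument(
    meter, {"name": "thread_pool_size", "otel_type": "updowncounter", "unit": "1"}
)
```

The `buckets` key from the TOML example is deliberately not handled here: in the OpenTelemetry SDK, histogram bucket boundaries are usually configured through views rather than on the instrument itself, so that mapping would need its own design.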
3. Pluggable Metric Source System
Built-in Source Types:
jmx: Java Management Extensions
log_file: Log file parsing with regex patterns
http: HTTP/REST endpoint polling
database: SQL query execution
file: File-based metrics (JSON, CSV, etc.)
command: Execute shell commands
Custom Source Plugins: Allow third-party metric collectors
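To make two of these source types more concrete, the sketch below shows how the `pattern` and `json_path` keys from the TOML example could be evaluated. The helper names are hypothetical, and `resolve_json_path` handles only the simple `$.key.key` form used in the example config, not full JSONPath.

```python
import re
from typing import Any, Iterable


def count_matches(lines: Iterable[str], pattern: str) -> int:
    """Number of lines matching pattern (could feed a counter such as error_count)."""
    regex = re.compile(pattern)
    return sum(1 for line in lines if regex.search(line))


def extract_values(lines: Iterable[str], pattern: str) -> list:
    """First capture group of each matching line as a float (could feed a histogram)."""
    regex = re.compile(pattern)
    values = []
    for line in lines:
        match = regex.search(line)
        if match:
            values.append(float(match.group(1)))
    return values


def resolve_json_path(document: Any, path: str) -> Any:
    """Resolve a dotted path such as "$.sessions.active" against parsed JSON."""
    if not path.startswith("$."):
        raise ValueError(f"unsupported path {path!r}")
    node = document
    for key in path[2:].split("."):
        node = node[key]  # raises KeyError if the payload lacks the key
    return node


log_lines = [
    "INFO Transaction 17 completed in 125ms",
    "ERROR upstream timeout",
    "INFO Transaction 18 completed in 40ms",
]
error_count = count_matches(log_lines, "ERROR")
response_times = extract_values(log_lines, r"Transaction.*completed in (\d+)ms")
active = resolve_json_path({"sessions": {"active": 7}}, "$.sessions.active")
```

A real log_file source would also need to remember its file offset between collection cycles so that lines are not counted twice; that is left out of this sketch.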
4. Architecture Changes Required
New Components:
MetricSourceRegistry: Manage metric source plugins
CustomMetricCollector: Orchestrate custom metric collection
MetricSourceBase: Abstract base class for metric sources
ConfigValidator: Validate TOML metric configurations
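A minimal sketch of how MetricSourceBase and MetricSourceRegistry might fit together is shown below. The method names (`collect`, `register`, `create`) and the `StaticSource` toy class are assumptions for illustration; the actual interface is still to be designed.

```python
from abc import ABC, abstractmethod


class MetricSourceBase(ABC):
    """Abstract base class for metric sources (interface assumed for this sketch)."""

    def __init__(self, config: dict) -> None:
        self.config = config

    @abstractmethod
    def collect(self) -> dict:
        """Return the latest value for each configured metric name."""


class MetricSourceRegistry:
    """Maps the `type` string from plugin.toml to a source class."""

    def __init__(self) -> None:
        self._sources = {}

    def register(self, type_name: str, source_cls: type) -> None:
        if type_name in self._sources:
            raise ValueError(f"source type {type_name!r} is already registered")
        self._sources[type_name] = source_cls

    def create(self, config: dict) -> MetricSourceBase:
        type_name = config.get("type")
        if type_name not in self._sources:
            raise ValueError(f"unknown source type {type_name!r}")
        return self._sources[type_name](config)


class StaticSource(MetricSourceBase):
    """Toy source that returns fixed values; used only to exercise the registry."""

    def collect(self) -> dict:
        return dict(self.config.get("values", {}))


registry = MetricSourceRegistry()
registry.register("static", StaticSource)
source = registry.create({"type": "static", "values": {"active_sessions": 3}})
collected = source.collect()
```

Third-party collectors would plug in the same way: subclass the base class and call `register` with a new type name.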
Modified Components:
base_sensor.py: Integrate custom metrics into monitoring loop
otel_connector.py: Support multiple OpenTelemetry metric types
toml_utils.py: Parse and validate custom metrics configuration
metadata_store.py: Store custom metric metadata
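When custom sources are integrated into the monitoring loop, one option is to isolate each source so that a failing collector cannot interrupt the default process metrics, and to time each source so overhead can be measured. The sketch below illustrates that idea; `collect_once` and its return shape are hypothetical and not part of the current base_sensor.py.

```python
import time
from typing import Callable, Dict, Tuple


def collect_once(
    sources: Dict[str, Callable[[], dict]],
) -> Tuple[dict, dict, dict]:
    """Run every source once; a failing source never aborts the cycle."""
    values = {}
    errors = {}
    durations = {}
    for name, collect in sources.items():
        started = time.perf_counter()
        try:
            values.update(collect())
        except Exception as exc:  # isolate failures in third-party sources
            errors[name] = repr(exc)
        finally:
            durations[name] = time.perf_counter() - started
    return values, errors, durations


def healthy_source() -> dict:
    return {"active_sessions": 5.0}


def broken_source() -> dict:
    raise RuntimeError("endpoint down")


values, errors, durations = collect_once(
    {"http_endpoint": healthy_source, "jmx_collector": broken_source}
)
```

The per-source durations could feed the performance benchmarks, for example by flagging any source that takes longer than its configured interval.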
5. Implementation Phases
Phase 1: Core Infrastructure
- Metric source plugin system
- TOML configuration parsing
- Multiple OTel metric type support
Phase 2: Basic Source Types
- HTTP endpoint polling
- Log file parsing
- Command execution
Phase 3: Advanced Sources
- JMX integration
- Database connectivity
- File-based metrics
Phase 4: Advanced Features
- Custom aggregation functions
- Metric transformations
- Conditional collection
Impact Assessment
Complexity: Major architectural change requiring significant development and testing effort
Backward Compatibility: Must maintain existing functionality without breaking changes
Performance: Custom metric collection adds work to each monitoring cycle (polling, parsing, queries), so its overhead must be measured and bounded
Dependencies: May require additional Python packages (JMX, database drivers, etc.)
Success Criteria
- Services can define custom metrics via TOML configuration
- Support for Counter, UpDownCounter, Histogram, and Observable Gauge instruments
- Pluggable architecture for metric sources
- No behavior change for existing services that do not enable custom metrics
- Comprehensive test coverage for all source types
- Performance benchmarks showing acceptable overhead
Target Release
Proposed for v1.0.0 as a major feature release after psutil migration stabilization.
Dependencies
- psutil migration (recently completed)
- Stable OpenTelemetry integration
- TOML configuration framework enhancements
Related Work
This feature request emerged from discussions about extending the current process monitoring capabilities to support application-specific metrics and business KPIs that go beyond standard system metrics.