Background
At very high RPS the main thread's receive_metrics() loop becomes a bottleneck: it processes raw per-request GooseMetric messages and has a hard 400ms budget per 500ms main-loop iteration. Channel volume scales linearly with RPS (O(RPS)), which limits single-instance throughput.
Goose already solves this problem in Gaggle (distributed) mode: worker nodes pre-aggregate metrics locally and send periodic summaries to the manager, rather than forwarding individual request events. This ticket explores bringing the same approach to standalone mode.
Key observation: logging is unaffected
Per-request log files (--request-log, --transaction-log, --scenario-log, --error-log) already travel through an entirely separate GooseLog channel to the logger thread. Pre-aggregating the metrics channel would have zero impact on logging granularity. Full per-request detail would remain available in log files as today.
What we'd be trading
With pre-aggregation, the metrics channel would carry periodic summaries (e.g., every N requests or every T ms) instead of individual events. The main thread would merge these summaries much as it currently merges raw events. Statistical accuracy is preserved: counts and response-time histograms merge exactly by addition, and percentiles are computed from the merged histograms rather than merged directly. The only difference is that the main thread sees data in batches, not per request, which means:
- Running metrics (--running-metrics) reflect slightly lagged counts (bounded by the flush interval)
- No per-request data visible through the metrics channel — but this is already the case; that data was always in the log files
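
To make the mergeability claim concrete, here is a minimal sketch of merging response-time histograms and reading a percentile from the result. The names `Histogram`, `merge_histogram`, `percentile`, and `histogram_from` are hypothetical, and the nearest-rank percentile is an assumption chosen for illustration; it is not necessarily how Goose computes percentiles.

```rust
use std::collections::BTreeMap;

/// Response-time histogram: response time -> number of requests.
/// (Hypothetical alias for illustration.)
type Histogram = BTreeMap<usize, usize>;

/// Build a histogram from raw per-request samples.
fn histogram_from(samples: &[usize]) -> Histogram {
    let mut hist = Histogram::new();
    for &time in samples {
        *hist.entry(time).or_insert(0) += 1;
    }
    hist
}

/// Merge `batch` into `total` by adding the counts of matching buckets.
fn merge_histogram(total: &mut Histogram, batch: &Histogram) {
    for (&time, &count) in batch {
        *total.entry(time).or_insert(0) += count;
    }
}

/// Nearest-rank percentile over a histogram; `None` if it is empty.
fn percentile(hist: &Histogram, p: f64) -> Option<usize> {
    let total: usize = hist.values().sum();
    if total == 0 {
        return None;
    }
    // 1-based rank of the sample that sits at percentile `p`.
    let rank = ((p / 100.0) * total as f64).ceil().max(1.0) as usize;
    let mut seen = 0;
    for (&time, &count) in hist {
        seen += count;
        if seen >= rank {
            return Some(time);
        }
    }
    // Only reached if `p` is above 100; fall back to the largest bucket.
    hist.keys().next_back().copied()
}
```

Because merging is plain addition of bucket counts, splitting the same samples into any number of batches and merging them yields the same histogram, and therefore the same percentiles, as processing each request individually.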
Proposed approach
- Add a GooseMetricBatch struct mirroring GooseRequestMetricAggregate but sized for a single user's interval:
    use std::collections::{BTreeMap, HashMap};

    struct GooseMetricBatch {
        request_key: String,                    // "<METHOD> <name>"
        success_count: u64,
        fail_count: u64,
        response_times: BTreeMap<usize, usize>, // response time -> request count
        status_codes: HashMap<u16, u64>,        // status code -> request count
        // ... plus transaction/scenario counts
        errors: Vec<GooseRequestMetric>,        // errors still sent individually
    }
- Each GooseUser accumulates a GooseMetricBatch locally and flushes it to the channel every N requests (e.g., 100) or every T ms (e.g., 250ms), whichever comes first.
- The main thread receives GooseMetric::Batch(GooseMetricBatch) and merges it into the existing GooseRequestMetricAggregate structures, the same way the Gaggle manager merges worker reports.
- Error events (non-2xx responses) continue to be sent individually so the error summary retains full fidelity. This mirrors Gaggle behavior.
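
The accumulate-and-flush steps above could be sketched roughly as follows. This is an illustration under stated assumptions, not Goose's actual API: `MetricBatch`, `UserBatcher`, and `merge_batch` are hypothetical names, the batch is trimmed to request counters (transaction/scenario counts and the individually sent errors are omitted), and the caller passes in the current time so the flush logic stays easy to test.

```rust
use std::collections::{BTreeMap, HashMap};
use std::time::{Duration, Instant};

/// Hypothetical batch, trimmed to the request counters from the proposal.
#[derive(Debug, Default, Clone, PartialEq)]
struct MetricBatch {
    success_count: u64,
    fail_count: u64,
    response_times: BTreeMap<usize, usize>, // response time -> request count
    status_codes: HashMap<u16, u64>,        // status code -> request count
}

/// Hypothetical accumulator owned by one user task.
struct UserBatcher {
    pending: HashMap<String, MetricBatch>, // keyed by "<METHOD> <name>"
    pending_requests: u64,
    last_flush: Instant,
    max_requests: u64, // N
    max_age: Duration, // T
}

impl UserBatcher {
    fn new(max_requests: u64, max_age: Duration, now: Instant) -> Self {
        UserBatcher {
            pending: HashMap::new(),
            pending_requests: 0,
            last_flush: now,
            max_requests,
            max_age,
        }
    }

    /// Record one request. Returns the drained batches when either limit is hit.
    fn record(
        &mut self,
        request_key: &str,
        response_time: usize,
        status_code: u16,
        success: bool,
        now: Instant,
    ) -> Option<HashMap<String, MetricBatch>> {
        let batch = self.pending.entry(request_key.to_string()).or_default();
        if success {
            batch.success_count += 1;
        } else {
            batch.fail_count += 1;
        }
        *batch.response_times.entry(response_time).or_insert(0) += 1;
        *batch.status_codes.entry(status_code).or_insert(0) += 1;
        self.pending_requests += 1;

        let full = self.pending_requests >= self.max_requests;
        let stale = now.saturating_duration_since(self.last_flush) >= self.max_age;
        if full || stale {
            Some(self.flush(now))
        } else {
            None
        }
    }

    /// Hand over everything accumulated so far and reset the counters.
    fn flush(&mut self, now: Instant) -> HashMap<String, MetricBatch> {
        self.pending_requests = 0;
        self.last_flush = now;
        std::mem::take(&mut self.pending)
    }
}

/// Main-thread side: fold one flushed batch into a running total.
fn merge_batch(total: &mut MetricBatch, batch: &MetricBatch) {
    total.success_count += batch.success_count;
    total.fail_count += batch.fail_count;
    for (&time, &count) in &batch.response_times {
        *total.response_times.entry(time).or_insert(0) += count;
    }
    for (&code, &count) in &batch.status_codes {
        *total.status_codes.entry(code).or_insert(0) += count;
    }
}
```

In this sketch the age check only runs when a request is recorded, so an idle user would hold its partial batch indefinitely. A real implementation would presumably also call `flush` from a timer and when the user shuts down, so that no counts are lost at the end of a test.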
Questions to resolve
- What flush interval / batch size gives the right tradeoff between channel pressure and metrics freshness for --running-metrics?
- Does the Gaggle aggregation path (src/metrics.rs merge logic) already provide a clean abstraction to reuse, or does it need to be factored out first?
- Should batching be opt-in (e.g., --batch-metrics) or the new default?
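
As a starting point for the flush interval / batch size question, the expected channel volume can be estimated with a back-of-the-envelope model. The function below is a hypothetical helper, and it assumes every user issues requests at the same steady rate and that a flush can only happen when a request is recorded; real workloads will differ, so a prototype measurement should take precedence over this estimate.

```rust
/// Estimated metrics-channel messages per second across all users.
/// Hypothetical model: all users issue requests at the same steady rate.
fn estimated_messages_per_sec(
    users: u64,
    total_rps: f64,
    batch_size: u64,
    flush_interval_ms: u64,
) -> f64 {
    let per_user_rps = total_rps / users as f64;
    // Flushes per second caused by hitting the batch size (N).
    let size_flushes = per_user_rps / batch_size as f64;
    // Flushes per second caused by hitting the flush interval (T).
    let timer_flushes = 1000.0 / flush_interval_ms as f64;
    // Whichever trigger fires first sets the rate, but a user cannot
    // flush more often than it records requests.
    let per_user = size_flushes.max(timer_flushes).min(per_user_rps);
    per_user * users as f64
}
```

Under this model, 100,000 RPS spread over 1,000 users with N = 100 and T = 250ms is dominated by the timer: each user flushes about 4 times per second, for roughly 4,000 messages per second instead of 100,000. With only 10 users at the same total RPS, the batch size dominates and the estimate drops to about 1,000 messages per second.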
Related
- Gaggle worker→manager aggregation in src/metrics.rs is the closest existing reference implementation.
- This is a larger change than the other metrics performance tickets and would benefit from a prototype branch to measure actual throughput improvement before committing to the design.