Add batched/concurrent MSM benchmarks for real-world proving workloads #99

@moven0831

Problem

Current benchmarks only measure the performance of a single MSM at various input sizes (2^10 through 2^24). In real proving systems, MSM is invoked many times in parallel; for example, a single proving pass may run a 2^14 MSM 2048 times concurrently.

The GPU may significantly outperform a multi-threaded CPU in these batched scenarios due to:

  • Better utilization of parallel compute units when saturated with concurrent work
  • Amortization of setup overhead (buffer allocation, shader compilation) across batches
  • Different memory access patterns under concurrent load

Without batched benchmarks, we may be underestimating the GPU's real-world advantage (or missing optimization opportunities).

Current State

All existing benchmarks run a single MSM computation per iteration:

  • benches/e2e.rs — Criterion benchmark, one MSM per sample
  • tests/cuzk/e2e.rs — end-to-end test, single MSM execution

There is no batched or concurrent MSM benchmarking anywhere in the codebase.

Proposed Work

1. Batched MSM Benchmarks

Add benchmarks that measure throughput when running multiple MSMs concurrently:

  • Varying batch sizes: e.g., 1, 4, 16, 64, 256, 1024, 2048 concurrent MSMs
  • Varying MSM sizes within batches: e.g., batches of size-2^14 MSMs (a size common in real provers)
  • Metrics: total wall-clock time, throughput (MSMs/sec), per-MSM latency under load
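
The three metrics above could be collected with a harness along these lines. This is only a sketch using the standard library: `toy_msm` is a hypothetical stand-in (a wrapping dot product over `u64`) rather than the project's real MSM entry point, and `BatchStats` and `run_batch` are illustrative names that do not exist in the codebase. A real version would likely live in a Criterion benchmark and dispatch to the GPU backend.

```rust
use std::thread;
use std::time::{Duration, Instant};

/// Hypothetical stand-in for a real MSM: a wrapping dot product over u64.
/// A real benchmark would call the project's MSM entry point here.
fn toy_msm(scalars: &[u64], points: &[u64]) -> u64 {
    scalars
        .iter()
        .zip(points)
        .fold(0u64, |acc, (s, p)| acc.wrapping_add(s.wrapping_mul(*p)))
}

/// The metrics proposed in the issue (illustrative struct, not an existing type).
#[derive(Debug)]
struct BatchStats {
    batch_size: usize,
    wall_clock: Duration,
    msms_per_sec: f64,
    mean_latency: Duration,
}

/// Runs `batch_size` MSMs of `msm_len` terms, spread over `workers` threads.
fn run_batch(batch_size: usize, msm_len: usize, workers: usize) -> (BatchStats, Vec<u64>) {
    let scalars: Vec<u64> = (1..=msm_len as u64).collect();
    let points: Vec<u64> = scalars.iter().map(|i| i.wrapping_mul(0x9E37_79B9)).collect();
    let workers = workers.clamp(1, batch_size.max(1));
    let mut samples: Vec<(u64, Duration)> = Vec::with_capacity(batch_size);

    let start = Instant::now();
    thread::scope(|scope| {
        let handles: Vec<_> = (0..workers)
            .map(|w| {
                let (scalars, points) = (&scalars, &points);
                scope.spawn(move || {
                    // Worker `w` handles MSM indices w, w + workers, w + 2*workers, ...
                    (w..batch_size)
                        .step_by(workers)
                        .map(|_| {
                            let t = Instant::now();
                            let r = toy_msm(scalars, points);
                            (r, t.elapsed())
                        })
                        .collect::<Vec<_>>()
                })
            })
            .collect();
        for h in handles {
            samples.extend(h.join().expect("worker panicked"));
        }
    });
    let wall_clock = start.elapsed();

    let total_latency: Duration = samples.iter().map(|(_, d)| *d).sum();
    let stats = BatchStats {
        batch_size,
        wall_clock,
        msms_per_sec: batch_size as f64 / wall_clock.as_secs_f64(),
        mean_latency: total_latency / batch_size.max(1) as u32,
    };
    (stats, samples.into_iter().map(|(r, _)| r).collect())
}

fn main() {
    let workers = thread::available_parallelism().map(|n| n.get()).unwrap_or(1);
    // Batch sizes taken from the list above; MSM size fixed at 2^14.
    for batch_size in [1, 4, 16, 64, 256, 1024, 2048] {
        let (s, _) = run_batch(batch_size, 1 << 14, workers);
        println!(
            "batch={:5} wall={:?} throughput={:.1} MSMs/s mean_latency={:?}",
            s.batch_size, s.wall_clock, s.msms_per_sec, s.mean_latency
        );
    }
}
```

Comparing `mean_latency` at batch size 1 against larger batches gives the "per-MSM latency under load" figure, while `msms_per_sec` is the number to compare between CPU and GPU backends.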

2. Batched MSM API (stretch)

Consider whether a dedicated batched MSM API could improve performance by:

  • Sharing Metal command buffers across MSMs in a batch
  • Pipelining GPU work (overlap data transfer with computation)
  • Reusing allocated buffers across MSMs of the same size
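
To make the buffer-reuse idea concrete, here is a CPU-only sketch of what a batched entry point might look like. Everything in it is an assumption for illustration: `BufferPool` and `msm_batch` are hypothetical names, `Vec<u64>` stands in for a device buffer, the arithmetic is a toy wrapping dot product, and the batch is assumed to share one set of points. No Metal calls are shown; a GPU backend would hold device buffers in the pool and encode the per-MSM work into a shared command buffer.

```rust
use std::collections::HashMap;

/// Hypothetical pool of scratch buffers keyed by length.
/// A GPU backend would store device buffers here instead of Vec<u64>.
struct BufferPool {
    free: HashMap<usize, Vec<Vec<u64>>>,
    allocations: usize, // counts fresh allocations, to show the reuse effect
}

impl BufferPool {
    fn new() -> Self {
        BufferPool { free: HashMap::new(), allocations: 0 }
    }

    /// Returns a pooled buffer of `len` elements, allocating only on a miss.
    fn acquire(&mut self, len: usize) -> Vec<u64> {
        if let Some(buf) = self.free.get_mut(&len).and_then(|v| v.pop()) {
            return buf;
        }
        self.allocations += 1;
        vec![0; len]
    }

    /// Hands a buffer back so a later MSM of the same size can reuse it.
    fn release(&mut self, buf: Vec<u64>) {
        self.free.entry(buf.len()).or_default().push(buf);
    }
}

/// Hypothetical batched entry point: one call computes many toy MSMs
/// over the same `points`, reusing scratch space between them.
fn msm_batch(pool: &mut BufferPool, points: &[u64], scalar_sets: &[Vec<u64>]) -> Vec<u64> {
    scalar_sets
        .iter()
        .map(|scalars| {
            assert_eq!(scalars.len(), points.len(), "scalar/point length mismatch");
            let mut scratch = pool.acquire(points.len());
            // Toy stand-in for the real kernel: per-term products, then a reduction.
            for (slot, (s, p)) in scratch.iter_mut().zip(scalars.iter().zip(points)) {
                *slot = s.wrapping_mul(*p);
            }
            let sum = scratch.iter().fold(0u64, |acc, x| acc.wrapping_add(*x));
            pool.release(scratch);
            sum
        })
        .collect()
}

fn main() {
    let mut pool = BufferPool::new();
    let points = vec![1u64, 2, 3];
    let scalar_sets = vec![vec![1u64, 1, 1], vec![2u64, 0, 1]];
    let sums = msm_batch(&mut pool, &points, &scalar_sets);
    println!("results = {:?}, fresh allocations = {}", sums, pool.allocations);
}
```

Because both MSMs in the example have the same size, the second one reuses the buffer released by the first. Whether the same saving is significant on the GPU is exactly what the batched benchmarks would need to show.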

Metadata

Assignees: none
Labels: enhancement (New feature or request)
Status: Backlog
Milestone: none