Performance benchmarks for [Chorex](https://github.com/utahplt/chorex)
    bash$ iex -S mix
    iex> ChorexBenchmarks.stats()
You will need a recent version of Elixir (1.18 or better is best) and at least Perl 5.36 (only if re-running `bench_maker.pl`) to use this.
To run the benchmarks:
- Clone this repo.
- Get the dependencies (Chorex and Benchee) with `mix deps.get`.
- (Optional) Run `perl bench_maker.pl`. This will recreate the `big_chor.ex` file if needed.
- Fire up `iex -S mix`. This takes several seconds, as the `big_chor.ex` file seems to take a while to compile.
- Run the benchmarks with `ChorexBenchmarks.stats()`. This takes about 3 minutes.
By far the most punishing benchmarks are those with nested recursive `try` blocks. A structure like:

    def loop(...) do
      try do
        ...
        loop(...)
      rescue
        ...
      end
    end

is extremely punishing as the stack gets deeper and deeper. This can be seen in the Miniblock and Deep Loops benchmarks. In contrast, when the `try` is not involved in the recursion, as in Flat Loops, performance is much better:

    def loop(...) do
      do_work(...)
      loop(...)
    end

    def do_work(...) do
      try do
        ...
      rescue
        ...
      end
    end

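
As a rough plain-Elixir analogue of the two shapes above, the sketch below puts the recursive call inside `try` in one function and confines `try` to a helper in the other. It is not Chorex code: `LoopShapes`, `step/1`, and the iteration count are hypothetical names and values chosen for illustration. In plain Elixir, a call made inside `try` is not a tail call, so the first shape keeps one stack frame per iteration. Whether the slowdown measured in the Chorex benchmarks has the same cause is an assumption this sketch does not test.

```elixir
defmodule LoopShapes do
  # Hypothetical module for illustration; not part of Chorex or this repo.

  # Deep shape: the recursive call sits inside `try`, so it is not a tail
  # call and every iteration leaves a frame on the stack.
  def deep_loop(0), do: :done

  def deep_loop(n) do
    try do
      step(n)
      deep_loop(n - 1)
    rescue
      ArgumentError -> :recovered
    end
  end

  # Flat shape: `try` is confined to the helper, so the recursive call is
  # the last expression in `flat_loop/1` and runs in constant stack space.
  def flat_loop(0), do: :done

  def flat_loop(n) do
    do_work(n)
    flat_loop(n - 1)
  end

  defp do_work(n) do
    try do
      step(n)
    rescue
      ArgumentError -> :recovered
    end
  end

  # Stand-in for real work.
  defp step(n), do: n * 2
end

# :timer.tc/1 returns {elapsed_microseconds, result}.
{deep_us, :done} = :timer.tc(fn -> LoopShapes.deep_loop(100_000) end)
{flat_us, :done} = :timer.tc(fn -> LoopShapes.flat_loop(100_000) end)
IO.puts("deep: #{deep_us} µs, flat: #{flat_us} µs")
```

Timings are machine-dependent and are not comparable to the Benchee results below, which measure full choreographies.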
Finally, `try`/`rescue` seems to have a negligible impact when there is a large number (100) of actors. Since these actors are specified manually, and Chorex does not yet have census polymorphism, 100 seems to be a reasonable torture test for a choreography.
- Operating System: macOS
- CPU Information: Apple M1 Pro
- Number of Available Cores: 10
- Available memory: 32 GB
- Elixir 1.18.0
- Erlang 27.2
- JIT enabled: true
Name | ips | average | deviation | median | 99th % |
---|---|---|---|---|---|
miniblock: without try block | 1.42 K | 0.71 ms | ±35.00% | 0.70 ms | 0.78 ms |
miniblock: with try block | 0.147 K | 6.80 ms | ±13.09% | 6.58 ms | 8.91 ms |
Comparison
Name | ips | slowdown |
---|---|---|
miniblock: without try block | 1.42 K | |
miniblock: with try block | 0.147 K | 9.65× slower +6.10 ms |
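
To read the slowdown column in the comparison tables, it helps to recompute it by hand. The sketch below assumes the multiplier is the ratio of the two averages and the offset is their difference, which is consistent with the numbers shown. `Slowdown.compare/2` is a hypothetical helper, not part of Benchee or this repo.

```elixir
defmodule Slowdown do
  # Hypothetical helper for illustration; not part of Benchee or this repo.
  # Both arguments are average run times in the same unit.
  def compare(baseline_avg, other_avg) do
    %{
      ratio: Float.round(other_avg / baseline_avg, 2),
      difference: Float.round(other_avg - baseline_avg, 2)
    }
  end
end

# Rounded miniblock averages from the table above, in milliseconds.
result = Slowdown.compare(0.71, 6.80)
IO.puts("#{result.ratio}× slower, +#{result.difference} ms")
```

Because the table prints rounded averages, this gives about 9.58× and +6.09 ms rather than the reported 9.65× and +6.10 ms, which were presumably computed from unrounded measurements.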
Name | ips | average | deviation | median | 99th % |
---|---|---|---|---|---|
flat loop with try, 1000 iterations | 3.80 | 263.16 ms | ±4.74% | 258.63 ms | 296.86 ms |
flat loop without try, 1000 iterations | 3.79 | 264.09 ms | ±8.35% | 273.21 ms | 285.60 ms |
Comparison
Name | ips | slowdown |
---|---|---|
flat loop with try, 1000 iterations | 3.80 | |
flat loop without try, 1000 iterations | 3.79 | 1.00× slower +0.93 ms |
Name | ips | average | deviation | median | 99th % |
---|---|---|---|---|---|
flat loop without try, 10000 iterations | 0.37 | 2.74 s | ±0.24% | 2.74 s | 2.75 s |
flat loop with try, 10000 iterations | 0.34 | 2.92 s | ±0.10% | 2.92 s | 2.93 s |
Comparison
Name | ips | slowdown |
---|---|---|
flat loop without try, 10000 iterations | 0.37 | |
flat loop with try, 10000 iterations | 0.34 | 1.07× slower +0.186 s |
Name | ips | average | deviation | median | 99th % |
---|---|---|---|---|---|
loop: no try, 100 iterations, no split work | 47.41 | 21.09 ms | ±0.85% | 21.09 ms | 21.65 ms |
loop: no try, 100 iterations, split work | 47.39 | 21.10 ms | ±0.84% | 21.10 ms | 21.59 ms |
loop: with try, 100 iterations, no split work | 41.73 | 23.96 ms | ±15.59% | 23.70 ms | 25.95 ms |
loop: with try, 100 iterations, split work | 40.54 | 24.67 ms | ±2.35% | 24.53 ms | 26.57 ms |
Comparison
Name | ips | slowdown |
---|---|---|
loop: no try, 100 iterations, no split work | 47.41 | |
loop: no try, 100 iterations, split work | 47.39 | 1.00× slower +0.0106 ms |
loop: with try, 100 iterations, no split work | 41.73 | 1.14× slower +2.87 ms |
loop: with try, 100 iterations, split work | 40.54 | 1.17× slower +3.57 ms |
Name | ips | average | deviation | median | 99th % |
---|---|---|---|---|---|
loop: no try, 1000 iterations, split work | 4.76 | 210.27 ms | ±0.36% | 210.17 ms | 214.66 ms |
loop: no try, 1000 iterations, no split work | 4.75 | 210.34 ms | ±0.28% | 210.34 ms | 212.55 ms |
loop: with try, 1000 iterations, no split work | 2.21 | 452.92 ms | ±9.89% | 455.50 ms | 541.35 ms |
loop: with try, 1000 iterations, split work | 2.20 | 455.05 ms | ±9.61% | 454.87 ms | 541.07 ms |
Comparison
Name | ips | slowdown |
---|---|---|
loop: no try, 1000 iterations, split work | 4.76 | |
loop: no try, 1000 iterations, no split work | 4.75 | 1.00× slower +0.0657 ms |
loop: with try, 1000 iterations, no split work | 2.21 | 2.15× slower +242.64 ms |
loop: with try, 1000 iterations, split work | 2.20 | 2.16× slower +244.77 ms |
Name | ips | average | deviation | median | 99th % |
---|---|---|---|---|---|
loop: no try, 10000 iterations, split work | 0.50 | 1.98 s | ±0.22% | 1.98 s | 1.99 s |
loop: no try, 10000 iterations, no split work | 0.50 | 1.99 s | ±0.85% | 1.98 s | 2.03 s |
loop: with try, 10000 iterations, no split work | 0.0258 | 38.83 s | ±0.00% | 38.83 s | 38.83 s |
loop: with try, 10000 iterations, split work | 0.0225 | 44.54 s | ±0.00% | 44.54 s | 44.54 s |
Comparison
Name | ips | slowdown |
---|---|---|
loop: no try, 10000 iterations, split work | 0.50 | |
loop: no try, 10000 iterations, no split work | 0.50 | 1.00× slower +0.00480 s |
loop: with try, 10000 iterations, no split work | 0.0258 | 19.57× slower +36.84 s |
loop: with try, 10000 iterations, split work | 0.0225 | 22.46× slower +42.56 s |
Name | ips | average | deviation | median | 99th % |
---|---|---|---|---|---|
state machine no try | 1.99 K | 503.74 μs | ±816.82% | 476.71 μs | 759.63 μs |
state machine with try | 1.97 K | 506.92 μs | ±35.77% | 510.54 μs | 816.90 μs |
state machine with try & recovery | 1.96 K | 509.54 μs | ±36.43% | 508.38 μs | 824.76 μs |
Comparison
Name | ips | slowdown |
---|---|---|
state machine no try | 1.99 K | |
state machine with try | 1.97 K | 1.01× slower +3.18 μs |
state machine with try & recovery | 1.96 K | 1.01× slower +5.80 μs |
Name | ips | average | deviation | median | 99th % |
---|---|---|---|---|---|
lots of actors, no try | 141.16 | 7.08 ms | ±37.15% | 6.49 ms | 18.54 ms |
lots of actors, with try | 139.98 | 7.14 ms | ±38.32% | 6.44 ms | 18.31 ms |
Comparison
Name | ips | slowdown |
---|---|---|
lots of actors, no try | 141.16 | |
lots of actors, with try | 139.98 | 1.01× slower +0.0598 ms |
Run the `bench_maker.pl` script to create some big Elixir files:

    perl bench_maker.pl 10 > big_chor_10.ex
    perl bench_maker.pl 100 > big_chor_100.ex
    perl bench_maker.pl 1000 > big_chor_1000.ex

Now compile everything and use the compile profiler to get compile times:

    mix compile --force --profile time