perf(levm): add AVX256 implementation of BLAKE2 #3590

iovoid · 2025-07-10T16:07:46Z

Motivation

To improve BLAKE2 performance.

Description

Why AVX256 (AVX2) instead of AVX512? Mainly because the AVX512 intrinsics are still experimental in Rust (see rust-lang/rust#111137).

Creates a common/crypto module to house blake2. We should consider moving the other cryptographic operations that currently live in precompiles.rs into this module.

If avx2 is available, a permute-with-gather implementation is used.
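For context, the scalar mixing step that any BLAKE2b compression function has to compute can be sketched as below. It follows the G function from RFC 7693 (rotations 32, 24, 16 and 63); the name `g` and its signature are illustrative and are not taken from this PR's portable.rs.

```rust
/// BLAKE2b mixing function G as specified in RFC 7693.
/// `v` is the 16-word working state, `a..d` select the four words to mix,
/// and `x`, `y` are the two message words chosen for this step.
/// (Name and signature are illustrative, not this PR's API.)
pub fn g(v: &mut [u64; 16], a: usize, b: usize, c: usize, d: usize, x: u64, y: u64) {
    v[a] = v[a].wrapping_add(v[b]).wrapping_add(x);
    v[d] = (v[d] ^ v[a]).rotate_right(32);
    v[c] = v[c].wrapping_add(v[d]);
    v[b] = (v[b] ^ v[c]).rotate_right(24);
    v[a] = v[a].wrapping_add(v[b]).wrapping_add(y);
    v[d] = (v[d] ^ v[a]).rotate_right(16);
    v[c] = v[c].wrapping_add(v[d]);
    v[b] = (v[b] ^ v[c]).rotate_right(63);
}
```

Each round applies G to the four columns and then to the four diagonals of the state. Those four calls are independent of each other, which is what makes a 4-lane, 256-bit vector a natural fit.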

Unsafe is required for SIMD loads and stores. Reviewers should check that alignment requirements are satisfied and that no out-of-bounds access is possible.

Note that, aside from the obvious intrinsics with "load" or "store" in the name, gather also performs a series of memory loads.
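To make the gather point concrete, here is a minimal sketch of the pattern, not the code in avx.rs. The helper names (`gather4`, `gather4_avx2`, `gather4_portable`) are hypothetical. It assumes indices are validated once in safe code, so the four loads issued by `_mm256_i32gather_epi64` cannot leave the 16-word message block, and it uses unaligned load/store intrinsics so that no 32-byte alignment is required.

```rust
/// Hypothetical helper: picks four of the 16 message words by index.
pub fn gather4(msg: &[u64; 16], idx: [i32; 4]) -> [u64; 4] {
    // Checked once, up front: this is what makes the unchecked gather sound.
    assert!(idx.iter().all(|&i| (0..16).contains(&i)), "index out of bounds");
    #[cfg(target_arch = "x86_64")]
    {
        if std::arch::is_x86_feature_detected!("avx2") {
            // SAFETY: AVX2 was detected and every index lies inside `msg`.
            return unsafe { gather4_avx2(msg, idx) };
        }
    }
    gather4_portable(msg, idx)
}

/// Bounds-checked scalar fallback.
fn gather4_portable(msg: &[u64; 16], idx: [i32; 4]) -> [u64; 4] {
    idx.map(|i| msg[i as usize])
}

#[cfg(target_arch = "x86_64")]
#[target_feature(enable = "avx2")]
unsafe fn gather4_avx2(msg: &[u64; 16], idx: [i32; 4]) -> [u64; 4] {
    use std::arch::x86_64::*;
    let mut out = [0u64; 4];
    // SAFETY: `idx` holds four i32 values (16 bytes), `out` holds 32 bytes,
    // and the caller guarantees each index is in 0..16. The `u` variants of
    // load/store accept unaligned pointers.
    unsafe {
        let offsets = _mm_loadu_si128(idx.as_ptr() as *const __m128i);
        // SCALE = 8: each index is multiplied by the size of a u64.
        let words = _mm256_i32gather_epi64::<8>(msg.as_ptr() as *const i64, offsets);
        _mm256_storeu_si256(out.as_mut_ptr() as *mut __m256i, words);
    }
    out
}
```

Keeping the bounds check in the safe wrapper means the unsafe function has a single precondition to audit, instead of one per load.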

Unsafe is also required to call the first avx2-enabled function, since we must first ensure avx2 is actually available on the target CPU.
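The dispatch pattern described here can be sketched as follows. This is an illustration under assumptions, not the PR's actual entry point: `add_lanes` and its helpers are hypothetical names, and a lane-wise addition stands in for the real compression function.

```rust
/// Hypothetical entry point: adds two 4-lane u64 vectors, using AVX2 when
/// the running CPU supports it and a portable loop otherwise.
pub fn add_lanes(a: &[u64; 4], b: &[u64; 4]) -> [u64; 4] {
    #[cfg(target_arch = "x86_64")]
    {
        if std::arch::is_x86_feature_detected!("avx2") {
            // SAFETY: the runtime check above proves AVX2 is available,
            // which is the only precondition of `add_lanes_avx2`.
            return unsafe { add_lanes_avx2(a, b) };
        }
    }
    add_lanes_portable(a, b)
}

/// Scalar fallback with the same wrapping semantics as the SIMD path.
fn add_lanes_portable(a: &[u64; 4], b: &[u64; 4]) -> [u64; 4] {
    let mut out = [0u64; 4];
    for i in 0..4 {
        out[i] = a[i].wrapping_add(b[i]);
    }
    out
}

#[cfg(target_arch = "x86_64")]
#[target_feature(enable = "avx2")]
unsafe fn add_lanes_avx2(a: &[u64; 4], b: &[u64; 4]) -> [u64; 4] {
    use std::arch::x86_64::*;
    // SAFETY: both arrays are exactly 32 bytes; unaligned loads are used.
    let (va, vb) = unsafe {
        (
            _mm256_loadu_si256(a.as_ptr() as *const __m256i),
            _mm256_loadu_si256(b.as_ptr() as *const __m256i),
        )
    };
    let sum = _mm256_add_epi64(va, vb);
    let mut out = [0u64; 4];
    // SAFETY: `out` is 32 bytes; the store does not require alignment.
    unsafe { _mm256_storeu_si256(out.as_mut_ptr() as *mut __m256i, sum) };
    out
}
```

Only the first call into a `#[target_feature(enable = "avx2")]` function needs the feature check; functions with the same attribute can then call each other without repeating it.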

Benchmarks

PR

Title	Max (MGas/s)	p50 (MGas/s)	p95 (MGas/s)	p99 (MGas/s)	Min (MGas/s)
Blake1MRounds	120.19	93.97	93.38	99.85	91.54
Blake1Round	226.42	175.09	170.08	166.83	166.82
Blake1KRounds	122.36	97.28	96.09	100.90	95.87
Blake10MRounds	174.36	110.78	104.15	124.33	103.89

Main

Title	Max (MGas/s)	p50 (MGas/s)	p95 (MGas/s)	p99 (MGas/s)	Min (MGas/s)
Blake1MRounds	80.79	63.04	62.57	67.80	62.50
Blake1Round	223.59	174.93	168.21	159.38	159.33
Blake1KRounds	83.75	66.59	65.88	68.37	64.76
Blake10MRounds	117.79	77.21	69.63	83.19	69.05
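As a reading aid, the relative speedup can be computed directly from the p50 columns of the two tables above. The snippet below is only arithmetic on those published numbers; the function names are illustrative.

```rust
/// Throughput ratio of the PR over main (1.0 means unchanged).
fn speedup(pr_mgas: f64, main_mgas: f64) -> f64 {
    pr_mgas / main_mgas
}

/// p50 speedups per benchmark, using the values from the PR and Main tables.
pub fn p50_speedups() -> Vec<(&'static str, f64)> {
    // (title, PR p50 in MGas/s, main p50 in MGas/s)
    const P50: [(&str, f64, f64); 4] = [
        ("Blake1MRounds", 93.97, 63.04),
        ("Blake1Round", 175.09, 174.93),
        ("Blake1KRounds", 97.28, 66.59),
        ("Blake10MRounds", 110.78, 77.21),
    ];
    P50.iter()
        .map(|&(title, pr, main)| (title, speedup(pr, main)))
        .collect()
}
```

The multi-round cases come out at roughly 1.4x to 1.5x, while Blake1Round is essentially unchanged, as expected when a single round contributes little to the total cost of the call.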

github-actions · 2025-07-10T16:09:54Z

Lines of code report

Total lines added: 297
Total lines removed: 79
Total lines changed: 376

Detailed view

+-------------------------------------------------+-------+------+
| File                                            | Lines | Diff |
+-------------------------------------------------+-------+------+
| ethrex/crates/common/crypto/blake2f/avx.rs      | 169   | +169 |
+-------------------------------------------------+-------+------+
| ethrex/crates/common/crypto/blake2f/mod.rs      | 21    | +21  |
+-------------------------------------------------+-------+------+
| ethrex/crates/common/crypto/blake2f/portable.rs | 106   | +106 |
+-------------------------------------------------+-------+------+
| ethrex/crates/common/crypto/lib.rs              | 1     | +1   |
+-------------------------------------------------+-------+------+
| ethrex/crates/vm/levm/src/precompiles.rs        | 1007  | -79  |
+-------------------------------------------------+-------+------+

github-actions · 2025-07-10T16:25:42Z

No significant difference was registered for any benchmark run.

Detailed Results

Benchmark Results: BubbleSort

Command	Mean [s]	Min [s]	Max [s]	Relative
`main_revm_BubbleSort`	3.190 ± 0.023	3.168	3.239	1.00
`main_levm_BubbleSort`	4.477 ± 0.081	4.405	4.597	1.40 ± 0.03
`pr_revm_BubbleSort`	3.224 ± 0.017	3.202	3.252	1.01 ± 0.01
`pr_levm_BubbleSort`	4.489 ± 0.032	4.466	4.575	1.41 ± 0.01

Benchmark Results: ERC20Approval

Command	Mean [s]	Min [s]	Max [s]	Relative
`main_revm_ERC20Approval`	1.036 ± 0.008	1.025	1.050	1.01 ± 0.01
`main_levm_ERC20Approval`	1.536 ± 0.017	1.516	1.581	1.49 ± 0.02
`pr_revm_ERC20Approval`	1.028 ± 0.011	1.017	1.048	1.00
`pr_levm_ERC20Approval`	1.536 ± 0.005	1.528	1.543	1.49 ± 0.02

Benchmark Results: ERC20Mint

Command	Mean [ms]	Min [ms]	Max [ms]	Relative
`main_revm_ERC20Mint`	136.9 ± 0.7	136.2	138.2	1.00 ± 0.01
`main_levm_ERC20Mint`	257.0 ± 19.7	249.0	313.0	1.88 ± 0.14
`pr_revm_ERC20Mint`	136.8 ± 0.6	135.9	138.0	1.00
`pr_levm_ERC20Mint`	256.8 ± 2.9	254.1	264.4	1.88 ± 0.02

Benchmark Results: ERC20Transfer

Command	Mean [ms]	Min [ms]	Max [ms]	Relative
`main_revm_ERC20Transfer`	246.0 ± 11.0	239.2	270.6	1.03 ± 0.05
`main_levm_ERC20Transfer`	394.3 ± 2.5	392.1	400.6	1.65 ± 0.01
`pr_revm_ERC20Transfer`	238.8 ± 1.1	237.8	241.2	1.00
`pr_levm_ERC20Transfer`	405.6 ± 3.4	402.1	414.0	1.70 ± 0.02

Benchmark Results: Factorial

Command	Mean [ms]	Min [ms]	Max [ms]	Relative
`main_revm_Factorial`	232.0 ± 0.6	231.2	233.2	1.00
`main_levm_Factorial`	486.8 ± 24.5	473.8	534.8	2.10 ± 0.11
`pr_revm_Factorial`	234.3 ± 0.3	233.6	234.6	1.01 ± 0.00
`pr_levm_Factorial`	477.4 ± 0.7	476.3	478.3	2.06 ± 0.01

Benchmark Results: FactorialRecursive

Command	Mean [s]	Min [s]	Max [s]	Relative
`main_revm_FactorialRecursive`	1.648 ± 0.019	1.618	1.672	1.01 ± 0.04
`main_levm_FactorialRecursive`	2.879 ± 0.060	2.828	3.012	1.77 ± 0.07
`pr_revm_FactorialRecursive`	1.625 ± 0.058	1.465	1.663	1.00
`pr_levm_FactorialRecursive`	2.783 ± 0.025	2.740	2.834	1.71 ± 0.06

Benchmark Results: Fibonacci

Command	Mean [ms]	Min [ms]	Max [ms]	Relative
`main_revm_Fibonacci`	206.2 ± 1.5	202.8	207.9	1.00
`main_levm_Fibonacci`	465.1 ± 2.7	462.7	472.1	2.26 ± 0.02
`pr_revm_Fibonacci`	209.0 ± 0.7	207.3	209.9	1.01 ± 0.01
`pr_levm_Fibonacci`	467.2 ± 5.2	463.9	480.5	2.27 ± 0.03

Benchmark Results: FibonacciRecursive

Command	Mean [ms]	Min [ms]	Max [ms]	Relative
`main_revm_FibonacciRecursive`	887.4 ± 15.0	868.0	916.8	1.00 ± 0.02
`main_levm_FibonacciRecursive`	1513.5 ± 34.5	1481.0	1564.6	1.71 ± 0.05
`pr_revm_FibonacciRecursive`	883.7 ± 15.4	848.8	907.2	1.00
`pr_levm_FibonacciRecursive`	1442.7 ± 40.6	1416.9	1555.9	1.63 ± 0.05

Benchmark Results: ManyHashes

Command	Mean [ms]	Min [ms]	Max [ms]	Relative
`main_revm_ManyHashes`	8.7 ± 0.1	8.6	8.8	1.00
`main_levm_ManyHashes`	13.3 ± 0.1	13.2	13.4	1.53 ± 0.01
`pr_revm_ManyHashes`	8.8 ± 0.1	8.7	8.9	1.01 ± 0.01
`pr_levm_ManyHashes`	13.4 ± 0.1	13.2	13.6	1.53 ± 0.02

Benchmark Results: MstoreBench

Command	Mean [ms]	Min [ms]	Max [ms]	Relative
`main_revm_MstoreBench`	268.3 ± 3.0	265.9	276.0	1.00 ± 0.01
`main_levm_MstoreBench`	941.3 ± 3.7	936.4	948.4	3.51 ± 0.02
`pr_revm_MstoreBench`	268.1 ± 1.5	266.3	270.6	1.00
`pr_levm_MstoreBench`	936.1 ± 4.0	931.4	942.8	3.49 ± 0.02

Benchmark Results: Push

Command	Mean [ms]	Min [ms]	Max [ms]	Relative
`main_revm_Push`	297.9 ± 1.7	295.9	300.6	1.00
`main_levm_Push`	1058.5 ± 8.7	1054.1	1082.9	3.55 ± 0.04
`pr_revm_Push`	302.6 ± 3.0	300.4	310.7	1.02 ± 0.01
`pr_levm_Push`	1045.4 ± 3.2	1040.7	1052.4	3.51 ± 0.02

github-actions · 2025-07-10T18:13:16Z

No significant difference was registered for any benchmark run.

Detailed Results

Benchmark Results: BubbleSort

Command	Mean [s]	Min [s]	Max [s]	Relative
`main_revm_BubbleSort`	3.180 ± 0.019	3.155	3.207	1.00
`main_levm_BubbleSort`	4.552 ± 0.145	4.401	4.868	1.43 ± 0.05
`pr_revm_BubbleSort`	3.242 ± 0.018	3.220	3.277	1.02 ± 0.01
`pr_levm_BubbleSort`	4.454 ± 0.027	4.432	4.509	1.40 ± 0.01

Benchmark Results: ERC20Approval

Command	Mean [s]	Min [s]	Max [s]	Relative
`main_revm_ERC20Approval`	1.020 ± 0.004	1.016	1.030	1.00
`main_levm_ERC20Approval`	1.532 ± 0.010	1.519	1.547	1.50 ± 0.01
`pr_revm_ERC20Approval`	1.058 ± 0.010	1.049	1.076	1.04 ± 0.01
`pr_levm_ERC20Approval`	1.541 ± 0.020	1.523	1.585	1.51 ± 0.02

Benchmark Results: ERC20Mint

Command	Mean [ms]	Min [ms]	Max [ms]	Relative
`main_revm_ERC20Mint`	136.9 ± 1.3	135.8	140.3	1.00
`main_levm_ERC20Mint`	251.5 ± 7.6	246.1	267.7	1.84 ± 0.06
`pr_revm_ERC20Mint`	140.4 ± 0.7	139.5	142.0	1.03 ± 0.01
`pr_levm_ERC20Mint`	249.6 ± 2.6	246.9	254.9	1.82 ± 0.03

Benchmark Results: ERC20Transfer

Command	Mean [ms]	Min [ms]	Max [ms]	Relative
`main_revm_ERC20Transfer`	240.1 ± 1.3	238.7	242.5	1.00
`main_levm_ERC20Transfer`	394.6 ± 2.0	392.0	397.7	1.64 ± 0.01
`pr_revm_ERC20Transfer`	246.9 ± 2.2	245.1	252.3	1.03 ± 0.01
`pr_levm_ERC20Transfer`	395.6 ± 4.4	391.5	403.6	1.65 ± 0.02

Benchmark Results: Factorial

Command	Mean [ms]	Min [ms]	Max [ms]	Relative
`main_revm_Factorial`	230.8 ± 1.2	229.9	234.0	1.00
`main_levm_Factorial`	489.2 ± 23.4	476.0	534.5	2.12 ± 0.10
`pr_revm_Factorial`	233.1 ± 2.5	230.6	238.8	1.01 ± 0.01
`pr_levm_Factorial`	478.4 ± 3.9	475.0	488.8	2.07 ± 0.02

Benchmark Results: FactorialRecursive

Command	Mean [s]	Min [s]	Max [s]	Relative
`main_revm_FactorialRecursive`	1.620 ± 0.020	1.595	1.649	1.01 ± 0.02
`main_levm_FactorialRecursive`	2.862 ± 0.064	2.775	2.942	1.78 ± 0.05
`pr_revm_FactorialRecursive`	1.604 ± 0.025	1.562	1.646	1.00
`pr_levm_FactorialRecursive`	2.781 ± 0.031	2.724	2.826	1.73 ± 0.03

Benchmark Results: Fibonacci

Command	Mean [ms]	Min [ms]	Max [ms]	Relative
`main_revm_Fibonacci`	205.5 ± 0.7	204.7	206.8	1.00
`main_levm_Fibonacci`	469.6 ± 6.5	463.6	481.4	2.29 ± 0.03
`pr_revm_Fibonacci`	207.4 ± 1.1	205.7	209.0	1.01 ± 0.01
`pr_levm_Fibonacci`	468.1 ± 6.2	462.3	483.9	2.28 ± 0.03

Benchmark Results: FibonacciRecursive

Command	Mean [ms]	Min [ms]	Max [ms]	Relative
`main_revm_FibonacciRecursive`	860.3 ± 14.1	830.9	880.6	1.00
`main_levm_FibonacciRecursive`	1468.7 ± 25.9	1448.2	1539.3	1.71 ± 0.04
`pr_revm_FibonacciRecursive`	869.8 ± 13.8	848.5	888.2	1.01 ± 0.02
`pr_levm_FibonacciRecursive`	1451.8 ± 9.2	1443.1	1468.8	1.69 ± 0.03

Benchmark Results: ManyHashes

Command	Mean [ms]	Min [ms]	Max [ms]	Relative
`main_revm_ManyHashes`	8.6 ± 0.1	8.6	8.7	1.00
`main_levm_ManyHashes`	13.7 ± 0.1	13.5	13.8	1.59 ± 0.02
`pr_revm_ManyHashes`	8.8 ± 0.0	8.8	8.8	1.02 ± 0.01
`pr_levm_ManyHashes`	13.4 ± 0.1	13.3	13.5	1.55 ± 0.01

Benchmark Results: MstoreBench

Command	Mean [ms]	Min [ms]	Max [ms]	Relative
`main_revm_MstoreBench`	274.3 ± 4.1	268.7	280.8	1.00
`main_levm_MstoreBench`	939.1 ± 5.7	931.5	950.4	3.42 ± 0.06
`pr_revm_MstoreBench`	282.9 ± 3.5	276.7	287.2	1.03 ± 0.02
`pr_levm_MstoreBench`	941.4 ± 4.8	936.3	951.0	3.43 ± 0.05

Benchmark Results: Push

Command	Mean [ms]	Min [ms]	Max [ms]	Relative
`main_revm_Push`	300.5 ± 2.4	297.7	303.6	1.01 ± 0.01
`main_levm_Push`	1053.6 ± 7.4	1047.6	1073.5	3.54 ± 0.03
`pr_revm_Push`	297.5 ± 1.2	296.0	300.4	1.00
`pr_levm_Push`	1061.8 ± 6.9	1049.2	1071.4	3.57 ± 0.03

github-actions · 2025-07-10T18:38:59Z

Benchmark for `ea7c34b`


Test	Base	PR	%
block payload building bench	0.2±0.00ns	0.2±0.00ns	0.00%

github-actions · 2025-07-10T18:45:44Z

No significant difference was registered for any benchmark run.

Detailed Results

Benchmark Results: BubbleSort

Command	Mean [s]	Min [s]	Max [s]	Relative
`main_revm_BubbleSort`	3.192 ± 0.023	3.166	3.242	1.00
`main_levm_BubbleSort`	4.427 ± 0.050	4.396	4.567	1.39 ± 0.02
`pr_revm_BubbleSort`	3.233 ± 0.017	3.194	3.262	1.01 ± 0.01
`pr_levm_BubbleSort`	4.502 ± 0.169	4.422	4.980	1.41 ± 0.05

Benchmark Results: ERC20Approval

Command	Mean [s]	Min [s]	Max [s]	Relative
`main_revm_ERC20Approval`	1.035 ± 0.008	1.029	1.056	1.00
`main_levm_ERC20Approval`	1.528 ± 0.038	1.506	1.630	1.48 ± 0.04
`pr_revm_ERC20Approval`	1.048 ± 0.009	1.038	1.069	1.01 ± 0.01
`pr_levm_ERC20Approval`	1.524 ± 0.016	1.511	1.556	1.47 ± 0.02

Benchmark Results: ERC20Mint

Command	Mean [ms]	Min [ms]	Max [ms]	Relative
`main_revm_ERC20Mint`	137.7 ± 1.1	136.9	140.6	1.00
`main_levm_ERC20Mint`	249.4 ± 5.8	243.9	263.1	1.81 ± 0.04
`pr_revm_ERC20Mint`	138.4 ± 0.5	137.8	139.3	1.00 ± 0.01
`pr_levm_ERC20Mint`	247.8 ± 0.6	247.0	248.4	1.80 ± 0.02

Benchmark Results: ERC20Transfer

Command	Mean [ms]	Min [ms]	Max [ms]	Relative
`main_revm_ERC20Transfer`	243.7 ± 2.6	242.2	250.8	1.00
`main_levm_ERC20Transfer`	391.8 ± 3.7	389.0	400.5	1.61 ± 0.02
`pr_revm_ERC20Transfer`	244.0 ± 3.9	241.1	254.7	1.00 ± 0.02
`pr_levm_ERC20Transfer`	392.1 ± 3.1	387.6	398.2	1.61 ± 0.02

Benchmark Results: Factorial

Command	Mean [ms]	Min [ms]	Max [ms]	Relative
`main_revm_Factorial`	235.2 ± 5.2	232.3	249.5	1.00 ± 0.02
`main_levm_Factorial`	497.1 ± 27.4	474.1	529.4	2.12 ± 0.12
`pr_revm_Factorial`	234.6 ± 1.4	233.4	236.8	1.00
`pr_levm_Factorial`	478.4 ± 7.6	475.1	499.9	2.04 ± 0.03

Benchmark Results: FactorialRecursive

Command	Mean [s]	Min [s]	Max [s]	Relative
`main_revm_FactorialRecursive`	1.601 ± 0.025	1.568	1.657	1.00 ± 0.03
`main_levm_FactorialRecursive`	2.773 ± 0.060	2.714	2.889	1.73 ± 0.05
`pr_revm_FactorialRecursive`	1.601 ± 0.033	1.542	1.651	1.00
`pr_levm_FactorialRecursive`	2.753 ± 0.018	2.731	2.786	1.72 ± 0.04

Benchmark Results: Fibonacci

Command	Mean [ms]	Min [ms]	Max [ms]	Relative
`main_revm_Fibonacci`	207.1 ± 1.3	206.1	210.8	1.00
`main_levm_Fibonacci`	474.3 ± 36.5	461.3	578.1	2.29 ± 0.18
`pr_revm_Fibonacci`	216.0 ± 2.8	214.0	223.2	1.04 ± 0.01
`pr_levm_Fibonacci`	468.4 ± 6.9	463.5	481.4	2.26 ± 0.04

Benchmark Results: FibonacciRecursive

Command	Mean [ms]	Min [ms]	Max [ms]	Relative
`main_revm_FibonacciRecursive`	866.2 ± 5.7	858.7	874.5	1.01 ± 0.01
`main_levm_FibonacciRecursive`	1465.6 ± 36.5	1433.8	1534.2	1.70 ± 0.05
`pr_revm_FibonacciRecursive`	861.0 ± 10.4	844.8	875.2	1.00
`pr_levm_FibonacciRecursive`	1450.4 ± 7.2	1439.4	1465.9	1.68 ± 0.02

Benchmark Results: ManyHashes

Command	Mean [ms]	Min [ms]	Max [ms]	Relative
`main_revm_ManyHashes`	8.7 ± 0.1	8.6	8.8	1.00 ± 0.01
`main_levm_ManyHashes`	13.3 ± 0.1	13.2	13.5	1.53 ± 0.01
`pr_revm_ManyHashes`	8.7 ± 0.0	8.7	8.8	1.00
`pr_levm_ManyHashes`	13.4 ± 0.1	13.3	13.8	1.54 ± 0.02

Benchmark Results: MstoreBench

Command	Mean [ms]	Min [ms]	Max [ms]	Relative
`main_revm_MstoreBench`	311.8 ± 132.1	266.7	687.6	1.16 ± 0.49
`main_levm_MstoreBench`	939.0 ± 3.6	933.8	947.3	3.50 ± 0.02
`pr_revm_MstoreBench`	268.1 ± 1.2	266.9	270.2	1.00
`pr_levm_MstoreBench`	943.2 ± 6.5	933.3	955.0	3.52 ± 0.03

Benchmark Results: Push

Command	Mean [ms]	Min [ms]	Max [ms]	Relative
`main_revm_Push`	294.4 ± 0.9	292.6	296.0	1.00 ± 0.00
`main_levm_Push`	1072.8 ± 77.8	1044.1	1293.8	3.65 ± 0.26
`pr_revm_Push`	293.9 ± 0.8	292.3	294.8	1.00
`pr_levm_Push`	1053.8 ± 4.3	1048.2	1062.2	3.59 ± 0.02

github-actions · 2025-07-10T19:09:14Z

Benchmark for `0544811`


Test	Base	PR	%
block payload building bench	0.2±0.00ns	0.2±0.00ns	0.00%

github-actions · 2025-07-10T19:51:20Z

Benchmark Block Execution Results Comparison Against Main

Command	Mean [s]	Min [s]	Max [s]	Relative
`base`	212.212 ± 0.969	210.866	214.028	1.00 ± 0.01
`head`	212.043 ± 0.993	210.946	213.913	1.00

github-actions · 2025-07-16T14:33:24Z

Benchmark Results Comparison

No significant difference was registered for any benchmark run.

Detailed Results

Benchmark Results: BubbleSort

Command	Mean [s]	Min [s]	Max [s]	Relative
`main_revm_BubbleSort`	3.257 ± 0.016	3.230	3.285	1.01 ± 0.01
`main_levm_BubbleSort`	4.413 ± 0.024	4.387	4.471	1.37 ± 0.01
`pr_revm_BubbleSort`	3.230 ± 0.012	3.220	3.254	1.00
`pr_levm_BubbleSort`	4.418 ± 0.022	4.393	4.475	1.37 ± 0.01

Benchmark Results: ERC20Approval

Command	Mean [s]	Min [s]	Max [s]	Relative
`main_revm_ERC20Approval`	1.046 ± 0.006	1.036	1.056	1.01 ± 0.01
`main_levm_ERC20Approval`	1.514 ± 0.018	1.499	1.552	1.47 ± 0.02
`pr_revm_ERC20Approval`	1.033 ± 0.003	1.028	1.038	1.00
`pr_levm_ERC20Approval`	1.520 ± 0.010	1.506	1.538	1.47 ± 0.01

Benchmark Results: ERC20Mint

Command	Mean [ms]	Min [ms]	Max [ms]	Relative
`main_revm_ERC20Mint`	139.6 ± 1.4	138.3	143.2	1.01 ± 0.01
`main_levm_ERC20Mint`	244.2 ± 2.8	240.5	247.4	1.77 ± 0.02
`pr_revm_ERC20Mint`	138.2 ± 0.8	137.0	139.2	1.00
`pr_levm_ERC20Mint`	248.7 ± 3.1	245.4	256.2	1.80 ± 0.02

Benchmark Results: ERC20Transfer

Command	Mean [ms]	Min [ms]	Max [ms]	Relative
`main_revm_ERC20Transfer`	245.4 ± 0.8	244.1	246.8	1.00
`main_levm_ERC20Transfer`	389.1 ± 2.7	386.1	393.4	1.59 ± 0.01
`pr_revm_ERC20Transfer`	245.8 ± 3.5	242.5	254.5	1.00 ± 0.01
`pr_levm_ERC20Transfer`	392.2 ± 2.6	389.1	395.4	1.60 ± 0.01

Benchmark Results: Factorial

Command	Mean [ms]	Min [ms]	Max [ms]	Relative
`main_revm_Factorial`	238.6 ± 1.1	237.2	241.1	1.00
`main_levm_Factorial`	458.8 ± 1.1	457.0	460.5	1.92 ± 0.01
`pr_revm_Factorial`	240.9 ± 2.4	238.7	247.0	1.01 ± 0.01
`pr_levm_Factorial`	461.8 ± 2.9	459.5	469.4	1.94 ± 0.01

Benchmark Results: FactorialRecursive

Command	Mean [s]	Min [s]	Max [s]	Relative
`main_revm_FactorialRecursive`	1.572 ± 0.093	1.316	1.642	1.00
`main_levm_FactorialRecursive`	2.677 ± 0.021	2.646	2.708	1.70 ± 0.10
`pr_revm_FactorialRecursive`	1.604 ± 0.026	1.552	1.644	1.02 ± 0.06
`pr_levm_FactorialRecursive`	2.764 ± 0.020	2.738	2.805	1.76 ± 0.10

Benchmark Results: Fibonacci

Command	Mean [ms]	Min [ms]	Max [ms]	Relative
`main_revm_Fibonacci`	212.2 ± 0.5	211.5	213.0	1.00
`main_levm_Fibonacci`	446.6 ± 1.4	444.7	449.4	2.10 ± 0.01
`pr_revm_Fibonacci`	212.7 ± 0.7	211.5	214.0	1.00 ± 0.00
`pr_levm_Fibonacci`	450.7 ± 9.3	446.1	477.0	2.12 ± 0.04

Benchmark Results: FibonacciRecursive

Command	Mean [ms]	Min [ms]	Max [ms]	Relative
`main_revm_FibonacciRecursive`	853.7 ± 12.4	837.2	875.0	1.00
`main_levm_FibonacciRecursive`	1404.7 ± 10.5	1390.2	1422.3	1.65 ± 0.03
`pr_revm_FibonacciRecursive`	857.6 ± 22.0	832.4	911.9	1.00 ± 0.03
`pr_levm_FibonacciRecursive`	1461.6 ± 8.1	1451.6	1477.6	1.71 ± 0.03

Benchmark Results: ManyHashes

Command	Mean [ms]	Min [ms]	Max [ms]	Relative
`main_revm_ManyHashes`	8.8 ± 0.1	8.7	9.0	1.01 ± 0.01
`main_levm_ManyHashes`	13.1 ± 0.1	13.0	13.4	1.49 ± 0.01
`pr_revm_ManyHashes`	8.7 ± 0.0	8.7	8.8	1.00
`pr_levm_ManyHashes`	14.0 ± 0.8	13.6	16.2	1.60 ± 0.09

Benchmark Results: MstoreBench

Command	Mean [ms]	Min [ms]	Max [ms]	Relative
`main_revm_MstoreBench`	276.5 ± 2.1	274.5	280.2	1.00
`main_levm_MstoreBench`	936.2 ± 2.3	932.3	939.9	3.39 ± 0.03
`pr_revm_MstoreBench`	276.7 ± 1.1	275.6	278.9	1.00 ± 0.01
`pr_levm_MstoreBench`	943.1 ± 6.4	936.3	958.9	3.41 ± 0.03

Benchmark Results: Push

Command	Mean [ms]	Min [ms]	Max [ms]	Relative
`main_revm_Push`	293.8 ± 1.4	292.2	296.7	1.00
`main_levm_Push`	1052.0 ± 3.8	1046.8	1057.9	3.58 ± 0.02
`pr_revm_Push`	294.8 ± 2.3	292.9	300.7	1.00 ± 0.01
`pr_levm_Push`	1060.6 ± 5.9	1056.3	1076.1	3.61 ± 0.03

Benchmark Results: SstoreBench_no_opt

Command	Mean [ms]	Min [ms]	Max [ms]	Relative
`main_revm_SstoreBench_no_opt`	158.8 ± 0.7	157.6	159.7	1.00
`main_levm_SstoreBench_no_opt`	170.5 ± 2.1	168.8	174.0	1.07 ± 0.01
`pr_revm_SstoreBench_no_opt`	161.2 ± 5.2	157.5	170.8	1.02 ± 0.03
`pr_levm_SstoreBench_no_opt`	171.1 ± 1.9	168.5	173.9	1.08 ± 0.01

Arkenan

LGTM

crates/common/crypto/blake2f/avx.rs

edg-l · 2025-07-17T05:34:55Z

Looks good from my side. Additionally, I added some more tests and ran this code under Miri to check whether it would find anything, and it ran OK.

RUSTFLAGS="-Zrandomize-layout -C target-cpu=x86-64-v4 -C target-feature=+avx2" cargo miri test


iovoid added 5 commits July 8, 2025 16:44

implement blake2f using avx2

8c416e2

fix initializer

a65119c

add safety annotations and tests

2ec1fed

clippy

6abaee4

remove unneeded unsafes

a4c8156

github-actions bot assigned iovoid Jul 10, 2025

github-actions bot added the performance label Jul 10, 2025

github-project-automation bot added this to ethrex_performance Jul 10, 2025

github-project-automation bot moved this to Todo in ethrex_performance Jul 10, 2025

Merge branch 'main' into perf/blake2-avx256

6640b6d

iovoid added 2 commits July 10, 2025 15:24

update changelog

ad05dc3

update cargo.lock

58486db

iovoid changed the title ~~perf(levm): AVX256 implementation of blake2~~ perf(levm): add AVX256 implementation of BLAKE2 Jul 10, 2025

iovoid added the levm Lambda EVM implementation label Jul 10, 2025

github-project-automation bot added this to ethrex_l1 Jul 10, 2025

iovoid marked this pull request as ready for review July 10, 2025 18:45

iovoid requested a review from a team as a code owner July 10, 2025 18:45

iovoid moved this to In Review in ethrex_l1 Jul 10, 2025

iovoid moved this from Todo to In review in ethrex_performance Jul 10, 2025

iovoid mentioned this pull request Jul 11, 2025

perf(levm): use inline(always) for blake2 #3518

Closed

Merge branch 'main' into perf/blake2-avx256

f826aab

Merge branch 'main' into perf/blake2-avx256

17e2f63

Arkenan approved these changes Jul 16, 2025


Review thread on crates/common/crypto/blake2f/avx.rs (outdated, resolved)

improve comment

f73496b

edg-l approved these changes Jul 17, 2025


iovoid added 2 commits July 17, 2025 09:07

Merge branch 'main' into perf/blake2-avx256

8b51be6

Merge branch 'main' into perf/blake2-avx256

addb5e2

iovoid added this pull request to the merge queue Jul 17, 2025

Merged via the queue into main with commit 165b94c Jul 17, 2025
49 checks passed

iovoid deleted the perf/blake2-avx256 branch July 17, 2025 14:49

github-project-automation bot moved this from In review to Done in ethrex_performance Jul 17, 2025

github-project-automation bot moved this from In Review to Done in ethrex_l1 Jul 17, 2025
