Pure Zig implementation of Generalized XMSS signatures with wire-compatible behavior against the Rust reference implementation (leanSig). Keys, signatures, and Merkle paths interchange freely between the two ecosystems for lifetimes 2^8, 2^18, and 2^32 using SSZ encoding.
✅ Cross-Language Compatibility: All cross-language compatibility tests pass for lifetimes 2^8, 2^18, and 2^32 in both directions (Rust↔Zig) using SSZ format (ethereum_ssz on the Rust side, ssz.zig on the Zig side).
- Protocol fidelity – Poseidon2 hashing, ShakePRF domain separation, target sum encoding, and Merkle construction match the Rust reference bit-for-bit.
- Multiple lifetimes – 2^8, 2^18, or 2^32 signatures per key with configurable activation windows (defaults to 256 epochs).
- Interop-first CI & tooling – .github/workflows/ci.yml runs benchmark/benchmark.py, covering same-language and cross-language checks for lifetimes 2^8 and 2^32 using SSZ encoding. Locally, you can test all lifetimes (2^8, 2^18, 2^32) and enable verbose logs only when needed with BENCHMARK_DEBUG_LOGS=1.
- Performance optimizations – Parallel tree generation, SIMD optimizations, and AVX-512 support for improved key generation performance (~7.1s for 2^32 with 1024 active epochs).
- Pure Zig – minimal dependencies, explicit memory management, ReleaseFast-ready.
- Installation
- Quick Start
- Performance Benchmarks
- AVX-512 Optimization
- Optimisations Implemented
- Development
- Cross-Platform Tests
- Debug Logging
- License
build.zig.zon:
.{
    .name = "my_project",
    .version = "0.1.0",
    .dependencies = .{
        .@"hash-zig" = .{
            .url = "https://github.com/blockblaz/hash-zig/archive/refs/tags/v1.1.0.tar.gz",
            .hash = "1220...", // generated by zig build
        },
        .@"zig-poseidon" = .{
            .url = "https://github.com/blockblaz/zig-poseidon/archive/refs/heads/main.tar.gz",
            .hash = "1220...", // generated by zig build
        },
    },
}

build.zig:
const hash_zig_dep = b.dependency("hash-zig", .{ .target = target, .optimize = optimize });
// The name passed to b.dependency must match the key declared in build.zig.zon ("zig-poseidon").
const zig_poseidon_dep = b.dependency("zig-poseidon", .{ .target = target, .optimize = optimize });
exe.root_module.addImport("hash-zig", hash_zig_dep.module("hash-zig"));
exe.root_module.addImport("poseidon", zig_poseidon_dep.module("poseidon"));

git clone https://github.com/blockblaz/hash-zig.git
cd hash-zig
zig build test

const std = @import("std");
const hash_zig = @import("hash-zig");

pub fn main() !void {
    var gpa = std.heap.GeneralPurposeAllocator(.{}){};
    defer _ = gpa.deinit();
    const allocator = gpa.allocator();

    // Initialize scheme with desired lifetime
    var scheme = try hash_zig.GeneralizedXMSSSignatureScheme.init(allocator, .lifetime_2_8);
    defer scheme.deinit();

    // Generate keypair (start_epoch=0, num_active_epochs=256)
    const keypair = try scheme.keyGen(0, 256);
    defer keypair.secret_key.deinit();

    // Sign a message at epoch 0
    const message = [_]u8{0x42} ** 32;
    const signature = try scheme.sign(keypair.secret_key, 0, message);
    defer signature.deinit();

    // Verify the signature
    const ok = try scheme.verify(&keypair.public_key, 0, message, signature);
    std.debug.print("Signature valid: {}\n", .{ok});
}

For reproducible key generation (useful for testing and cross-language compatibility):
// Generate a 32-byte seed
const seed = [_]u8{0x42} ** 32;

// Initialize scheme with seed
var scheme = try hash_zig.GeneralizedXMSSSignatureScheme.initWithSeed(
    allocator,
    .lifetime_2_8,
    seed,
);
defer scheme.deinit();

// Generate keypair (deterministic based on seed)
const keypair = try scheme.keyGen(0, 256);

- .lifetime_2_8 - 256 signatures (fast, suitable for testing)
- .lifetime_2_18 - 262,144 signatures (production use)
- .lifetime_2_32 - 4,294,967,296 signatures (experimental, very slow keygen)
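
The signature counts for the three lifetimes are simply powers of two, and an activation window is a range of epochs inside that lifetime. The sketch below works through that arithmetic using only the standard library. `signaturesPerKey` and `windowFits` are hypothetical helpers written for this illustration, not part of the hash-zig API, and the library may enforce additional constraints on activation windows.

```zig
const std = @import("std");

/// Number of signing epochs for a key whose lifetime is 2^log_lifetime.
fn signaturesPerKey(log_lifetime: u6) u64 {
    return @as(u64, 1) << log_lifetime;
}

/// Hypothetical check: true when [start_epoch, start_epoch + num_active_epochs)
/// lies entirely inside the lifetime. Written to avoid integer overflow.
fn windowFits(log_lifetime: u6, start_epoch: u64, num_active_epochs: u64) bool {
    const total = signaturesPerKey(log_lifetime);
    return start_epoch <= total and num_active_epochs <= total - start_epoch;
}

pub fn main() void {
    for ([_]u6{ 8, 18, 32 }) |log_lifetime| {
        std.debug.print("2^{d} = {d} signatures\n", .{ log_lifetime, signaturesPerKey(log_lifetime) });
    }
    // The Quick Start window (start_epoch=0, num_active_epochs=256) exactly fills 2^8.
    std.debug.print("window (0, 256) fits in 2^8: {}\n", .{windowFits(8, 0, 256)});
}
```

Under this check, a window of 256 epochs starting at epoch 1 would not fit in a 2^8 lifetime, since it would need epoch 256.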
Performance Note: For lifetimes 2^18 and 2^32, always compile with zig build -Doptimize=ReleaseFast for acceptable performance. Debug builds will be extremely slow for larger lifetimes.
Performance measurements are taken using ReleaseFast builds with debug logging disabled. For lifetime 2^32 with 1024 active epochs, key generation time is measured using the dedicated profiler:
zig build profile-keygen -Doptimize=ReleaseFast -Denable-profile-keygen=true -Ddebug-logs=false
Key Generation:
- Time: ~1.2 seconds
Signing Performance:
- Average: ~5-10 ms per signature
- First signature (epoch 0): ~280 ms (includes tree building)
- Subsequent signatures: ~4-6 ms
Verification Performance:
- Average: ~1-2 ms per verification
Key Generation:
- Time: ~10-20 seconds
Signing Performance:
- Average: ~10-20 ms per signature
- First signature: ~50-100 ms
- Subsequent signatures: ~10-15 ms
Verification Performance:
- Average: ~2-3 ms per verification
Key Generation:
- Time: ~916 seconds (~15.3 minutes)
Signing Performance:
- Average: ~30-50 ms per signature
- First signature: ~100-200 ms
- Subsequent signatures: ~30-40 ms
Verification Performance:
- Average: ~4-5 ms per verification
Key Generation:
- Time: ~7.1-7.4 seconds (measured with profile-keygen, 1024 active epochs, ReleaseFast, 4-wide SIMD)
- With AVX-512 (8-wide SIMD): ~3.5-4.0 seconds (expected ~2x speedup)
- Previous baseline (sequential, no optimizations): ~96.6 seconds
- Improvement vs. baseline: ~92.6% less wall-clock time (~13.6x speedup measured with 4-wide, ~27x expected with 8-wide)
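
For readers who want to check how the speedup factor and the percentage relate, here is a small worked calculation from the figures quoted above (96.6 s baseline, 7.1 s with 4-wide SIMD, 3.5 s expected with 8-wide). The helper names are illustrative only.

```zig
const std = @import("std");

/// Speedup factor: how many times faster the optimized run is.
fn speedup(baseline_s: f64, optimized_s: f64) f64 {
    return baseline_s / optimized_s;
}

/// Percentage of wall-clock time saved relative to the baseline.
fn timeSavedPercent(baseline_s: f64, optimized_s: f64) f64 {
    return (1.0 - optimized_s / baseline_s) * 100.0;
}

pub fn main() void {
    const baseline = 96.6; // sequential baseline, seconds
    const four_wide = 7.1; // measured lower bound, 4-wide SIMD
    const eight_wide = 3.5; // expected lower bound, 8-wide SIMD (not measured)

    std.debug.print("4-wide: {d:.1}x speedup, {d:.1}% less time\n", .{
        speedup(baseline, four_wide),
        timeSavedPercent(baseline, four_wide),
    });
    std.debug.print("8-wide: {d:.1}x speedup, {d:.1}% less time\n", .{
        speedup(baseline, eight_wide),
        timeSavedPercent(baseline, eight_wide),
    });
}
```

Note that a 13.6x speedup corresponds to roughly 93% less time, not "1360% faster"; the two figures describe the same measurement in different units.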
Performance Optimizations:
- Parallel bottom tree generation utilizes all available CPU cores
- Full SIMD Poseidon2 implementation with 4-wide (SSE4.1/NEON) and 8-wide (AVX-512) support
- Memory-aligned buffers for optimal cache performance
- Bottom tree caching for repeated key generation
- Maintains 100% Rust compatibility (same trees, same root hash)
Note: Key generation time scales roughly linearly with the number of active epochs. The optimizations significantly improve performance for larger active epoch windows.
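
To illustrate why independent bottom trees parallelize well, the following is a minimal sketch of splitting tree indices into contiguous chunks, one per thread, with `std.Thread`. It is not the library's implementation: `buildBottomTree` is a toy stand-in (a simple integer mix, not a hash), and `worker` and `buildAllRoots` are hypothetical names chosen for this example.

```zig
const std = @import("std");

/// Toy stand-in for building one bottom tree and returning its root.
/// This is NOT a hash; the real work is a Poseidon2 Merkle tree.
fn buildBottomTree(tree_index: usize) u64 {
    return (@as(u64, tree_index) +% 1) *% 0x9E3779B97F4A7C15;
}

/// Fills one contiguous slice of roots; slices never overlap, so no locking is needed.
fn worker(roots: []u64, first_index: usize) void {
    for (roots, 0..) |*root, offset| {
        root.* = buildBottomTree(first_index + offset);
    }
}

/// Splits the index range into near-equal chunks and runs one thread per chunk.
fn buildAllRoots(allocator: std.mem.Allocator, roots: []u64) !void {
    if (roots.len == 0) return;
    const cpu_count = std.Thread.getCpuCount() catch 1;
    const num_threads = @max(1, @min(cpu_count, roots.len));
    const chunk = (roots.len + num_threads - 1) / num_threads; // ceiling division

    const threads = try allocator.alloc(std.Thread, num_threads);
    defer allocator.free(threads);

    var spawned: usize = 0;
    // Join every thread that was started, even if a later spawn fails.
    defer {
        for (threads[0..spawned]) |t| t.join();
    }

    var start: usize = 0;
    while (start < roots.len) : (start += chunk) {
        const end = @min(start + chunk, roots.len);
        threads[spawned] = try std.Thread.spawn(.{}, worker, .{ roots[start..end], start });
        spawned += 1;
    }
}

pub fn main() !void {
    var gpa = std.heap.GeneralPurposeAllocator(.{}){};
    defer _ = gpa.deinit();

    var roots: [32]u64 = undefined;
    try buildAllRoots(gpa.allocator(), &roots);
    std.debug.print("root[0] = {x}, root[31] = {x}\n", .{ roots[0], roots[31] });
}
```

Because each thread writes to a disjoint slice and the per-tree function is deterministic, the result is identical to a sequential run, which is the property that keeps the output compatible regardless of thread count.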
# Lifetime 2^8 and 2^18 (default)
zig build test-lifetimes
# Lifetime 2^32 (requires flag)
zig build test-lifetimes -Denable-lifetime-2-32=true
# Test parallel tree generation (2^32 with 1024 active epochs)
zig build benchmark-parallel -Doptimize=ReleaseFast

The build script automatically detects AVX-512 support based on the target CPU features. For x86-64 systems with AVX-512 support, you can build with 8-wide SIMD for approximately 2x performance improvement.
The build script will automatically detect and use 8-wide SIMD if:
- The target architecture is x86-64
- The target CPU has AVX-512F feature enabled (e.g., when using -mcpu=skylake-avx512)
# Build with auto-detection (will use 8-wide if AVX-512 is detected)
zig build install -Doptimize=ReleaseFast -Ddebug-logs=false
# Or explicitly specify CPU model with AVX-512 support
zig build install -Doptimize=ReleaseFast -Ddebug-logs=false --cpu skylake-avx512

You can also explicitly set the SIMD width:
# Force 8-wide SIMD (AVX-512)
zig build install -Doptimize=ReleaseFast -Dsimd-width=8 -Ddebug-logs=false
# Force 4-wide SIMD (SSE4.1/NEON)
zig build install -Doptimize=ReleaseFast -Dsimd-width=4 -Ddebug-logs=false

Requirements:
- x86-64 CPU with AVX-512F support
- Zig compiler (0.14.1+)
- Build with -Dsimd-width=8 flag or specify CPU model with AVX-512 support
Performance Impact:
- 4-wide SIMD (default): ~7.1-7.4s for 2^32 (1024 epochs)
- 8-wide SIMD (AVX-512): Expected ~3.5-4.0s for 2^32 (1024 epochs) - ~2x speedup
Note: On ARM/Apple Silicon, only 4-wide SIMD is available (8-wide not supported). The build will automatically use 4-wide in this case.
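
The width selection described above can be sketched in plain Zig as a compile-time decision on the target's CPU features, followed by lane-wise arithmetic on `@Vector`. This is an illustration under stated assumptions, not the project's build logic or its Poseidon2 code: `simd_width`, `Vec`, and `addMod` are hypothetical names, and the example only shows one field addition over the KoalaBear prime (2^31 - 2^24 + 1).

```zig
const std = @import("std");
const builtin = @import("builtin");

/// KoalaBear prime, p = 2^31 - 2^24 + 1.
const P: u32 = 2130706433;

/// Lane count chosen at compile time: 8 on x86-64 with AVX-512F, otherwise 4.
const simd_width: comptime_int = blk: {
    if (builtin.cpu.arch == .x86_64 and
        std.Target.x86.featureSetHas(builtin.cpu.features, .avx512f))
    {
        break :blk 8;
    }
    break :blk 4;
};

const Vec = @Vector(simd_width, u32);

/// Lane-wise (a + b) mod P, for inputs already reduced below P.
fn addMod(a: Vec, b: Vec) Vec {
    const p: Vec = @splat(P);
    const sum = a + b; // cannot overflow u32 because both inputs are < 2^31
    // Wrapping subtract so lanes with sum < P do not trap; @select discards them.
    return @select(u32, sum >= p, sum -% p, sum);
}

pub fn main() void {
    const a: Vec = @splat(P - 1);
    const b: Vec = @splat(5);
    const r = addMod(a, b);
    std.debug.print("simd_width = {d}, lane 0 = {d}\n", .{ simd_width, r[0] });
}
```

The same source compiles to 4 or 8 lanes depending on the target, so the arithmetic is written once and the result per lane does not depend on the width.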
This section provides a summary of optimizations implemented in the Zig implementation compared to the Rust reference implementation.
| Optimization | Rust | Zig | Status | Impact on 2^32 | Compatibility |
|---|---|---|---|---|---|
| 1. Parallel Bottom Tree Generation | ✅ into_par_iter() | ✅ std.Thread | MATCHED | High (46.5% improvement) | ✅ Safe |
| 2. SIMD Chain Computation | ✅ PackedF (PR #5) | ✅ Full SIMD Poseidon2 implemented and enabled | MATCHED | Very High (69% improvement achieved) | ✅ Safe |
| 3. Parallel Top Tree Building | ✅ par_chunks_exact(2) | ✅ processPairsInParallel | MATCHED | Medium | ✅ Safe |
| 4. Parallel Leaf Computation | ✅ par_chunks_exact(width) | ✅ std.Thread | MATCHED | High | ✅ Safe |
| 5. Bottom Tree Caching | ❌ No | ✅ HashMap | AHEAD | Medium (repeated runs) | ✅ Safe |
| 6. Batch Hash Operations | ✅ Via SIMD | ✅ Batch of 4 | PARTIAL | Low-Medium | ✅ Safe |
| 7. Signing Tree Reuse | ✅ Yes | ✅ Yes | MATCHED | N/A (signing) | ✅ Safe |
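
As a rough illustration of the HashMap-based bottom tree caching listed in the table, the sketch below memoizes an expensive per-tree computation by tree index. Everything here is an assumption made for the example: `TreeCache`, `getOrBuild`, and `expensiveBuild` are hypothetical names, the root is a placeholder `u64`, and a real cache key would also need to cover the seed and scheme parameters.

```zig
const std = @import("std");

/// Toy stand-in for an expensive bottom tree build (NOT a hash).
fn expensiveBuild(tree_index: u64) u64 {
    return (tree_index +% 1) *% 0x9E3779B97F4A7C15;
}

/// Hypothetical cache keyed by tree index only.
const TreeCache = struct {
    map: std.AutoHashMap(u64, u64),
    builds: usize = 0, // counts cache misses, for demonstration

    fn init(allocator: std.mem.Allocator) TreeCache {
        return .{ .map = std.AutoHashMap(u64, u64).init(allocator) };
    }

    fn deinit(self: *TreeCache) void {
        self.map.deinit();
    }

    /// Returns the cached root, building and storing it on first use.
    fn getOrBuild(self: *TreeCache, tree_index: u64) !u64 {
        const entry = try self.map.getOrPut(tree_index);
        if (!entry.found_existing) {
            entry.value_ptr.* = expensiveBuild(tree_index);
            self.builds += 1;
        }
        return entry.value_ptr.*;
    }
};

pub fn main() !void {
    var gpa = std.heap.GeneralPurposeAllocator(.{}){};
    defer _ = gpa.deinit();

    var cache = TreeCache.init(gpa.allocator());
    defer cache.deinit();

    const first = try cache.getOrBuild(3);
    const second = try cache.getOrBuild(3); // served from the cache
    std.debug.print("same root: {}, builds: {d}\n", .{ first == second, cache.builds });
}
```

This only helps when the same tree is requested more than once, which matches the "repeated runs" impact noted in the table.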
Current Performance (2^32, 1024 epochs):
- Rust: ~2.0-3.2s
- Zig (4-wide SIMD): ~7.1-7.4s (measured with profile-keygen, ReleaseFast)
- Zig (8-wide SIMD, AVX-512): ~3.5-4.0s (expected, ~2x speedup)
- Gap: ~2.2-3.6x slower with 4-wide, ~1.1-1.6x slower with 8-wide (down from ~18x)
- Note: Full SIMD Poseidon2 is implemented and enabled, plus bottom-tree caching, parallel tree generation, and memory alignment optimizations
Current Performance (2^32, 256 epochs) - ✅ VERIFIED:
- Rust: 2.000s
- Zig: 1.316s (Zig faster in this case)
- Gap: Zig is faster (thread-level parallelism working well)
- Status: All cross-language compatibility tests pass ✅
Performance Notes:
- With AVX-512 support, Zig performance approaches Rust performance (~1.1-1.6x gap vs ~2.2-3.6x with 4-wide SIMD)
- Further optimizations may close the remaining gap, particularly for systems without AVX-512 support
# Build library and helper binaries
zig build
zig build install -Doptimize=ReleaseFast # Release build
# Run tests
zig build test # unit + integration tests
zig build test-lifetimes # lifetime-specific tests (2^8, 2^18)
zig build test-lifetimes -Denable-lifetime-2-32=true # include 2^32
# Format and lint
zig build lint # format check
# Build Rust benchmark tools
cd benchmark/rust_benchmark
cargo build --release --bin cross_lang_rust_tool
cargo build --release --features debug-tools --bin remote_hashsig_tool # optional

The repository includes GitHub Actions workflows that automatically exercise cross-platform builds and cross-language compatibility on every push and pull request:
- Linux (ubuntu-latest):
  - Lint: zig build lint
  - Build and install library: zig build install -Doptimize=ReleaseFast -Ddebug-logs=false
  - Cross-language suite: python3 benchmark/benchmark.py --lifetime "2^8,2^32" (Rust ↔ Zig for both lifetimes)
- macOS + Windows (CI job cross-platform-build):
  - Runs zig build on macos-latest and windows-latest to verify that the library and examples compile on all three major platforms.
- Linux / macOS:
  cd hash-zig
  zig build lint
  zig build install -Doptimize=ReleaseFast -Ddebug-logs=false
  # Run SSZ cross-language compatibility tests
  python3 benchmark/benchmark.py --lifetime "2^8,2^18,2^32"
- Windows (PowerShell):
  cd hash-zig
  zig build

When contributing changes that may affect portability, ensure that zig build succeeds on your target platforms, and use the benchmark script on at least one platform to confirm cross-language compatibility.
src/ # Core library
core/ # Field arithmetic, parameters, security levels
hash/ # Hash functions (Poseidon2, SHA3, tweakable hash)
poseidon2_hash_simd.zig # SIMD-optimized Poseidon2 implementation
poseidon2/ # Poseidon2 field and permutation
prf/ # PRF implementations (ShakePRF, ChaCha12 RNG)
encoding/ # Incomparable encoding
wots/ # Winternitz OTS implementation
merkle/ # Merkle tree implementations
signature/ # Generalized XMSS signature scheme
native/ # Core scheme logic
scheme.zig # Main signature scheme implementation
simd_utils.zig # SIMD utilities and helpers
simd_cpu.zig # CPU feature detection
serialization.zig # Key/signature serialization
utils/ # Utilities (logging, memory pool)
root.zig # Public API exports
examples/ # Usage examples and demos
benchmark/ # Cross-language testing tools
benchmark.py # Main cross-language test script
rust_benchmark/ # Rust compatibility tools
zig_benchmark/ # Zig compatibility tools
scripts/ # Benchmark scripts for specific lifetimes
docs/ # Documentation (optimization analysis, etc.)
.github/ # CI workflows
Debug logging is controlled by a build-time flag -Ddebug-logs (defaults to false). When disabled, there is zero performance overhead.
# Default: debug logs disabled (optimal performance)
zig build test-lifetimes -Doptimize=ReleaseFast
# Enable debug logs for debugging
zig build test-lifetimes -Doptimize=ReleaseFast -Ddebug-logs=true

- ZIG_SIGN_DEBUG: Signing operation logs
- ZIG_VERIFY_DEBUG: Verification operation logs
- ZIG_HASH_CALL: Hash function call logs
- ZIG_BUILDTREE: Tree building logs
- "DEBUG:": General debug messages
- Disabled (default): Zero overhead, compiler optimizes away logging code
- Enabled: Full logging with performance impact from I/O operations
Best Practice: Always use default (disabled) for benchmarking. Enable only when debugging.
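
The zero-overhead claim follows from how Zig treats compile-time constants: a branch guarded by a constant `false` is not compiled into the binary. The sketch below shows the general pattern with a plain constant so that it stands alone. In a real build the value would typically come from a build option such as -Ddebug-logs; how hash-zig wires that option internally is not shown here, and `enable_debug_logs` and `debugLog` are hypothetical names.

```zig
const std = @import("std");

/// Assumption: stands in for a value supplied by the build system.
const enable_debug_logs = false;

/// Prints a prefixed message only when logging is compiled in.
/// With enable_debug_logs == false the branch is removed at compile time.
fn debugLog(comptime fmt: []const u8, args: anytype) void {
    if (enable_debug_logs) {
        std.debug.print("DEBUG: " ++ fmt ++ "\n", args);
    }
}

pub fn main() void {
    debugLog("building tree {d}", .{42}); // compiled out when the flag is false
    std.debug.print("debug logs enabled: {}\n", .{enable_debug_logs});
}
```

Flipping the constant to `true` and rebuilding is what re-enables the output; there is no runtime switch to check, which is why the disabled case costs nothing.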
Licensed under the Apache License 2.0 – see LICENSE.
- leanSig — original Rust implementation and reference tests
- zig-poseidon — Poseidon2 over the KoalaBear field
- Generalized XMSS (ePrint 2025/055) — scheme specification
- Rust ↔ Zig compatibility investigation