Skip to content
Closed
Show file tree
Hide file tree
Changes from all commits
Commits
File filter

Filter by extension

Filter by extension


Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
31 changes: 31 additions & 0 deletions .github/workflows/claude-security-review.yml
Original file line number Diff line number Diff line change
@@ -0,0 +1,31 @@
name: Claude Security Review

on:
pull_request:
types: [opened, synchronize, reopened]

permissions:
contents: read
pull-requests: write

jobs:
security-review:
# Only run after maintainer approval for external PRs
if: github.event.pull_request.draft == false
runs-on: ubuntu-latest
timeout-minutes: 30

steps:
- name: Checkout repository
uses: actions/checkout@v4
with:
fetch-depth: 0

- name: Run Claude Security Review
uses: anthropics/claude-code-security-review@main
with:
claude-api-key: ${{ secrets.CLAUDE_API_KEY }}
claude-model: claude-sonnet-4-6
claudecode-timeout: 25
run-every-commit: true
exclude-directories: "docs,tests,benches,.github"
35 changes: 35 additions & 0 deletions docs/plans/2026-02-23-greth-critical-fixes-design.md
Original file line number Diff line number Diff line change
@@ -0,0 +1,35 @@
# GRETH CRITICAL Fixes Design

Date: 2026-02-23

## GRETH-001: RPC Signer Recovery Address::ZERO Fallback

**Problem:** Gravity replaced upstream's `InvalidTransactionSignature` error with `unwrap_or(Address::ZERO)`. Any transaction with an invalid signature is silently attributed to the zero address. Block explorers, indexers, and wallets display wrong sender. Both `transaction.rs` and `receipt.rs` affected.

**Fix:** Restore original error path: `try_into_recovered_unchecked().map_err(|_| EthApiError::InvalidTransactionSignature)?`. For meta-transactions (system transactions with empty signatures), add explicit type-check that returns `SYSTEM_CALLER` instead of zero address.

**Files:** `crates/rpc/rpc-eth-api/src/helpers/transaction.rs`, `crates/rpc/rpc-eth-api/src/helpers/receipt.rs`

## GRETH-002: RocksDB Cursor Unsafe Lifetime Transmute

**Problem:** `DBRawIterator<'_>` transmuted to `'static` to store in struct alongside `Arc<DB>`. The SAFETY comment claims Arc<DB> outlives iterator, but Rust doesn't guarantee field drop order. If `db` drops before `iterator`, use-after-free occurs.

**Fix:** Restructure cursor to use `CursorInner` that holds `_db: Arc<DB>` and `iterator: DBRawIterator<'static>` together with explicit field ordering. The `iterator` field is listed before `_db` so Rust drops it first (fields drop in declaration order). Add documentation explaining the safety invariant.

**Files:** `crates/storage/db/src/implementation/rocksdb/cursor.rs`

## GRETH-003: Mint Precompile Authorized Address Mismatch

**Problem:** `AUTHORIZED_CALLER` is `0x595475934ed7d9faa7fca28341c2ce583904a44e` — an unknown EOA. The actual JWK Manager system address is `0x00000000000000000000000000000001625f4001` (defined in `onchain_config/mod.rs` as `JWK_MANAGER_ADDR`). Neither matches the comment's "0x2018". Anyone with the private key to the hardcoded address can mint unlimited G tokens.

**Fix:** Replace the hardcoded constant with `use super::onchain_config::JWK_MANAGER_ADDR; pub const AUTHORIZED_CALLER: Address = JWK_MANAGER_ADDR;`. Add compile-time assertion: `const _: () = assert!(AUTHORIZED_CALLER.0 == JWK_MANAGER_ADDR.0);` to prevent future divergence.

**Files:** `crates/pipe-exec-layer-ext-v2/execute/src/mint_precompile.rs`

## GRETH-004: Block Validation Gated on debug_assertions

**Problem:** `validate_block()` in `make_executed_block_canonical()` is wrapped in `#[cfg(debug_assertions)]`. In release builds (`--release`), the entire validation call is compiled out. Blocks are accepted as canonical without header hash verification, parent hash linkage, timestamp monotonicity, or gas limit checks.

**Fix:** Remove the `#[cfg(debug_assertions)]` attribute. Validation runs in all build configurations. Change `panic!` on failure to `error!` log + proper error propagation to avoid crashing the node on validation failure.

**Files:** `crates/engine/tree/src/tree/mod.rs`
83 changes: 83 additions & 0 deletions docs/plans/2026-02-23-greth-high-fixes-design.md
Original file line number Diff line number Diff line change
@@ -0,0 +1,83 @@
# GRETH HIGH Fixes Design

Date: 2026-02-23

## GRETH-005: Parallel Persistence Non-Atomic Multi-DB Commit

**Problem:** Persistence writes to three independent RocksDB instances (`state_db`, `account_trie_db`, `storage_trie_db`) in parallel. If one write fails, state is partially committed with no rollback mechanism.

**Fix (Mitigation):** Add explicit error logging on partial write failure. Ensure checkpoint advancement is atomic — only written after all three DB writes succeed. Design is idempotent: recovery re-applies all writes from the checkpoint. Full 2-phase commit deferred to future refactor.

**Files:** `crates/engine/tree/src/persistence.rs`

## GRETH-006: Path Traversal in RocksDB Shard Directory Config

**Problem:** `--db.sharding-directories` accepts arbitrary paths including `../../etc/cron.daily`. No normalization or bounds checking.

**Fix:** Validate shard paths at CLI parse time. Require absolute paths. Reject any path component that is `..` (ParentDir). Log validated paths at startup for operator visibility.

**Files:** `crates/node/core/src/args/database.rs`, `crates/storage/db/src/implementation/rocksdb/mod.rs`

## GRETH-007: Grevm State Root Unverified After Parallel Execution

**Problem:** After Grevm parallel execution, the computed post-execution state root is not compared against the block header's `state_root`. If Grevm has a parallel dependency detection bug, the node silently diverges.

**Fix:** Add `warn!` logging when state root is `None` (not computed). Improve assertion messages for mismatch cases so operators can detect divergence.

**Files:** `crates/ethereum/evm/src/parallel_execute.rs`

## GRETH-008: Crash Recovery Trusts Unverified State (Design Decision)

**Problem:** Recovery re-hashes and rebuilds trie from persisted state without verifying against canonical block header state root.

**Decision:** Documented as acceptable under BFT consensus model. AptosBFT guarantees that all blocks delivered to execution layer are final and valid. Adding state root verification on recovery would require full state root computation which is expensive. Added documentation explaining the trust model.

**Files:** `crates/engine/tree/src/recovery.rs`

## GRETH-009: Immediate Finalization Without Separate Signal (Design Decision)

**Problem:** Every block is immediately marked `safe` AND `finalized` with no separate finalization signal from consensus.

**Decision:** Documented as intentional. AptosBFT provides deterministic finality — a block delivered via `push_ordered_block()` has already achieved 2f+1 consensus votes. Separating canonical/finalized would add complexity without security benefit under this consensus model.

**Files:** `crates/engine/tree/src/tree/mod.rs`

## GRETH-010: Oracle Events Extracted from User Transaction Receipts

**Problem:** Oracle events extracted from ALL execution receipts including user transactions. If NativeOracle's `SYSTEM_CALLER` access control is bypassed, user contracts could inject false oracle data.

**Fix:** Changed event extraction to process only system transaction receipts (receipts at the end of the receipt list, corresponding to system transactions appended after user transactions).

**Files:** `crates/pipe-exec-layer-ext-v2/execute/src/lib.rs`

## GRETH-011: Relayer Trusts RPC Log Filtering Without Verification

**Problem:** Relayer uses `eth_getLogs` with address/topic filters but does not re-verify the returned logs locally. A compromised RPC endpoint could inject fake bridge events.

**Fix (Two-part):**
1. Added local re-verification of `topic[0]` (event signature) and `address` on every returned log.
2. Added receipt proof cross-verification: for each log, fetch `eth_getBlockReceipts` and verify the log exists in the receipts by matching address, topics, and data. Logs that fail verification are dropped (fail-closed).

**Files:** `crates/pipe-exec-layer-ext-v2/relayer/src/blockchain_source.rs`, `crates/pipe-exec-layer-ext-v2/relayer/src/eth_client.rs`

**Review Comments** reviewer: neko; state: rejected; comments: Currently verify_logs_against_receipts uses self.rpc_client.get_block_receipts() to cross-verify logs returned by eth_getLogs. However, both RPC calls go through the same rpc_client (i.e., the same RPC endpoint). If the RPC node is fully compromised, an attacker can forge both eth_getLogs and eth_getBlockReceipts responses to be consistent with each other, rendering the cross-verification ineffective.

## GRETH-012: Relayer Has No Reorg Detection

**Problem:** Relayer polls up to "finalized" block from RPC but does not verify the block hash remains stable across polls. A source chain reorg past finality could cause already-relayed events to be undone.

**Fix:** Cache block hash on each poll. On next poll, verify the cached block hash still matches. If mismatch detected, halt relayer and emit critical error alert.

**Files:** `crates/pipe-exec-layer-ext-v2/relayer/src/blockchain_source.rs`

**Review Comments** reviewer: neko; state: rejected; comments: Unnecessary. The eth_getLogs requests only query up to the finalized block number, so reorg detection is not needed.

## GRETH-013: Transaction Pool Discard Unbounded

**Problem:** `discard_txs` event bus can remove unlimited transactions in a single message. Any crate with event bus access can drain the entire mempool.

**Fix:** Added `MAX_DISCARD_PER_BATCH = 1000` limit. Batches exceeding the limit are truncated with `warn!` logging. Each discard operation logged at warn level with transaction count and sample hashes.

**Files:** `crates/transaction-pool/src/maintain.rs`

**Review Comments** reviewer: neko; state: rejected; comments: The proposed change alters semantics. The current behavior intentionally discards all specified transactions, and truncating to MAX_DISCARD_PER_BATCH would leave stale transactions in the pool. All discard_txs must be fully processed, not partially truncated.
64 changes: 64 additions & 0 deletions docs/plans/2026-02-23-greth-medium-fixes-design.md
Original file line number Diff line number Diff line change
@@ -0,0 +1,64 @@
# GRETH MEDIUM Fixes Design

Date: 2026-02-23

## GRETH-014: RocksDB Read-Your-Writes Limitation (Documentation)

**Problem:** `DbCursorRW::upsert()` appends writes to a `WriteBatch` buffer. `DbCursorRO::seek()` reads directly from the live RocksDB DB. A write-then-read sequence within the same transaction returns stale data, violating standard database semantics.

**Fix:** Documented the limitation clearly at the cursor implementation and all call sites. Added comments explaining that `WriteBatch` writes are only visible after `commit()`. Audited all call sites to ensure none rely on read-your-writes within uncommitted transactions.

**Files:** `crates/storage/db/src/implementation/rocksdb/cursor.rs`

**Review Comments** reviewer: neko; state: accepted; comments: Accepted.

## GRETH-015: BLS Precompile Over-Length Input Accepted

**Problem:** BLS PoP verification precompile checks `data.len() < EXPECTED_INPUT_LEN` (144 bytes) but allows arbitrarily long input. Extra trailing bytes silently ignored. Could serve as covert channel.

**Fix:** Changed to strict equality: `data.len() != EXPECTED_INPUT_LEN`. Returns `PrecompileError::Other` with descriptive message including expected and actual lengths.

**Files:** `crates/pipe-exec-layer-ext-v2/execute/src/bls_precompile.rs`

**Review Comments** reviewer: neko; state: accepted; comments: Accepted.

## GRETH-016: filter_invalid_txs Parallel Pre-Filter (Documentation)

**Problem:** `filter_invalid_txs` processes transactions per-sender in parallel using pre-block state. Cross-sender dependencies (balance transfers, shared contract state) are not visible during validation.

**Fix:** Added documentation comment clarifying this is a performance optimization filter, not a security boundary. The EVM execution definitively validates all transactions regardless of filter results. Ensured no downstream code treats the filter output as authoritative.

**Files:** `crates/pipe-exec-layer-ext-v2/execute/src/lib.rs`

**Review Comments** reviewer: neko; state: rejected; comments: The statement "each sender's account state is a local copy of the pre-block state" does not lead to invalid transactions slipping through. On the contrary, it may cause false positives — transactions that would actually be valid during EVM execution (due to cross-sender incoming transfers) are incorrectly marked as invalid by this pre-filter. The documentation comment should be corrected to: "Cross-sender dependencies (e.g. incoming transfers from other senders in the same block) are not visible during parallel per-sender validation because each sender's account state is a local copy of the pre-block state. This means the filter may produce false positives (marking valid transactions as invalid) but will not produce false negatives (letting truly invalid transactions through)."

## GRETH-017: Relayer State File No Integrity Protection

**Problem:** Relayer state (last nonce, cursor position) persisted as plain JSON with no checksum. A filesystem-level attacker can roll back `last_nonce` to cause duplicate oracle writes or advance it to skip events.

**Fix:** Added keccak256 checksum. On save, compute `keccak256(content)` and append as hex string field. On load, verify checksum matches content. Reject state file if checksum is missing or invalid, forcing fresh sync.

**Files:** `crates/pipe-exec-layer-ext-v2/relayer/src/persistence.rs`

**Review Comments** reviewer: neko; state: rejected; comments: (1) Defense against attackers is invalid — keccak256(content) is a keyless public algorithm and the code is open source. Any attacker with filesystem write access can modify last_nonce and recompute a valid checksum. Defending against a real attacker requires at minimum HMAC with a node-exclusive secret key. (2) Defense against process crashes is redundant — save() already uses a write-temp-then-rename atomic write pattern, so abnormal process exit will not produce half-written files. The checksum adds no incremental value in this scenario. (3) Defense against disk bit rot is theoretically valid but practically negligible — modern hardware ECC, mainstream filesystems (ZFS/Btrfs have data checksums; even ext4 without them), and cloud block storage end-to-end verification make the probability of "a bit flip that changes a numeric value while keeping the JSON structurally valid" approach zero.

## GRETH-018: Gravity CLI Args Accept Out-of-Range Values

**Problem:** `--gravity.pipe-block-gas-limit`, `--gravity.cache.capacity`, `--gravity.trie.parallel-level` accept any u64 value. Zero gas limit causes all executions to revert. Extremely large parallel level spawns millions of threads.

**Fix:** Added `clap::value_parser` range validators:
- `pipe-block-gas-limit`: 1,000,000..=100,000,000,000
- `cache.capacity`: 1,000..=100,000,000
- `trie.parallel-level`: 1..=64

**Files:** `crates/node/core/src/args/gravity.rs`

**Review Comments** reviewer: neko; state: accepted; comments: Accepted.

## GRETH-019: DKG Transcript Not Size-Validated

**Problem:** DKG transcript bytes from consensus layer are passed directly to system transaction construction without size or structural validation. Oversized transcript could cause excessive gas consumption or OOM.

**Fix:** Added size validation: reject if empty or exceeds 512 KB (`MAX_DKG_TRANSCRIPT_BYTES`). Returns descriptive error message with actual vs maximum size.

**Files:** `crates/pipe-exec-layer-ext-v2/execute/src/onchain_config/dkg.rs`
25 changes: 25 additions & 0 deletions docs/plans/2026-02-27-greth-high-fixes-design.md
Original file line number Diff line number Diff line change
@@ -0,0 +1,25 @@
# GRETH HIGH Fixes Design (Round 2)

Date: 2026-02-27

## GRETH-021: validate_execution_output Only Runs in Debug Builds

**Problem:** Critical block validation logic in `execute/src/lib.rs` (L304–L336) and its call site (L420–L423) are gated behind `#[cfg(debug_assertions)]`. In production (`--release`) builds, none of these sanity checks run: (1) `gas_limit < gas_used` — block claims more gas than allowed, (2) `gas_used != last receipt cumulative_gas_used` — inconsistent gas accounting, (3) `timestamp > now_secs * 2` — timestamp in microseconds instead of seconds. This is the same class of issue as GRETH-004 (block validation bypassed in production) but in the execution layer rather than the engine tree.

**Fix:** Remove the `#[cfg(debug_assertions)]` guard. Replace `panic!` with `error!` logging + metric counter so the node signals the anomaly without crashing in production. Alternatively, keep `unwrap_or_else` panic in debug and `error!` log in release using a runtime check.

**Files:** `crates/pipe-exec-layer-ext-v2/execute/src/lib.rs`

**Review Comments** reviewer: neko; state: rejected; comments: The checks cover critical logic bugs that should only be detected in the test environment. There is no need to enable them in production builds.

## GRETH-025: System Transaction Failure Silently Continues Block Execution

**Problem:** In `transact_system_txn` (`metadata_txn.rs:L198–L203`), when a system transaction execution fails (reverts), only `log_execution_error()` is called — execution continues normally. The caller in `lib.rs:L692–L696` proceeds to commit state changes from the failed transaction via `evm.db_mut().commit(metadata_state_changes.clone())` and checks its logs for epoch-change events. A reverted metadata transaction would have empty/invalid logs, causing the epoch-change check to silently skip. Additionally, `into_executed_ordered_block_result` (`metadata_txn.rs:L123–L128`) hardcodes `success: true` in the receipt regardless of actual execution outcome, making failed system transactions indistinguishable from successful ones on-chain.

**Fix:** Either (a) assert that system transactions always succeed — `assert!(result.result.is_success(), "system txn must succeed: {result:?}")` — or (b) propagate the failure as an `Err` and halt block processing. Set `receipt.success` from `result.is_success()` instead of hardcoding `true`.

**Files:** `crates/pipe-exec-layer-ext-v2/execute/src/onchain_config/metadata_txn.rs`, `crates/pipe-exec-layer-ext-v2/execute/src/lib.rs`

**Review Comments** reviewer: neko; state: pending; comments: @lightman Fix implemented: (1) `transact_system_txn` now asserts success instead of silently logging errors — a reverted system txn indicates corrupted chain state and must halt the node. (2) Receipt `success` field now derives from `result.is_success()` instead of hardcoding `true`, making failed system txns observable on-chain. Please review whether assert-and-panic is acceptable for production, or if we should propagate `Err` and halt block processing gracefully instead.

**Review Comments** reviewer: lightman; state: partial-reject; comments: For change (a), rejected — DKG/JWK system transactions can legitimately fail, so we cannot use a hard assert. Failures must be handled gracefully (e.g., propagate `Err` and halt block processing) instead of panicking. For change (b), accepted — deriving `receipt.success` from `result.is_success()` instead of hardcoding `true` is correct.
Loading
Loading