(stacked) v2: flatten api/ to repo root #126

Draft
stegaBOB wants to merge 15 commits into otter-sec:master from solana-mobile:stegaBOB/feat/v2-cleanup

stegaBOB commented May 24, 2026

Stacked on top of #125. Pure restructure -- no behaviour change.

Layout

api/ subdirectory collapsed back to repo root, then the remaining files regrouped:

src/
├── api/          ← routes, handlers/, responses
├── build/        ← verify worker + log writer
├── onchain/      ← RPC + Otter PDA reads
├── config.rs
├── db.rs
├── errors.rs
├── main.rs
├── state.rs
├── sweep.rs
└── types.rs      (was validation.rs)

7 top-level files + 3 subdirs, down from 13 + 2 after the bare flatten.

What moved where

  • handlers/, responses.rs, routes.rs → api/
  • build.rs (worker) + logs.rs (log writer) → build/ (as mod.rs + logs.rs)
  • onchain/program_authority_retriever.rs → onchain/state.rs (and program_hash_retriver.rs, a 5-line wrapper, folded in here)
  • onchain/program_metadata_retriever.rs → onchain/otter.rs (it's specifically about the Otter Verify PDA, not generic chain reads)
  • validation.rs → types.rs (it's the newtype module -- Address, WebhookUrl -- not really a validation module)

Inlined / deleted

  • misc.rs split: build_repository_url moves to api/responses.rs (its callers are response builders), extract_hash_with_prefix moves into build/mod.rs as a private fn (only the build worker parses solana-verify stdout).
  • validate_pubkey / validate_http_url were only called by the corresponding newtype's FromStr; inlined.
  • validate_search had one caller; moved into that handler.
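
A minimal sketch of what inlining a validator into a newtype's `FromStr` can look like. The `WebhookUrl` name comes from the description above; the specific checks (an `http://` or `https://` prefix and a non-empty host) and the `String` error type are assumptions for illustration, not the PR's actual rules.

```rust
use std::str::FromStr;

/// Hypothetical stand-in for the `WebhookUrl` newtype in types.rs.
#[derive(Debug, Clone, PartialEq, Eq)]
pub struct WebhookUrl(String);

impl FromStr for WebhookUrl {
    // Assumption: a plain String error keeps the sketch dependency-free.
    type Err = String;

    fn from_str(s: &str) -> Result<Self, Self::Err> {
        // The checks live here instead of in a free `validate_http_url`
        // helper, so the only way to obtain a WebhookUrl is to pass them.
        let rest = s
            .strip_prefix("https://")
            .or_else(|| s.strip_prefix("http://"))
            .ok_or_else(|| format!("not an http(s) URL: {s}"))?;
        if rest.is_empty() || rest.starts_with('/') {
            return Err(format!("missing host: {s}"));
        }
        Ok(WebhookUrl(s.to_owned()))
    }
}

impl WebhookUrl {
    /// Borrow the validated URL.
    pub fn as_str(&self) -> &str {
        &self.0
    }
}
```

Callers then write `raw.parse::<WebhookUrl>()` and never see an unvalidated value, which is why a separate one-caller validation function adds little.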

This was referenced May 24, 2026
stegaBOB force-pushed the stegaBOB/feat/v2-cleanup branch 2 times, most recently from 598a4c4 to 25de2d7 (May 25, 2026 19:39)
claude and others added 5 commits May 25, 2026 19:41
Co-Authored-By: stegaBOB <41593264+stegaBOB@users.noreply.github.com>
Leftover from v1's global Lazy pattern; v2 injects config via AppState.

Co-Authored-By: stegaBOB <41593264+stegaBOB@users.noreply.github.com>
The handler used to do two sequential queries — `get_program_state` for
the cached on-chain hash + frozen/closed flags, then `best_build` for
the matching completed build — and stitch the result through
`VerificationResponse::from_state_and_build`.

Replace both with one `LEFT JOIN LATERAL` that returns just the seven
columns the response needs, render the `ExtendedStatusResponse` inline,
and serialize once. Drops `best_build` (its only caller is gone) and
shapes the handler to send the pre-encoded body via
`(StatusCode, [(CONTENT_TYPE, "application/json")], json)` so axum
doesn't re-encode anything.

One round-trip per `/status` request, no struct→JSON shuffle, same
external response shape.

Co-Authored-By: stegaBOB <41593264+stegaBOB@users.noreply.github.com>
Adds a moka cache (`Cache<Address, String>`) of pre-serialized `/status`
bodies. Hits skip the LATERAL join and the JSON encode; misses run the
existing path and write back the rendered body.

Every mutating method invalidates the affected program's entry:
- `upsert_program_state` (sweep + post-build snapshot)
- `unverify_program` (upgrade webhook)
- `mark_closed` (close webhook + sweep)
- `mark_build_completed` (build finished — added `program_id` arg so we
  have the key without an extra lookup)

TTL is bound to the sweep interval. Every sweep cycle upserts every
`program_state` row, which evicts the matching cache entry — so a longer
TTL would never fire, and a shorter one just adds DB load between
sweeps. Capacity capped at 10k entries (LRU eviction beyond that, which
the verified-program set will not reach in any realistic timeframe).

Restart self-heals: the cache is empty on boot, the sweep job fires
immediately at startup and refreshes `program_state` from chain before
the first cached entry is written.

Co-Authored-By: stegaBOB <41593264+stegaBOB@users.noreply.github.com>
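
The invalidate-on-write rule from the commit above can be sketched with the standard library only. This is not the moka-based implementation: `StatusCache`, `Db`, the `String` keys, and the JSON shape are all hypothetical, and TTL and capacity limits are left out.

```rust
use std::collections::HashMap;
use std::sync::Mutex;

/// Toy stand-in for the cache: program address -> pre-serialized JSON body.
#[derive(Default)]
pub struct StatusCache {
    bodies: Mutex<HashMap<String, String>>,
}

impl StatusCache {
    /// Return the cached body, or render it with `render` and store the result.
    pub fn get_or_render(&self, program: &str, render: impl FnOnce() -> String) -> String {
        let mut bodies = self.bodies.lock().unwrap();
        if let Some(body) = bodies.get(program) {
            return body.clone();
        }
        let body = render();
        bodies.insert(program.to_owned(), body.clone());
        body
    }

    /// Drop one program's entry so the next read re-renders it.
    pub fn invalidate(&self, program: &str) {
        self.bodies.lock().unwrap().remove(program);
    }
}

/// Hypothetical DB wrapper: every mutating method invalidates the entry it touches.
#[derive(Default)]
pub struct Db {
    cache: StatusCache,
    hashes: Mutex<HashMap<String, String>>,
}

impl Db {
    pub fn upsert_program_state(&self, program: &str, hash: &str) {
        self.hashes
            .lock()
            .unwrap()
            .insert(program.to_owned(), hash.to_owned());
        // Write first, then evict, so a later read cannot serve the old body.
        self.cache.invalidate(program);
    }

    pub fn status_body(&self, program: &str) -> String {
        self.cache.get_or_render(program, || {
            let hashes = self.hashes.lock().unwrap();
            let hash = hashes.get(program).cloned().unwrap_or_default();
            format!("{{\"on_chain_hash\":\"{hash}\"}}")
        })
    }
}
```

Keeping the eviction inside the mutating methods, rather than at each call site, means a new caller cannot forget it.
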
Replaces the manual get → DB query → insert pattern with `try_get_with`.
moka's recommended path for "fetch or compute" coalesces concurrent
calls on the same cold key into one computation — without it, 100
concurrent /status requests for an uncached program would each run the
LATERAL join independently before any could write back.

Same external behaviour, fewer DB hits under contention.

Co-Authored-By: stegaBOB <41593264+stegaBOB@users.noreply.github.com>
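
To illustrate the coalescing behaviour the commit above relies on, here is a std-only sketch. It is not moka's implementation: `CoalescingCache`, `get_with`, and `loads_under_contention` are hypothetical names, and the sketch is synchronous and infallible, unlike the cache call named in the commit.

```rust
use std::collections::HashMap;
use std::sync::atomic::{AtomicUsize, Ordering};
use std::sync::{Arc, Mutex, OnceLock};
use std::thread;

/// One slot per key; the slot's OnceLock makes the first caller compute
/// while concurrent callers for the same key wait for that result.
#[derive(Default)]
pub struct CoalescingCache {
    slots: Mutex<HashMap<String, Arc<OnceLock<String>>>>,
}

impl CoalescingCache {
    pub fn get_with(&self, key: &str, compute: impl FnOnce() -> String) -> String {
        // Hold the map lock only long enough to find or create the slot.
        let slot = {
            let mut slots = self.slots.lock().unwrap();
            slots.entry(key.to_owned()).or_default().clone()
        };
        // Exactly one initializer runs per slot; other callers block, then read.
        slot.get_or_init(compute).clone()
    }
}

/// Fire `n` concurrent lookups for one cold key and report how many
/// times the loader closure actually ran.
pub fn loads_under_contention(n: usize) -> usize {
    let cache = Arc::new(CoalescingCache::default());
    let loads = Arc::new(AtomicUsize::new(0));
    let handles: Vec<_> = (0..n)
        .map(|_| {
            let cache = Arc::clone(&cache);
            let loads = Arc::clone(&loads);
            thread::spawn(move || {
                cache.get_with("program", || {
                    loads.fetch_add(1, Ordering::SeqCst);
                    "{}".to_owned()
                })
            })
        })
        .collect();
    for handle in handles {
        handle.join().unwrap();
    }
    loads.load(Ordering::SeqCst)
}
```

With a plain get, query, insert sequence, each of the `n` threads could run the loader; here the count stays at one regardless of `n`.
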
stegaBOB force-pushed the stegaBOB/feat/v2-cleanup branch from 25de2d7 to 72517e6 (May 25, 2026 19:41)
stegaBOB added 7 commits May 28, 2026 17:40
`check_is_verified` (/status) filters builds to trusted signers, but
`get_verification_status_all` (/verified-programs-status) didn't -- it
returned any completed build's repo/commit, so an untrusted signer could
surface its metadata there. v1 routed this endpoint through
check_is_verified per program, so it was trust-filtered; this restores
that.

Extracts `onchain::trusted_signers()` (system_program::ID + SIGNER_KEYS)
now that two callers need it, and uses it in both. The live upgrade
authority stays matched in SQL via program_state.authority.

Co-Authored-By: stegaBOB <41593264+stegaBOB@users.noreply.github.com>
NewBuild::from(&OtterBuildParams) left signer None, so every caller
(both verify handlers, pda_worker) had to set it afterward -- a footgun
where forgetting yields a null signer. The PDA is derived from its
signer and the on-chain program stores that same key in the params, so
it's authoritative: From now sets it directly.

Drops the redundant overrides and the separately-threaded signer:
setup_verification returns just the NewBuild (no more VerificationSetup
struct), and process_verification / process_verification_sync lose their
signer parameter.

Co-Authored-By: stegaBOB <41593264+stegaBOB@users.noreply.github.com>
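
A rough sketch of the `From` change described in the commit above. The type names come from the commit message; every field name and the use of `String` for keys are assumptions made to keep the example self-contained.

```rust
/// Hypothetical, trimmed-down version of the on-chain build params.
#[derive(Debug, Clone)]
pub struct OtterBuildParams {
    pub signer: String,
    pub git_url: String,
    pub commit: String,
}

/// Hypothetical, trimmed-down version of the row to insert.
#[derive(Debug, Clone, PartialEq, Eq)]
pub struct NewBuild {
    pub signer: Option<String>,
    pub repository: String,
    pub commit_hash: String,
}

impl From<&OtterBuildParams> for NewBuild {
    fn from(params: &OtterBuildParams) -> Self {
        NewBuild {
            // Previously left as None for callers to fill in; taking it from
            // the params means no caller can forget it.
            signer: Some(params.signer.clone()),
            repository: params.git_url.clone(),
            commit_hash: params.commit.clone(),
        }
    }
}
```

Because the conversion is now complete on its own, helpers that used to return the build together with a separately threaded signer can return just the `NewBuild`.
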
The sweep now backstops missed /pda webhooks. When a refresh observes a
program's on-chain hash change, upsert_program_state flags
pending_reverify; the sweep then drains up to max_reverifies_per_sweep
flagged programs (config, default 3), fetches each one's current Otter
Verify PDA, and kicks a build through the same execute path as the
verify endpoints -- unless an identical build already exists (any
status, so failures aren't retried).

The flag persists across cycles, so a capped burst drains over several
sweeps rather than being dropped, and is cleared once a program is
handled so stuck programs aren't re-examined until they drift again.
unverify_program also sets the flag, since it advances the stored hash
itself and the sweep's drift check would otherwise miss it.

Adds the pending_reverify column to program_state, a has_build_for_params
guard (any-status sibling of find_duplicate, sharing one query via an
include_failed toggle), and the AppState plumbing to give the sweep
loop what execute needs.

Co-Authored-By: stegaBOB <41593264+stegaBOB@users.noreply.github.com>
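
The flag-and-drain behaviour in the commit above can be sketched in memory as follows. This is an illustration under assumptions, not the PR's SQL: the `BTreeMap` stands in for the `program_state` table, new rows are assumed to start unflagged, and the PDA fetch, the duplicate-build guard, and the build kick-off are omitted.

```rust
use std::collections::BTreeMap;

/// Hypothetical in-memory version of a `program_state` row.
#[derive(Debug)]
pub struct ProgramState {
    pub hash: String,
    pub pending_reverify: bool,
}

/// Record a freshly fetched hash; flag the program if the hash drifted.
pub fn upsert_program_state(
    states: &mut BTreeMap<String, ProgramState>,
    program: &str,
    hash: &str,
) {
    if let Some(state) = states.get_mut(program) {
        if state.hash != hash {
            state.hash = hash.to_owned();
            state.pending_reverify = true;
        }
        return;
    }
    // Assumption: a program seen for the first time is not flagged.
    let row = ProgramState { hash: hash.to_owned(), pending_reverify: false };
    states.insert(program.to_owned(), row);
}

/// Handle at most `max_per_sweep` flagged programs and clear their flags.
/// Programs over the cap keep their flag and are picked up by a later sweep.
pub fn drain_pending(
    states: &mut BTreeMap<String, ProgramState>,
    max_per_sweep: usize,
) -> Vec<String> {
    let mut handled = Vec::new();
    let flagged = states.iter_mut().filter(|(_, state)| state.pending_reverify);
    for (program, state) in flagged.take(max_per_sweep) {
        state.pending_reverify = false;
        handled.push(program.clone());
    }
    handled
}
```

With four drifted programs and a cap of three, the first call handles three and the second handles the remaining one, which is the "drains over several sweeps" behaviour described above.
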
Two small correctness fixes carried over from the v2 rewrite:

- Hash the program-data bytecode when the account is exactly the header
  size (`>=` not `>`), matching `solana-verify`'s `data.get(45..)`.
  Only affects a zero-bytecode account, but it was an off-by-one.
- Trim the `/verified-programs` search term before building the ILIKE
  pattern, so a space-padded (but valid) query still matches.

Co-Authored-By: stegaBOB <41593264+stegaBOB@users.noreply.github.com>
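
The off-by-one in the first fix above is easy to see in isolation. In this sketch the function names are hypothetical and the header length of 45 is taken from the `data.get(45..)` expression quoted in the commit.

```rust
/// Header length, as implied by the `data.get(45..)` quoted in the commit.
const PROGRAM_DATA_HEADER_LEN: usize = 45;

/// Fixed check: an account that is exactly the header size still yields
/// an (empty) bytecode slice, the same result as `data.get(45..)`.
pub fn bytecode(data: &[u8]) -> Option<&[u8]> {
    if data.len() >= PROGRAM_DATA_HEADER_LEN {
        Some(&data[PROGRAM_DATA_HEADER_LEN..])
    } else {
        None
    }
}

/// Old check, kept for comparison: `>` rejects the header-only account.
pub fn bytecode_strict(data: &[u8]) -> Option<&[u8]> {
    if data.len() > PROGRAM_DATA_HEADER_LEN {
        Some(&data[PROGRAM_DATA_HEADER_LEN..])
    } else {
        None
    }
}
```

The two functions differ only for a 45-byte input: the fixed one returns `Some` of an empty slice, so the hash of zero bytes is computed, while the strict one returns `None`.
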
Two v1-parity fixes for behavior the rewrite changed silently:

- /unverify: skip programs with no completed build. Helius watches every
  program upgrade, so the handler fires for programs we never verified;
  the upsert in unverify_program would otherwise create a junk
  program_state row for each (and the sweep would then chase it). v1
  bailed here implicitly -- its get_verified_build errored for unknown
  programs and its UPDATE was a no-op -- so restore that gate explicitly.

- /verified-programs and /verified-programs-status: stop excluding
  frozen programs (keep excluding closed). /status already reports a
  frozen-but-matching program as verified, so the lists now agree.
  Immutable (legacy-loader) programs are frozen and shouldn't be hidden
  from the directory.

Co-Authored-By: stegaBOB <41593264+stegaBOB@users.noreply.github.com>
Return None for a program with no program_state row (or NULL hash)
rather than an empty string. The "" sentinel made "untracked" and
"tracked, hash drifted" indistinguishable, which different callers want
to treat differently. Each caller now handles None explicitly:

- pda_worker / unverify: None (no cached hash) compares unequal to a
  real fetched hash, same as before -- behavior unchanged.
- job_status: falls back to "" for the response field.

Pure refactor; no behavior change.

Co-Authored-By: stegaBOB <41593264+stegaBOB@users.noreply.github.com>
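
How the two kinds of caller described above might treat the `Option` can be sketched like this. Both function names are hypothetical; only the behaviour (None compares unequal to a fetched hash, and `job_status` falls back to an empty string) comes from the commit message.

```rust
/// pda_worker / unverify style check: a missing cached hash never equals
/// a real fetched hash, so it counts as drift.
pub fn hash_drifted(cached: Option<&str>, fetched: &str) -> bool {
    cached != Some(fetched)
}

/// job_status style fallback: the response field stays a plain string.
pub fn job_status_hash(cached: Option<String>) -> String {
    cached.unwrap_or_default()
}
```

The empty-string fallback now happens only at the response boundary, so code in between can still tell an untracked program from a tracked one.
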
stegaBOB force-pushed the stegaBOB/feat/v2-cleanup branch 3 times, most recently from df0f494 to 448311a (June 1, 2026 23:20)
stegaBOB and others added 3 commits June 1, 2026 18:23
Dropping verified_programs / program_authority / solana_program_builds in
0001 -- which runs automatically at app startup -- made a bad v2 deploy
unrecoverable without restoring from backup. Keep the v1 tables so the data
survives the cutover (a rollback to the v1 binary still has it), and document
the deploy steps and the deferred manual drop in the migration file itself.

Addresses review feedback on otter-sec#125.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
Co-Authored-By: stegaBOB <41593264+stegaBOB@users.noreply.github.com>
The v2 rewrite had a handful of spots reaching across modules via the
full `crate::module::Thing` path inline -- including one same-file
self-reference in `api/responses.rs` (`crate::api::responses::build_repository_url`
called from inside the file that defines it). Hoisting them into `use`
statements at the top of each file:

- src/api/responses.rs: imports `BuildRow`, `ProgramStateRow`, `Address`;
  drops the self-reference, just calls `build_repository_url(...)`.
- src/db.rs: imports the response types and `OtterBuildParams` it builds
  from.
- src/sweep.rs: imports `BackgroundJobHealth`, `BackgroundJobStatus`,
  and `Result`; removes the in-function `use` inside `get_health_status`.
- src/onchain/state.rs: imports `Address`.
- src/build/mod.rs: imports `JobStatus`, `is_program_data_missing`,
  `snapshot_programs`.
- src/api/handlers/{pda_worker,sync_verify,health}.rs: imports
  `ApiError`, `Address`, `BackgroundJobHealth`, `VerifyResponse`,
  `JobStatus`.

No behaviour change, just naming. `cargo fmt --check`, `cargo clippy
-- -D warnings`, `cargo sort --check`, `cargo machete` all clean.

Co-Authored-By: stegaBOB <41593264+stegaBOB@users.noreply.github.com>
stegaBOB force-pushed the stegaBOB/feat/v2-cleanup branch from 448311a to b6602ae (June 1, 2026 23:23)