@BenWestgate BenWestgate commented Jan 8, 2026

codex32: String length limits now cover HRP characters, tighter master seed bit length limits, updated master seed encoding/decoding processes.

Rationale

Helper PR for #2040: prepares the text for general HRP lengths by defining limits based on string length, not data part length, and by deprecating "ms" seed lengths that violate the new codex32 length rule.

Why
Restricting seeds to 32-bit multiples makes valid secret seed lengths differ by at least 6 characters (6 or 7, depending on the length), which reduces the ambiguity to two valid lengths for wallets that correct insertion/deletion errors. Restricting to 64-bit multiples would leave only one valid length within correctable distance, but @roconnor-blockstream wants 160-bit seeds.
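The spacing claim can be checked with a few lines of arithmetic. This is only a sketch: it assumes each bech32 character carries 5 bits, that the payload is padded up to a whole character, and that seeds range from 128 to 512 bits. The function names are hypothetical and not taken from the BIP text.

```python
def payload_chars(seed_bits: int) -> int:
    """Bech32 characters needed for seed_bits at 5 bits per character (rounded up)."""
    return (seed_bits + 4) // 5


def gaps(step_bits: int, lo: int = 128, hi: int = 512) -> list:
    """Differences in payload character count between consecutive allowed seed lengths.

    lo and hi are assumed bounds (128 to 512 bits), used only for illustration.
    """
    counts = [payload_chars(bits) for bits in range(lo, hi + 1, step_bits)]
    return [b - a for a, b in zip(counts, counts[1:])]


if __name__ == "__main__":
    print("32-bit multiples:", sorted(set(gaps(32))))
    print("64-bit multiples:", sorted(set(gaps(64))))
```

With 32-bit steps every neighbouring pair of valid lengths is 6 or 7 characters apart; with 64-bit steps the spacing roughly doubles, which is why fewer candidate lengths fall within a correctable distance.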

Key changes (concise) c286c2c

  • Overall string max length for regular codex32: explicitly limited to 94 chars. Payload: changed from “up to 74” → up to 72 bech32 characters in the regular format.
  • Long codex32: new length window (97–1024 characters) and payload allowance relaxed (up to 1001 bech32 chars).
  • Clarify that the checksum covers low HRP+data per the BIP-0173 error model.

My Summary of other Changes:

  • Remove "decode" from Unshared Secret as this process should be application specific once HRP is generalized.
  • Master seed format changes:
    • Add "decode" section.
    • Restrict seed lengths to 32-bit multiples to remove the "ms" exception when switching between regular and long format. This leaves our desired 128-, 160-, 192-, 224-, and 256-bit lengths, as well as many unrequested lengths from 288 to 480 bits.
      • Note: removing the lengths strictly between 256 and 512 bits would allow a couple of sentences to be removed or simplified, and would improve insert/delete correction of 256- and 512-bit seeds. Removing 224-bit would improve insert/delete correction of 192- and 256-bit seeds.
    • Recommend how to derive unspecified identifiers for fresh secret seeds and reshared ones.
    • Recommend deterministic implementations derive padding using a defined CRC code.
  • Defined a shared string as a non-"0" threshold parameter codex32 string.
    • Defined how to "decode" these, as implementers have been surprised that naked share payload bytes are insufficient to recover the secret bytes.
  • Defined a "master share set" as a valid set of k shared strings generated with exactly k-1 fresh shares in accordance with that section (needed for deriving reshare identifiers).
  • Added a couple <ref> notes for the less obvious changes.
  • Fixed missing /ref so Footnotes now appear at the end of Rationale (they're invisible in master)
  • Bolded all section references.
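As a rough illustration of the 32-bit-multiple rule for the Master seed format, the check below works on decoded byte counts. It is a sketch only: the 16-byte and 64-byte bounds are assumptions based on the 128- and 512-bit lengths mentioned above, and the names are hypothetical.

```python
MIN_SEED_BYTES = 16  # assumption: 128-bit lower bound
MAX_SEED_BYTES = 64  # assumption: 512-bit upper bound


def is_allowed_seed_length(n_bytes: int) -> bool:
    """True if the decoded seed length is a multiple of 4 bytes (32 bits) within bounds."""
    return MIN_SEED_BYTES <= n_bytes <= MAX_SEED_BYTES and n_bytes % 4 == 0


# Byte lengths inside the assumed bounds that the 32-bit rule would reject.
newly_invalid = [
    n for n in range(MIN_SEED_BYTES, MAX_SEED_BYTES + 1)
    if not is_allowed_seed_length(n)
]

if __name__ == "__main__":
    print("allowed:", [n for n in range(MIN_SEED_BYTES, MAX_SEED_BYTES + 1)
                       if is_allowed_seed_length(n)])
    print("rejected count:", len(newly_invalid))
```

A list like `newly_invalid` could also serve as a starting point for choosing negative test vectors.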

Backward-compatibility and migration

  • Existing codex32 strings that encode typical data lengths remain valid; however implementations that:
    • Accepted codex32 strings 95 and 96 characters long may need to be adjusted. We may consider validating both lengths for a deprecation window while adopting new encoding behavior.
    • Rejected Long codex32 data-parts below 81 characters may need to be adjusted as the new HRP coverage allows them.
    • Rejected codex32 strings shorter than 48 characters, exactly 97 or 98 characters long, or longer than 127 characters may need to be adjusted, as codex32 now allows lengths 21 to 94 and 97 to 1024. The Master seed format rejects based on decoded bytes, not characters.
    • Assumed the checksum only covered the data (not HRP) should be reviewed and tested; the text now makes checksum coverage explicit to match the bech32 error model.
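For implementers reviewing their length checks, the new windows can be sketched as below. This assumes the limits quoted in this description (21 to 94 characters for regular, 97 to 1024 for long) and looks only at total string length, including the HRP; it does not verify the checksum or any format-specific rule. The names are hypothetical.

```python
from typing import Optional

REGULAR_LENGTHS = range(21, 95)   # 21..94 inclusive
LONG_LENGTHS = range(97, 1025)    # 97..1024 inclusive


def classify_length(total_chars: int) -> Optional[str]:
    """Return "regular", "long", or None for a whole-string length (HRP included)."""
    if total_chars in REGULAR_LENGTHS:
        return "regular"
    if total_chars in LONG_LENGTHS:
        return "long"
    return None  # 95 and 96 fall in the gap between the two windows


if __name__ == "__main__":
    for n in (20, 21, 94, 95, 96, 97, 1024, 1025):
        print(n, classify_length(n))
```

The boundary values in the loop are the obvious candidates for boundary-length test vectors.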

Test Vectors
I have a working reference implementation that can validate according to this text's spec as well as #2040's plan, so I will be able to generate these and add them after reviewers agree about the text changes. After a round or so of reviews, I'll mark this draft and add Vectors and my passing reference implementation.

Both of the Master seed format's deterministic encoding recommendations need vectors, as do the now-invalidated "ms" lengths that are not multiples of 4 bytes.

Proposed reviewer checklist

  • Confirm the textual changes accurately limit/permit the correct set of lengths (regular and long).
  • Confirm the encoding/decoding steps are implementable and are referenced accurately.
  • Assess the backward compatibility strategy for wallets and libraries that previously accepted/rejected different length limits. (I doubt anyone used 45-byte seeds, but it's worth asking before invalidating.)
  • TODO: Run unit tests against reference vectors (add vectors for boundary lengths and decode-edge cases).

Clarified codex32 string specifications, including character limits and decoding processes. Updated references to BIP-0173 and adjusted payload character limits.