fix(raft): harden election transitions#2

Merged: windsornguyen merged 1 commit into main from fix/raft-election-correctness on May 9, 2026

@windsornguyen

Owner

Pull Request

Summary

What: Harden Raft election transitions around leader contact, learner/non-voter elections, disruptive votes, and stale PreVote responses.

Why: The dissertation's safety arguments depend on only eligible voters participating in quorums, and on followers neither inflating their term nor granting votes while an active leader is maintaining heartbeats.

Lines added: +160

Thesis Cross-Reference

  • Candidate step-down on legitimate leader contact:
    basicraft/consensus.tex#L194-L200
    Key phrase: "returns to follower state".
    Context: a candidate receiving same-or-higher-term AppendEntries recognizes that leader, but lower-term AppendEntries must be rejected.

  • RequestVote log freshness:
    basicraft/consensus.tex#L500-L512
    Key phrase: "denies its vote".
    Context: voting is restricted to candidates whose logs are at least as up-to-date as the voter.

  • Non-voting members:
    membership/availability.tex#L82-L87
    Key phrase: "not yet counted towards majorities".
    Context: learners receive replication but must not contribute votes or commitment quorum.

  • Removed/non-member election self-vote:
    membership/availability.tex#L228-L238
    Key phrase: "does not count its own vote".
    Context: a server outside its latest config may start an election, but its own vote only counts if it is in that config.

  • Disruptive vote guard:
    membership/availability.tex#L315-L323
    Key phrase: "does not update its term".
    Context: recent leader heartbeats make vote requests disruptive; the server may deny/drop/delay without adopting a higher term.
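
As a rough illustration of how the vote-side rules above could combine, here is a minimal sketch of a follower's vote decision. It is not the code in this PR: the `Follower` and `VoteRequest` shapes, the `handle_vote` name, the order of the checks, and the strict `<` comparison in `recently_heard_from_leader` are all assumptions made for the example. In particular, whether a follower adopts a higher term before the configuration check is a choice made here, not something the PR description states.

```rust
use std::collections::BTreeSet;

/// Hypothetical node identifier; the real crate's type may differ.
type NodeId = u64;

#[derive(Debug, Clone, Copy, PartialEq, Eq)]
struct VoteRequest {
    term: u64,
    candidate: NodeId,
    last_log_index: u64,
    last_log_term: u64,
}

/// Assumed follower state, reduced to what the vote decision needs.
#[derive(Debug)]
struct Follower {
    term: u64,
    voted_for: Option<NodeId>,
    last_log_index: u64,
    last_log_term: u64,
    voters: BTreeSet<NodeId>,
    ticks: u64,
    last_contact_tick: u64,
    election_timeout: u64,
}

impl Follower {
    /// True while fewer than `election_timeout` ticks have passed since leader contact.
    fn recently_heard_from_leader(&self) -> bool {
        self.ticks.saturating_sub(self.last_contact_tick) < self.election_timeout
    }

    /// Returns true if the vote is granted.
    fn handle_vote(&mut self, req: VoteRequest) -> bool {
        // Disruptive vote guard: deny before touching the term at all.
        if self.recently_heard_from_leader() {
            return false;
        }
        if req.term < self.term {
            return false;
        }
        if req.term > self.term {
            // Assumption: the higher term is adopted once the guard has passed.
            self.term = req.term;
            self.voted_for = None;
        }
        // Candidates outside the active configuration never get a vote.
        if !self.voters.contains(&req.candidate) {
            return false;
        }
        // Log freshness: tuples compare lexicographically, last term first, then index.
        let candidate_log = (req.last_log_term, req.last_log_index);
        if candidate_log < (self.last_log_term, self.last_log_index) {
            return false;
        }
        match self.voted_for {
            Some(id) if id != req.candidate => false,
            _ => {
                self.voted_for = Some(req.candidate);
                true
            }
        }
    }
}
```

With this ordering, a request that arrives two ticks after a heartbeat is denied and leaves `term` unchanged, which is the "no term inflation" property the last bullet describes.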

Test Plan

  • cargo test -p cloud9-raft

Tests Added

  • Non-voter candidates and pre-candidates do not count self-votes.
  • Candidates do not falsely ACK higher-term AppendEntries while stepping down.
  • Followers deny votes to candidates outside the active configuration.
  • Followers reject disruptive votes without term inflation.
  • Stale PreVote responses are ignored.
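
The self-vote cases in the first bullet come down to counting only votes from servers that are voters in the latest configuration. A minimal sketch, assuming a hypothetical `Config` type and `has_election_quorum` helper that are not taken from the crate:

```rust
use std::collections::BTreeSet;

/// Hypothetical node identifier; the real crate's type may differ.
type NodeId = u64;

/// Assumed view of the latest configuration: only voting members are listed.
/// Learners and removed servers are simply ids that are absent from `voters`.
struct Config {
    voters: BTreeSet<NodeId>,
}

impl Config {
    /// Majority of voters only; non-voters never change the quorum size.
    fn quorum(&self) -> usize {
        self.voters.len() / 2 + 1
    }
}

/// Returns true if the granted votes form a majority of the voters.
/// `granted` may include the candidate's own id; it is counted only when
/// the candidate is itself a voter in `config`.
fn has_election_quorum(config: &Config, granted: &BTreeSet<NodeId>) -> bool {
    let counted = granted.intersection(&config.voters).count();
    counted >= config.quorum()
}
```

Because the candidate's id goes through the same intersection as every other vote, a learner or removed server that starts an election contributes nothing to its own tally.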

Notes for Reviewers

The subtle case is the candidate step-down path: returning to follower state is correct, but sending a successful AppendResponse before the AppendEntries consistency check has run is not.
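
One way to picture the distinction is to keep the role transition and the response as two separate decisions. The sketch below uses simplified, hypothetical `Node`, `AppendRequest`, and `AppendResponse` types (the log holds only entry terms, and appending entries is omitted), so it shows the shape of the check rather than the crate's actual handler.

```rust
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
enum Role {
    Follower,
    Candidate,
}

#[derive(Debug, Clone, Copy, PartialEq, Eq)]
struct AppendRequest {
    term: u64,
    prev_log_index: u64,
    prev_log_term: u64,
}

#[derive(Debug, Clone, Copy, PartialEq, Eq)]
struct AppendResponse {
    term: u64,
    success: bool,
}

struct Node {
    role: Role,
    term: u64,
    /// log[i] holds the term of the entry at index i + 1 (indices start at 1).
    log: Vec<u64>,
}

impl Node {
    /// Consistency check: does the local log hold `term` at `index`?
    fn matches_prev(&self, index: u64, term: u64) -> bool {
        if index == 0 {
            return true; // the empty prefix always matches
        }
        self.log.get(index as usize - 1).map_or(false, |t| *t == term)
    }

    fn handle_append(&mut self, req: AppendRequest) -> AppendResponse {
        // Lower-term requests are rejected and do not change the role.
        if req.term < self.term {
            return AppendResponse { term: self.term, success: false };
        }
        // Same-or-higher term: recognize the leader and step down.
        self.term = req.term;
        self.role = Role::Follower;
        // Success is reported only after the consistency check, never as
        // a side effect of stepping down.
        let success = self.matches_prev(req.prev_log_index, req.prev_log_term);
        AppendResponse { term: self.term, success }
    }
}
```

A candidate at term 3 that receives a term-4 request with a `prev_log_index` beyond its log steps down to follower but still answers `success: false`.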

chatgpt-codex-connector (bot) left a comment

💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: d0526314e9

Comment on lines +37 to +41
    let initial_contact = if leader.is_some() {
        core.ticks
    } else {
        core.ticks.saturating_sub(core.config.election_timeout.0)
    };

P1: Avoid marking leader contact before AppendEntries is validated

Follower::new now treats any transition with leader: Some(_) as a fresh leader heartbeat by setting last_contact_tick = core.ticks. In this patch, candidates/leaders step down on a higher-term AppendRequest before that request is processed for log consistency, so an unverified/stale Append can immediately activate recently_heard_from_leader() and cause subsequent VoteRequest/PreVoteRequest messages to be rejected without term update for up to an election timeout. This can delay or repeatedly stall leadership elections under stale or removed senders that emit higher-term appends.
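
One possible way to address this, sketched under assumptions, is to separate "a leader is known" from "leader contact was recorded" and to record contact only after an append has passed validation. The struct layout, the `Option<u64>` representation of the last contact tick, and the `record_leader_contact` method are hypothetical; only `Follower::new`, `last_contact_tick`, and `recently_heard_from_leader` are names that appear in the review comment.

```rust
/// Hypothetical node identifier; the real crate's type may differ.
type NodeId = u64;

struct Follower {
    leader: Option<NodeId>,
    /// None until a validated append has been seen (assumed representation).
    last_contact_tick: Option<u64>,
    election_timeout: u64,
}

impl Follower {
    /// Knowing who the leader is does not count as a heartbeat here:
    /// a fresh follower always starts with no recorded contact.
    fn new(election_timeout: u64, leader: Option<NodeId>) -> Self {
        Follower { leader, last_contact_tick: None, election_timeout }
    }

    /// True while fewer than `election_timeout` ticks have passed since the
    /// last recorded contact; false if no contact was ever recorded.
    fn recently_heard_from_leader(&self, ticks: u64) -> bool {
        self.last_contact_tick
            .map_or(false, |t| ticks.saturating_sub(t) < self.election_timeout)
    }

    /// Intended to be called only after an AppendEntries request has passed
    /// its term and log consistency checks.
    fn record_leader_contact(&mut self, ticks: u64, leader: NodeId) {
        self.leader = Some(leader);
        self.last_contact_tick = Some(ticks);
    }
}
```

Under this sketch, a node that steps down on an unvalidated higher-term append does not yet trip the disruptive vote guard; the guard becomes active only once `record_leader_contact` runs.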


@windsornguyen force-pushed the fix/raft-election-correctness branch from d052631 to d2ce1a7 on May 9, 2026 03:40
@windsornguyen force-pushed the fix/rename-cloud9-raft branch from 3a9a6bf to b2fb2de on May 9, 2026 03:45
@windsornguyen force-pushed the fix/raft-election-correctness branch from d2ce1a7 to aac29dc on May 9, 2026 03:45
@windsornguyen force-pushed the fix/raft-election-correctness branch from aac29dc to c64e06b on May 9, 2026 07:00
@windsornguyen changed the base branch from fix/rename-cloud9-raft to main on May 9, 2026 07:05
@windsornguyen merged commit 0c0ed13 into main on May 9, 2026
4 checks passed