Skip to content

apollo_consensus_orchestrator: remove some panics from the validate proposal flow #8048

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 1 commit into from
Jul 24, 2025

Conversation

matan-starkware
Copy link
Contributor

Failures here can simply result in the proposal being treated as invalid.

@reviewable-StarkWare
Copy link

This change is Reviewable

Copy link
Contributor Author

matan-starkware commented Jul 16, 2025

@matan-starkware matan-starkware marked this pull request as ready for review July 16, 2025 13:35
@matan-starkware matan-starkware force-pushed the matan/clean_validate_proposal_panics branch from 4c5c340 to 0d5d643 Compare July 16, 2025 13:44
@matan-starkware matan-starkware force-pushed the matan/clean_context_panics branch from 36e46e8 to 8e5a64b Compare July 16, 2025 13:44
Copy link
Contributor

@AlonLStarkWare AlonLStarkWare left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Reviewed 1 of 1 files at r2, all commit messages.
Reviewable status: :shipit: complete! all files reviewed, all discussions resolved (waiting on @matan-starkware)

Copy link
Contributor

@AlonLStarkWare AlonLStarkWare left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

:lgtm:

Reviewable status: :shipit: complete! all files reviewed, all discussions resolved (waiting on @matan-starkware)

Copy link
Collaborator

@ShahakShama ShahakShama left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Reviewed all commit messages.
Reviewable status: all files reviewed, 3 unresolved discussions (waiting on @matan-starkware)


crates/apollo_consensus_orchestrator/src/validate_proposal.rs line 427 at r2 (raw file):

                ProposalStatus::Finished(id) => id,
                ProposalStatus::InvalidProposal(err) => return HandledProposalPart::Invalid(err),
                status => panic!("Unexpected status: for {proposal_id:?}, {status:?}"),

@dan-starkware WDYT? I think it makes sense to keep this panic since it should never happen. Maybe use unreachable!() instead


crates/apollo_consensus_orchestrator/src/validate_proposal.rs line 493 at r2 (raw file):

                ProposalStatus::Processing => HandledProposalPart::Continue,
                ProposalStatus::InvalidProposal(err) => HandledProposalPart::Invalid(err),
                status => {

@dan-starkware same here


crates/apollo_consensus_orchestrator/src/validate_proposal.rs line 528 at r2 (raw file):

        .await
        // This can occur if the batcher already aborted the proposal itself.
        .inspect_err(|e| warn!("Failed to send Abort to batcher: {e:?}"));

You need to check here what the error is. If it's a client error you should retry. Try and see when it's a batcher error if you can analyze if it failed due to already aborted (and then do what you do now) or due to something else (and then panic)

@dan-starkware
Copy link
Collaborator

crates/apollo_consensus_orchestrator/src/validate_proposal.rs line 427 at r2 (raw file):

Previously, ShahakShama wrote…

@dan-starkware WDYT? I think it makes sense to keep this panic since it should never happen. Maybe use unreachable!() instead

Agree with unreachable. Otherwise, what should we do with this propagated error?

@dan-starkware
Copy link
Collaborator

crates/apollo_consensus_orchestrator/src/validate_proposal.rs line 493 at r2 (raw file):

Previously, ShahakShama wrote…

@dan-starkware same here

Same as above (unreachable). Otherwise, what should we do with this propagated error?

@AlonLStarkWare AlonLStarkWare changed the base branch from matan/clean_context_panics to graphite-base/8048 July 24, 2025 10:54
@AlonLStarkWare AlonLStarkWare force-pushed the matan/clean_validate_proposal_panics branch from 0d5d643 to 11e3115 Compare July 24, 2025 10:54
@AlonLStarkWare AlonLStarkWare changed the base branch from graphite-base/8048 to main-v0.14.0 July 24, 2025 10:54
Copy link
Contributor

@AlonLStarkWare AlonLStarkWare left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Reviewable status: 0 of 1 files reviewed, 1 unresolved discussion (waiting on @ShahakShama)


crates/apollo_consensus_orchestrator/src/validate_proposal.rs line 427 at r2 (raw file):

Previously, dan-starkware wrote…

Agree with unreachable. Otherwise, what should we do with this propagated error?

Done


crates/apollo_consensus_orchestrator/src/validate_proposal.rs line 493 at r2 (raw file):

Previously, dan-starkware wrote…

Same as above (unreachable). Otherwise, what should we do with this propagated error?

Done


crates/apollo_consensus_orchestrator/src/validate_proposal.rs line 528 at r2 (raw file):

Previously, ShahakShama wrote…

You need to check here what the error is. If it's a client error you should retry. Try and see when it's a batcher error if you can analyze if it failed due to already aborted (and then do what you do now) or due to something else (and then panic)

Done
However from what I can tell batcher.send_proposal_content will never return a BatcherError::ProposalAborted

Copy link
Collaborator

@ShahakShama ShahakShama left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Reviewed all commit messages.
Reviewable status: 0 of 1 files reviewed, 2 unresolved discussions (waiting on @AlonLStarkWare)


crates/apollo_consensus_orchestrator/src/validate_proposal.rs line 531 at r3 (raw file):

    let input = SendProposalContentInput { proposal_id, content: SendProposalContent::Abort };

    const MAX_CLIENT_RETRIES: usize = 2;
  1. I think you should loop eternally
  2. If you disagree with 1, panic after max retries
  3. If you disagree with 1, make this a config and increase it to around 10

@AlonLStarkWare AlonLStarkWare force-pushed the matan/clean_validate_proposal_panics branch from 11e3115 to 03eea86 Compare July 24, 2025 11:47
Copy link
Contributor

@AlonLStarkWare AlonLStarkWare left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Reviewable status: 0 of 1 files reviewed, 1 unresolved discussion (waiting on @ShahakShama)


crates/apollo_consensus_orchestrator/src/validate_proposal.rs line 531 at r3 (raw file):

Previously, ShahakShama wrote…
  1. I think you should loop eternally
  2. If you disagree with 1, panic after max retries
  3. If you disagree with 1, make this a config and increase it to around 10

I don't like the idea of an eternal loop here.
Changed to panic after max retries, but I think this is a bit over specific to be a config. Maybe we should consider a broader refactor for client calls, so the retries and max number of retries are universal.

Copy link
Collaborator

@ShahakShama ShahakShama left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

:lgtm:

Reviewed 1 of 1 files at r4, all commit messages.
Reviewable status: all files reviewed, 1 unresolved discussion (waiting on @AlonLStarkWare)


crates/apollo_consensus_orchestrator/src/validate_proposal.rs line 531 at r3 (raw file):

Previously, AlonLStarkWare (Alon-Lukatch-Starkware) wrote…

I don't like the idea of an eternal loop here.
Changed to panic after max retries, but I think this is a bit over specific to be a config. Maybe we should consider a broader refactor for client calls, so the retries and max number of retries are universal.

Yes. Could you add that as a TODO?

Copy link
Contributor

@AlonLStarkWare AlonLStarkWare left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Reviewable status: all files reviewed, 1 unresolved discussion (waiting on @ShahakShama)


crates/apollo_consensus_orchestrator/src/validate_proposal.rs line 531 at r3 (raw file):

Previously, ShahakShama wrote…

Yes. Could you add that as a TODO?

Sure, who should I put this on?

…roposal flow

Failures here can simply result in the proposal being treated as invalid.
@AlonLStarkWare AlonLStarkWare force-pushed the matan/clean_validate_proposal_panics branch from 03eea86 to f7c776e Compare July 24, 2025 12:29
Copy link
Contributor

@AlonLStarkWare AlonLStarkWare left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Reviewed 1 of 1 files at r4, 1 of 1 files at r5, all commit messages.
Reviewable status: :shipit: complete! all files reviewed, all discussions resolved (waiting on @matan-starkware)

@AlonLStarkWare AlonLStarkWare added this pull request to the merge queue Jul 24, 2025
Merged via the queue into main-v0.14.0 with commit b22673d Jul 24, 2025
11 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

5 participants