Skip to content

[automatic-failover] Implement failover retry for in-flight commands #4175

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Open
wants to merge 6 commits into
base: feature/automatic-failover
Choose a base branch
from

Conversation

ggivo
Copy link
Collaborator

@ggivo ggivo commented Jun 10, 2025

This change adds support for retrying in-flight commands when a Redis connection fails and the client automatically fails over to another cluster. The feature is configurable through the FailoverOptions builder.

Key changes:

  • Added retryFailedInflightCommands option to FailoverOptions
  • Implemented retry logic in CircuitBreakerCommandExecutor
  • Added integration tests to verify both retry and no-retry behavior
  • Created utility methods for test setup and configuration

This enhancement improves resilience for long-running commands like blocking operations, allowing them to transparently continue on the failover cluster without client-side errors.

@ggivo ggivo requested a review from Copilot June 10, 2025 13:40
Copilot

This comment was marked as outdated.

@ggivo ggivo requested a review from Copilot June 11, 2025 06:05
Copilot

This comment was marked as outdated.

@ggivo ggivo changed the title Implement failover retry for in-flight commands [automatic-failover] Implement failover retry for in-flight commands Jun 11, 2025
@ggivo ggivo self-assigned this Jun 11, 2025
ggivo and others added 3 commits June 13, 2025 11:47
This change adds support for retrying in-flight commands when a Redis connection
fails and the client automatically fails over to another cluster. The feature
is configurable through the FailoverOptions builder.

Key changes:
- Added retryFailedInflightCommands option to FailoverOptions
- Implemented retry logic in CircuitBreakerCommandExecutor
- Added integration tests to verify both retry and no-retry behavior
- Created utility methods for test setup and configuration

This enhancement improves resilience for long-running commands like blocking
operations, allowing them to transparently continue on the failover cluster
without client-side errors.
…cutor.java

Co-authored-by: Copilot <[email protected]>
# Conflicts:
#	src/main/java/redis/clients/jedis/mcf/CircuitBreakerCommandExecutor.java
format & clean-up
@ggivo ggivo force-pushed the automatic-failover-retry-on-failover branch from 21103fe to adf7f57 Compare June 13, 2025 09:03
ggivo added 2 commits June 13, 2025 12:45
…lover

    blpop timeout should be less than async command timeout to prevent completing with java.util.concurrent.TimeoutException instead of actuall command failure
   - rename retryFailedInflightCommands->retryOnFailover
   - remove check for CB in OPEN state
@ggivo ggivo requested a review from atakavci June 13, 2025 19:28
@ggivo
Copy link
Collaborator Author

ggivo commented Jun 13, 2025

Current implementation performs actual failover, on first incoming command after CB threshold is reached.
Discussed with @atakavci that we need to perform failover imediately when CB switches to OPEN state.
Will work on this with separate PR.

@ggivo ggivo requested a review from Copilot June 13, 2025 19:29
Copy link
Contributor

@Copilot Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull Request Overview

This pull request implements automatic failover for in-flight Redis commands by adding a new retry option and updating related components to support configurable retry behavior during failover. Key changes include:

  • Adding the new "retryOnFailover" option with its builder in FailoverOptions.
  • Implementing retry logic in CircuitBreakerCommandExecutor.
  • Updating UnifiedJedis to use the new options and adding integration tests for both retry-enabled and retry-disabled scenarios.

Reviewed Changes

Copilot reviewed 6 out of 6 changed files in this pull request and generated 2 comments.

Show a summary per file
File Description
src/test/java/redis/clients/jedis/failover/FailoverIntegrationTest.java Updated integration tests to verify automatic and manual failover behaviors and added tests for retry scenarios.
src/main/java/redis/clients/jedis/mcf/FailoverOptions.java Introduced new configuration option for failover retry behavior.
src/main/java/redis/clients/jedis/mcf/ConnectionFailoverException.java Added a specialized exception for failover command failures.
src/main/java/redis/clients/jedis/mcf/CircuitBreakerCommandExecutor.java Integrated the retryOnFailover option in handling command execution and failover logic.
src/main/java/redis/clients/jedis/UnifiedJedis.java Updated constructors to incorporate the new failover options.
src/main/java/redis/clients/jedis/MultiClusterClientConfig.java Updated fallback exception configuration to include ConnectionFailoverException.

@ggivo
Copy link
Collaborator Author

ggivo commented Jun 16, 2025

@atakavci updated, the review could you take a second look?

* <p>
*/
@Experimental
public UnifiedJedis(MultiClusterPooledConnectionProvider provider, FailoverOptions failoverOptions) {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

if this is going to be the only place that we introduce FailoverOptions into UnifiedJedis, how about putting it into MultiClusterClientConfig and may be holding a reference to it in provider itself.
My reasoning here is that retryOnFailover is not going to be the only thing that we need from config and i guess we will need it in provider as well.
Not perfectly sure about it though.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@ggivo on a second thought, its better to do it later when we actually need it. lets keep it as it is.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This was my initial thinking, but it is rather setting of the executor, and decided to put it in the Executor, with the idea to add additional configuration settings in future if needed without the need to introduce new constructors..

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants