Lettuce Sharded PubSub Resubscribe Possible Issue #3213

@ThePeterLuu

Bug Report

I'm seeing behavior similar to #2940: Lettuce does not appear to resubscribe Sharded Pub/Sub subscriptions automatically. However, I'm using Lettuce 6.5.5.RELEASE, in which the referenced issue should already be fixed.

I may be misunderstanding how things are supposed to work or have misconfigured something, so if that's the case, please let me know.

Current Behavior

Assume that you have a number of applications connected to a Redis cluster with two shards, using .connectPubSub(...).async().ssubscribe(topic).... The subscriptions are distributed across the two shards.

Then, remove one shard (either manually, or via autoscaling policy, such as in an AWS ElastiCache deployment).

Judging by the debug logs, when subscriptions are made with the regular non-sharded .subscribe call, Lettuce issues another SUBSCRIBE command once it has been disconnected from the departing shard and has reconnected to the new shard. This can also be verified by connecting to the Redis cluster via CLI and running CLIENT LIST (I can see the connection that was transferred over, and the latest command run on that connection is subscribe) and PUBSUB CHANNELS.

However, when subscriptions are made with the sharded .ssubscribe call, Lettuce reconnects to the new shard, but no debug log indicates that an SSUBSCRIBE command was issued. Via CLI, CLIENT LIST shows that the application did successfully reconnect to the new shard, but the latest command on that connection is cluster|myid instead of ssubscribe. PUBSUB SHARDCHANNELS shows only the subscriptions that were originally created on that shard and none from the transferred connections.

This difference in behavior (where SUBSCRIBE is resubscribed successfully, but SSUBSCRIBE is not) also applies to test failovers initiated via the AWS ElastiCache console, with the same outcome.

The result is that some number of Sharded PubSub messages are lost because there are no active subscribers for those messages.

Input Code

I can paste my Lettuce client configuration if desired or if that would be helpful, in case you think this might be a problem with my configuration.

Expected behavior/code

Since SUBSCRIBE seems to be resubscribed automatically on failovers and auto-scale-in for Redis clusters, I would have expected SSUBSCRIBE to behave the same way.
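
To make the expected behavior concrete, here is a minimal, library-independent sketch of "remember the shard channels and replay them after a reconnect". It does not use or describe Lettuce internals: the class name ShardResubscribeSketch, the onReconnected() hook, and the Consumer standing in for "send SSUBSCRIBE" are all hypothetical names chosen for illustration.

```java
import java.util.ArrayList;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Set;
import java.util.function.Consumer;

/** Hypothetical sketch: tracks shard channels and replays them after a reconnect. */
final class ShardResubscribeSketch {

    // Channels the application currently wants to be subscribed to, in subscription order.
    private final Set<String> shardChannels = new LinkedHashSet<>();

    // Stand-in for "issue SSUBSCRIBE <channel>" on the underlying connection (assumption).
    private final Consumer<String> ssubscribe;

    ShardResubscribeSketch(Consumer<String> ssubscribe) {
        this.ssubscribe = ssubscribe;
    }

    /** Records the channel and issues SSUBSCRIBE only if it was not already tracked. */
    synchronized void subscribe(String channel) {
        if (shardChannels.add(channel)) {
            ssubscribe.accept(channel);
        }
    }

    /** Stops tracking the channel so it is not replayed on the next reconnect. */
    synchronized void unsubscribe(String channel) {
        shardChannels.remove(channel);
    }

    /** Hypothetical hook: call once the connection to the new shard is established. */
    synchronized void onReconnected() {
        for (String channel : shardChannels) {
            ssubscribe.accept(channel);
        }
    }

    public static void main(String[] args) {
        List<String> issued = new ArrayList<>();
        ShardResubscribeSketch registry = new ShardResubscribeSketch(issued::add);
        registry.subscribe("orders");
        registry.subscribe("payments");
        registry.onReconnected(); // simulated reconnect after the shard went away
        System.out.println(issued);
    }
}
```

In the scenario above, the replay step is what I see happening for SUBSCRIBE and what appears to be missing for SSUBSCRIBE.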

Environment

  • Lettuce version(s): 6.5.5.RELEASE
  • Redis version: 7.2.4 (Valkey 8.0.1)

Additional context

As a tangentially related question, does Lettuce handle slot movement/rebalancing for SSUBSCRIBE, such as when nodes are added and slots are redistributed? I couldn't find documentation on how slot movement works for Sharded Pub/Sub in general, and I'm not familiar enough with Lettuce and Redis to figure it out from reading the code, though I gave it a try. My main concern is whether this is something I'd need to implement myself, or whether Lettuce handles it.
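
For context on the slot question: the Redis documentation states that shard channels are assigned to slots by the same algorithm used for keys (CRC16 of the name modulo 16384, with hash tag support), so a given shard channel belongs to exactly one slot and moves with it. The sketch below computes that slot using only the JDK. The class name ShardChannelSlot and the sample channel names are made up for illustration, and it says nothing about what Lettuce does when a slot migrates.

```java
import java.nio.charset.StandardCharsets;

/** Sketch: compute the Redis Cluster hash slot that a shard channel name maps to. */
final class ShardChannelSlot {

    /** CRC16, XMODEM variant (polynomial 0x1021, initial value 0), over data[from, to). */
    static int crc16(byte[] data, int from, int to) {
        int crc = 0;
        for (int i = from; i < to; i++) {
            crc ^= (data[i] & 0xFF) << 8;
            for (int bit = 0; bit < 8; bit++) {
                crc = (crc & 0x8000) != 0 ? (crc << 1) ^ 0x1021 : crc << 1;
                crc &= 0xFFFF; // keep the register at 16 bits
            }
        }
        return crc;
    }

    /** Index of the first occurrence of b at or after start, or -1 if absent. */
    private static int indexOf(byte[] data, char b, int start) {
        for (int i = start; i < data.length; i++) {
            if (data[i] == (byte) b) {
                return i;
            }
        }
        return -1;
    }

    /** Slot for a channel name; only a non-empty {hash tag} is hashed when one is present. */
    static int slotFor(String channel) {
        byte[] bytes = channel.getBytes(StandardCharsets.UTF_8);
        int from = 0;
        int to = bytes.length;
        int open = indexOf(bytes, '{', 0);
        if (open >= 0) {
            int close = indexOf(bytes, '}', open + 1);
            if (close > open + 1) { // at least one byte between the braces
                from = open + 1;
                to = close;
            }
        }
        return crc16(bytes, from, to) % 16384;
    }

    public static void main(String[] args) {
        // Hypothetical channel names; the two tagged names share a slot by construction.
        for (String channel : new String[] {"foo", "{orders}.eu", "{orders}.us"}) {
            System.out.println(channel + " -> slot " + slotFor(channel));
        }
    }
}
```

If the slot for a channel is reassigned to another node, a subscriber on the old node would presumably need to re-issue SSUBSCRIBE against the new owner, which is the part I'm unsure Lettuce handles.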
