Fix sentinel reconnect loop due to unresponsive sentinel #76

dhruvjain99 · 2023-02-13T03:57:48Z

Specifically handle connection errors for sentinel process to prevent termination.

Fixes #75

…st in the endpoints list

dhruvjain99 · 2023-02-13T04:00:18Z

@zuiderkwast Please have a look.

zuiderkwast

OK, I understand the problem. I have some small suggestions.

To be honest, trap_exit is not the perfect way to handle processes. It is easy to miss something. It would be better to use a supervisor. Maybe we can do that in the future...

zuiderkwast · 2023-02-13T13:36:38Z

src/eredis_sentinel.erl

 %% Current sentinel connection broken
+handle_info({'EXIT', _Pid, {connection_error, _}}, S) ->
+    {noreply, S};
 handle_info({'EXIT', Pid, _Reason}, #eredis_sentinel_state{conn_pid = Pid} = S) ->
    {noreply, S#eredis_sentinel_state{conn_pid = undefined}};


When the connection fails with reason econnrefused, eredis:start_link returns {error, econnrefused} and, at the same time, the eredis_client process exits with reason {connection_error, econnrefused}. The 'EXIT' message waits in the mailbox while query_master connects to another Redis Sentinel node and stores the successful pid in conn_pid. When handle_info is later called for the stale 'EXIT' message, its Pid doesn't match conn_pid.

Is this correct?

I think we need a better way to distinguish between this 'EXIT' message and another linked process exiting. Maybe it's better to do an explicit receive {'EXIT', _, {connection_error, _}} -> ok end after eredis:start_link returns an error, just before we try the next connection?
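A minimal sketch of that idea, assuming the caller traps exits. The module name, flush_connection_exit/1 and its timeout argument are hypothetical and not part of eredis; spawn_link stands in for a failed eredis:start_link. The receive is bounded by a timeout because the 'EXIT' message may arrive slightly after the error return, and an unbounded receive would hang if it never arrives.

```erlang
-module(exit_flush_sketch).
-export([demo/0, flush_connection_exit/1]).

%% Hypothetical helper (not part of eredis): consume at most one pending
%% 'EXIT' message whose reason is {connection_error, _}, waiting up to
%% Timeout milliseconds for it to arrive.
flush_connection_exit(Timeout) ->
    receive
        {'EXIT', _Pid, {connection_error, _}} -> flushed
    after Timeout ->
        nothing
    end.

demo() ->
    Old = process_flag(trap_exit, true),
    %% Stand-in for a failed connection attempt: a linked process that
    %% exits with a connection_error reason.
    _Pid = spawn_link(fun() -> exit({connection_error, econnrefused}) end),
    Result = flush_connection_exit(1000),
    %% After flushing, no stale 'EXIT' should remain in the mailbox.
    Leftover = receive
                   {'EXIT', _, _} = Msg -> Msg
               after 0 ->
                   none
               end,
    process_flag(trap_exit, Old),
    {Result, Leftover}.
```

Matching on the reason alone cannot tell which failed connection the message belongs to, so this only works if the flush happens right after each failed attempt and before the next one starts.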

zuiderkwast · 2023-02-13T14:00:49Z

src/eredis_sentinel.erl

+handle_info({'EXIT', Pid, Reason}, S) ->
+    ?LOG_ERROR("eredis_sentinel: Exit from pid ~p with reason ~p and state ~p", [Pid, Reason, S]),
    {stop, normal, S};


Is this always an error?

When eredis is started with sentinel options and later stopped with eredis:stop(Pid), is this message logged? That's not an error. Also, I don't think we should log the state. It looks like leftover debugging.

How about instead exit with the same reason as the linked process? Then an error will be logged automatically if Reason is anything other than normal.

Suggested change

-handle_info({'EXIT', Pid, Reason}, S) ->
-    ?LOG_ERROR("eredis_sentinel: Exit from pid ~p with reason ~p and state ~p", [Pid, Reason, S]),
-    {stop, normal, S};
+handle_info({'EXIT', _Pid, Reason}, S) ->
+    {stop, Reason, S};
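To illustrate the effect of stopping with the linked process's reason, here is a small standalone sketch. It is not the eredis_sentinel code: the module name, the link_worker call and demo/1 are hypothetical. A gen_server that stops with reason normal, shutdown or {shutdown, _} terminates quietly; any other reason produces an error report from OTP, so no explicit log call is needed.

```erlang
-module(exit_propagation_sketch).
-behaviour(gen_server).
-export([start/0, demo/1]).
-export([init/1, handle_call/3, handle_cast/2, handle_info/2]).

start() ->
    gen_server:start(?MODULE, [], []).

init([]) ->
    process_flag(trap_exit, true),
    {ok, #{}}.

%% Hypothetical request: link a worker that exits at once with Reason.
handle_call({link_worker, Reason}, _From, S) ->
    Pid = spawn_link(fun() -> exit(Reason) end),
    {reply, Pid, S}.

handle_cast(_Msg, S) ->
    {noreply, S}.

%% Stop with the same reason as the linked process.
handle_info({'EXIT', _Pid, Reason}, S) ->
    {stop, Reason, S};
handle_info(_Other, S) ->
    {noreply, S}.

%% Returns the reason the server terminated with, or timeout.
demo(Reason) ->
    {ok, Server} = start(),
    Ref = monitor(process, Server),
    _ = gen_server:call(Server, {link_worker, Reason}),
    receive
        {'DOWN', Ref, process, Server, DownReason} -> DownReason
    after 1000 ->
        timeout
    end.
```

Calling exit_propagation_sketch:demo(normal) should stop the server silently, while demo({connection_error, econnrefused}) should additionally emit an error report.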

zuiderkwast · 2023-02-13T14:05:52Z

test/eredis_sentinel_SUITE.erl

    process_flag(trap_exit, false),
    ?assertEqual(died, IsDead).

+t_reconnect_success_on_sentinel_connection_break_mix_endpoints(Config) when is_list(Config) ->


Please add some comments inside this test case to make it easier to understand what's happening.

Fix sentinel reconnect loop due to unresponsive sentinel which is fir…

1324358

…st in the endpoints list

zuiderkwast reviewed Feb 13, 2023
