@@ -375,6 +375,92 @@ def __init__(
- Example: 100ms RTT + SSL = ~300-500ms handshake time
- Consider TLS session resumption to reduce reconnection overhead

+ socket_keepalive (Optional[bool], default=None):
+ Main description: Enable TCP keepalive to detect dead connections.
+
+ What is TCP keepalive:
+ TCP keepalive is a mechanism where the operating system periodically sends
+ small probe packets on idle connections to verify the remote endpoint is
+ still reachable. If the remote side doesn't respond after several probes,
+ the connection is considered dead and closed. This happens at the TCP level,
+ below the application layer.
+
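As a rough illustration of the mechanism described above, the sketch below switches keepalive on for a plain, unconnected TCP socket using only the standard `socket` module, then reads the flag back. It does not use redis-py and sends no traffic; it only shows the OS-level switch that `socket_keepalive=True` is assumed to correspond to.

```python
import socket

# Create an unconnected TCP socket; nothing is sent over the network here.
sock = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
try:
    # Keepalive is off by default on a freshly created socket.
    before = sock.getsockopt(socket.SOL_SOCKET, socket.SO_KEEPALIVE)

    # Ask the OS to probe this connection whenever it sits idle.
    sock.setsockopt(socket.SOL_SOCKET, socket.SO_KEEPALIVE, 1)

    # The value read back is non-zero when enabled (exact value varies by OS).
    after = sock.getsockopt(socket.SOL_SOCKET, socket.SO_KEEPALIVE)
finally:
    sock.close()

print(f"keepalive before: {before}, after: {after}")
```

When and how often probes are actually sent is controlled separately by the per-platform TCP options, or by system defaults when none are set.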
+ Why keepalive is needed:
+ By default the Redis server never closes idle client connections (its
+ `timeout` setting defaults to 0). However, network issues, client crashes,
+ or intermediate devices (firewalls, NAT, proxies) can cause "half-open"
+ connections where one side thinks the connection is alive but the other
+ side is unreachable. Without keepalive, these dead connections can
+ accumulate and consume resources until something else detects them.
+
+ How keepalive improves reconnection:
+ When keepalive detects a dead connection, the OS marks the socket as failed,
+ so the next operation on it errors out promptly instead of blocking until a
+ socket timeout expires. redis-py can then discard the broken connection and
+ establish a new one without first wasting time on retries.
+
+ Recommended values:
+ - Production systems: True (recommended for all connections)
+ - Connection pools: True (essential - affects all pool connections)
+ - Development/testing: False or None (for simplicity)
+ Trade-offs:
+ - True: Detects dead connections at the cost of a few extra packets, sent only while a connection is idle
+ - False: No probe traffic, but dead connections may go unnoticed until the next command fails or times out
+ Related parameters: socket_keepalive_options, health_check_interval
+ Common issues:
+ - Firewall interference: Some firewalls drop keepalive packets
+ - Resource usage: Keepalive packets consume a small amount of bandwidth
+ - Timing conflicts: Probes may overlap with application-level health checks (health_check_interval)
+ - NAT timeouts: Keepalive keeps NAT table entries from expiring only if probes are sent more often than the NAT idle timeout
+
+ socket_keepalive_options (Optional[Mapping[int, Union[int, bytes]]], default=None):
+ Main description: Advanced TCP keepalive socket options, given as a mapping of socket option constant to value.
+
+ Available options reference:
+ - Python socket module: import socket; help(socket) or dir(socket)
+ - Common constants: socket.TCP_KEEPIDLE, socket.TCP_KEEPINTVL, socket.TCP_KEEPCNT
+ - Platform-specific: socket.TCP_KEEPALIVE (macOS, Python 3.10+), socket.TCP_USER_TIMEOUT (Linux)
+ - Online reference: https://docs.python.org/3/library/socket.html#socket-families
+ - System documentation: man 7 tcp (Linux), man 4 tcp (BSD/macOS)
+
+ Recommended values (times in seconds; TCP_KEEPCNT is a probe count):
+ - Linux: {socket.TCP_KEEPIDLE: 30, socket.TCP_KEEPINTVL: 10, socket.TCP_KEEPCNT: 3}
+ - macOS: {socket.TCP_KEEPALIVE: 30, socket.TCP_KEEPINTVL: 10, socket.TCP_KEEPCNT: 3}
+ - Windows: {socket.TCP_KEEPIDLE: 30, socket.TCP_KEEPINTVL: 10, socket.TCP_KEEPCNT: 3}
+ - Default: None (use system defaults)
+ - Custom: Tune based on network characteristics
+
+ How to discover available options:
+ ```python
+ import socket
+ # List all TCP-related constants
+ tcp_options = [attr for attr in dir(socket) if attr.startswith('TCP_')]
+ print(tcp_options)
+
+ # Check if specific option exists on your platform
+ if hasattr(socket, 'TCP_KEEPIDLE'):
+     print(f"TCP_KEEPIDLE = {socket.TCP_KEEPIDLE}")
+
+     # Example configuration: first probe after 30s idle, then every 10s, give up after 3 failed probes
+     keepalive_opts = {socket.TCP_KEEPIDLE: 30, socket.TCP_KEEPINTVL: 10, socket.TCP_KEEPCNT: 3}
+ ```
+
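The per-platform recommendations above can be folded into one dictionary builder. This is a minimal sketch, not part of redis-py: `build_keepalive_options` is a hypothetical helper name, and the 30/10/3 defaults are simply the values suggested in this docstring.

```python
import socket
from typing import Dict


def build_keepalive_options(idle: int = 30, interval: int = 10, count: int = 3) -> Dict[int, int]:
    """Hypothetical helper: map timings onto whichever constants this platform exposes."""
    options: Dict[int, int] = {}

    # Idle time before the first probe: TCP_KEEPIDLE where available,
    # otherwise TCP_KEEPALIVE (the macOS counterpart).
    if hasattr(socket, "TCP_KEEPIDLE"):
        options[socket.TCP_KEEPIDLE] = idle
    elif hasattr(socket, "TCP_KEEPALIVE"):
        options[socket.TCP_KEEPALIVE] = idle

    # Seconds between probes, and number of failed probes before giving up.
    if hasattr(socket, "TCP_KEEPINTVL"):
        options[socket.TCP_KEEPINTVL] = interval
    if hasattr(socket, "TCP_KEEPCNT"):
        options[socket.TCP_KEEPCNT] = count

    return options


keepalive_opts = build_keepalive_options()
print(keepalive_opts)
```

The resulting mapping could then be passed as `socket_keepalive_options=keepalive_opts` together with `socket_keepalive=True` when creating a client. On a platform that exposes none of these constants the helper returns an empty dict, which leaves the system defaults in effect.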
+ Trade-offs:
+ - Custom options: Fine-tuned detection but platform-specific
+ - System defaults: Portable but may not be optimal
+ Related parameters: socket_keepalive (must be True for these options to be applied)
+ Use cases:
+ - High-availability systems: Aggressive keepalive settings
+ - Satellite/slow networks: Longer intervals
+ - Container environments: Shorter intervals for faster detection
+ Common issues:
+ - Platform differences: Options vary between OS (use hasattr() to check)
+ - Invalid options: May cause the connection attempt to fail when the option is applied to the socket
+ - Firewall interference: Aggressive settings may be blocked
+ - Constant availability: Not all TCP options are available on all platforms
+ Performance implications:
+ - More frequent keepalive packets increase network usage
+ - Faster dead connection detection improves reliability
+
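To make the detection-speed trade-off concrete, the worst-case time to notice a silent peer can be approximated as the idle time plus one probe interval per allowed probe. The function name below is hypothetical, and the formula is a simplified model; actual timing depends on the operating system's TCP stack.

```python
def estimate_detection_seconds(idle: int, interval: int, count: int) -> int:
    """Approximate time before the OS declares an idle, unresponsive peer dead."""
    # Wait `idle` seconds, then send `count` probes spaced `interval` seconds apart.
    return idle + interval * count


# Settings recommended in this docstring: 30s idle, 10s interval, 3 probes.
aggressive = estimate_detection_seconds(idle=30, interval=10, count=3)

# Typical Linux system defaults: 7200s idle, 75s interval, 9 probes.
linux_defaults = estimate_detection_seconds(idle=7200, interval=75, count=9)

print(aggressive)      # 60
print(linux_defaults)  # 7875
```

Under this model the recommended settings detect a dead peer in about a minute, whereas typical Linux defaults take more than two hours, which is why relying on system defaults may not be optimal for long-lived client connections.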
To specify a retry policy for specific errors, you have two options:

1. Set the `retry_on_error` to a list of the error/s to retry on, and