Description
Summary
The current RedisItemReader implementation suffers from the classic N+1 problem, executing individual GET commands for each key during batch processing. This creates significant performance bottlenecks due to excessive network round-trips when processing large datasets.
For example, processing 1,000 Redis keys results in 1,000 separate network calls, causing:
- High latency due to network round-trip time (RTT) multiplication
- Excessive load on Redis server
- Poor performance scaling with dataset size
Problem Description
Current Implementation Issue:
@Override
public V read() throws Exception {
    if (this.cursor.hasNext()) {
        K nextKey = this.cursor.next();
        return this.redisTemplate.opsForValue().get(nextKey); // 💀 Individual GET call for each key
    }
    return null;
}
Performance Impact:
- 1,000 keys → 1,000 network round-trips
- Total network wait ≈ round-trip latency × key count
- Redis server experiences request flooding
Proposed Solution
Add a batchSize parameter to RedisItemReader and RedisItemReaderBuilder that enables batch processing via Redis MGET operations while maintaining complete backward compatibility.
Key Features:
- Default batchSize = 1 preserves existing behavior (100% backward compatible)
- batchSize > 1 enables optimized batch processing using Redis MGET
- Internal buffering via queue for efficient data handling
- Significant performance improvement with configurable memory usage
Expected Performance Improvement:
- 1,000 keys with batchSize = 100: 99% reduction in network calls (10 vs 1,000)
- 1,000 keys with batchSize = 1000: 99.9% reduction in network calls (1 vs 1,000)
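The figures above follow from a ceiling division: the number of MGET calls is the key count divided by the batch size, rounded up. A minimal sketch of that arithmetic, where the class and method names are hypothetical and not part of Spring Batch:

```java
// Hypothetical helper for illustration only; not part of Spring Batch.
class RoundTripEstimate {

    /** Number of MGET calls needed for keyCount keys: ceil(keyCount / batchSize). */
    static long roundTrips(long keyCount, int batchSize) {
        if (keyCount < 0 || batchSize < 1) {
            throw new IllegalArgumentException("keyCount must be >= 0 and batchSize >= 1");
        }
        return (keyCount + batchSize - 1) / batchSize;
    }

    /** Percentage of round-trips saved compared with one GET per key. */
    static double reductionPercent(long keyCount, int batchSize) {
        if (keyCount == 0) {
            return 0.0;
        }
        return 100.0 * (keyCount - roundTrips(keyCount, batchSize)) / keyCount;
    }

    public static void main(String[] args) {
        long keys = 1_000;
        for (int batchSize : new int[] {1, 100, 1_000}) {
            System.out.printf("batchSize=%d: %d round-trips, %.1f%% fewer than per-key GET%n",
                    batchSize, roundTrips(keys, batchSize), reductionPercent(keys, batchSize));
        }
    }
}
```

For 1,000 keys this yields 10 round-trips at batchSize = 100 and a single round-trip at batchSize = 1000. A key count that is not a multiple of the batch size adds one final, partially filled batch.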
Implementation Approach
Enhanced RedisItemReader:
- Add batchSize field with default value of 1
- Implement conditional logic: single-key mode vs batch mode
- Use internal Queue for buffering batch results
- Leverage RedisTemplate.opsForValue().multiGet() for batch operations
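The buffering logic described above could look roughly like the following. This is a sketch under stated assumptions, not the proposed patch: BatchingReader is a hypothetical, Redis-free stand-in in which the key iterator plays the role of the SCAN cursor and the bulkFetch function plays the role of redisTemplate.opsForValue().multiGet(keys).

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Collection;
import java.util.Deque;
import java.util.Iterator;
import java.util.List;
import java.util.function.Function;

// Hypothetical stand-in for the batch mode of RedisItemReader.
class BatchingReader<K, V> {

    private final Iterator<K> cursor;                          // assumed: the SCAN cursor
    private final Function<Collection<K>, List<V>> bulkFetch;  // assumed: wraps multiGet
    private final int batchSize;
    private final Deque<V> buffer = new ArrayDeque<>();

    BatchingReader(Iterator<K> cursor, Function<Collection<K>, List<V>> bulkFetch, int batchSize) {
        if (batchSize < 1) {
            throw new IllegalArgumentException("batchSize must be >= 1");
        }
        this.cursor = cursor;
        this.bulkFetch = bulkFetch;
        this.batchSize = batchSize;
    }

    /** Returns the next value, or null once both the buffer and the cursor are exhausted. */
    V read() {
        // Refill until at least one value is buffered or no keys remain,
        // so a batch made up entirely of missing keys does not end the read early.
        while (buffer.isEmpty() && cursor.hasNext()) {
            List<K> keys = new ArrayList<>(batchSize);
            while (keys.size() < batchSize && cursor.hasNext()) {
                keys.add(cursor.next());
            }
            List<V> values = bulkFetch.apply(keys);
            if (values != null) {
                for (V value : values) {
                    if (value != null) { // skip keys that had no value
                        buffer.add(value);
                    }
                }
            }
        }
        return buffer.poll();
    }
}
```

With batchSize = 1 this degenerates to one fetch per key, which is how the default would preserve the current request pattern. The refill loop matters because a null return from read() is what tells Spring Batch that the input is exhausted.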
Enhanced RedisItemReaderBuilder:
- Add batchSize(int batchSize) method with validation
- Pass batchSize to RedisItemReader constructor
- Comprehensive JavaDoc documentation
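The builder-side validation might be sketched as follows. BatchSizeBuilderSketch is a hypothetical class that isolates only the new setter; the 1-1000 bounds are taken from the Technical Considerations section of this issue, and the real RedisItemReaderBuilder would carry its existing redisTemplate and scanOptions settings alongside.

```java
// Hypothetical sketch of the proposed setter; not the actual RedisItemReaderBuilder.
class BatchSizeBuilderSketch {

    static final int MIN_BATCH_SIZE = 1;     // default, keeps one GET per key
    static final int MAX_BATCH_SIZE = 1_000; // upper bound proposed in this issue

    private int batchSize = MIN_BATCH_SIZE;

    /**
     * Sets how many keys are fetched per bulk read.
     * @param batchSize a value between 1 and 1000 inclusive
     * @return this builder, for method chaining
     */
    BatchSizeBuilderSketch batchSize(int batchSize) {
        if (batchSize < MIN_BATCH_SIZE || batchSize > MAX_BATCH_SIZE) {
            throw new IllegalArgumentException(
                    "batchSize must be between " + MIN_BATCH_SIZE + " and " + MAX_BATCH_SIZE
                            + ", but was " + batchSize);
        }
        this.batchSize = batchSize;
        return this;
    }

    int getBatchSize() {
        return this.batchSize;
    }
}
```

Rejecting out-of-range values in the setter surfaces a misconfiguration when the builder is assembled, before any keys are read.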
Usage Examples:
// Backward compatible (existing behavior)
RedisItemReader<String, Object> reader = new RedisItemReaderBuilder<String, Object>()
    .redisTemplate(template)
    .scanOptions(scanOptions)
    .build(); // Uses batchSize = 1

// Optimized batch processing
RedisItemReader<String, Object> optimizedReader = new RedisItemReaderBuilder<String, Object>()
    .redisTemplate(template)
    .scanOptions(scanOptions)
    .batchSize(100) // Process 100 keys per MGET
    .build();
Benefits
- Performance Enhancement: Network round-trips drop from one per key to one per batch
- Backward Compatibility: Zero breaking changes to existing code
- Configurable Optimization: Adjustable batch size based on requirements
- Memory Efficiency: Controlled memory usage through batch size limits
- Spring Framework Alignment: Follows Spring's progressive enhancement philosophy
Technical Considerations
Memory Management:
- Batch size validation (1-1000 range) caps how many values are buffered at once, guarding against out-of-memory errors
- Internal queue management with proper cleanup
- Null value filtering: MGET returns null for keys that no longer exist, so those entries are dropped before buffering
Redis Limitations Awareness:
- Maintains existing limitations (no restart capability due to SCAN nature)
- Preserves idempotency requirements
- Handles duplicate key scenarios appropriately
Additional Context
This enhancement addresses a fundamental performance issue in Spring Batch's Redis integration while maintaining the framework's commitment to backward compatibility. The implementation follows Spring's architectural principles and provides a foundation for future Redis-related optimizations.
Related Issues:
- Classic N+1 problem pattern in ORM and data access layers
- Network optimization in distributed systems
- Batch processing performance in Spring Batch
Environment:
- Spring Batch version: 5.1+
- Spring Data Redis: 3.x+
- Java: 17+