

@JSv4 JSv4 commented Dec 27, 2025

Implements per-user and per-IP rate limiting for WebSocket connections, mirroring the existing GraphQL rate limiting infrastructure.

Changes:

  • Add new rate limiting module (config/websocket/ratelimits.py) with check_rate_limit functions, WebSocketRateLimits config class, and RateLimitedConsumerMixin
  • Add rate limiting middleware for connection-level limits that rejects excessive connection attempts with close code 4029
  • Add message-level rate limiting to all WebSocket consumers: UnifiedAgentConsumer, DocumentQueryConsumer, CorpusQueryConsumer, StandaloneDocumentQueryConsumer, and ThreadUpdatesConsumer
  • Add configurable rate limit settings via environment variables
  • Integrate rate limiting middleware into ASGI stack after authentication
  • Add comprehensive test suite for rate limiting utilities and integration
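The connection-level middleware described above could be sketched roughly as follows. This is an illustrative ASGI middleware, not the PR's actual code: the class name follows the review's references, and the `is_rate_limited` predicate is injected purely for the sketch.

```python
class RateLimitMiddleware:
    """Sketch: reject rate-limited WebSocket connects with close code 4029."""

    def __init__(self, app, is_rate_limited):
        self.app = app
        self.is_rate_limited = is_rate_limited  # predicate(scope) -> bool, assumed for the sketch

    async def __call__(self, scope, receive, send):
        if scope.get("type") == "websocket" and self.is_rate_limited(scope):
            await receive()  # consume the websocket.connect event
            await send({"type": "websocket.close", "code": 4029})
            return
        await self.app(scope, receive, send)
```

Because the check happens before the inner application is invoked, a flooded client never causes a consumer to be instantiated.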

Rate limits:

  • Authenticated users: 30 connections/min, 60 messages/min, 20 AI queries/min
  • Anonymous users: 10 connections/min, 20 messages/min, 5 AI queries/min
  • Superusers get 5x the normal limits
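These defaults might be surfaced via environment variables along the following lines. The `RATELIMIT_WS_*` variable names follow the naming pattern quoted elsewhere in this thread; treat the exact keys as assumptions.

```python
import os

# Defaults mirror the authenticated-user limits listed above;
# variable names are assumptions following the PR's RATELIMIT_WS_* pattern.
WEBSOCKET_RATE_LIMITS = {
    "WS_CONNECT": os.environ.get("RATELIMIT_WS_CONNECT", "30/m"),
    "WS_MESSAGE": os.environ.get("RATELIMIT_WS_MESSAGE", "60/m"),
    "WS_AI_QUERY": os.environ.get("RATELIMIT_WS_AI_QUERY", "20/m"),
}
```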

Closes #730



claude bot commented Dec 27, 2025

Code Review: WebSocket Rate Limiting Implementation

This is a well-designed and thorough implementation of WebSocket rate limiting. The code demonstrates excellent architectural decisions, strong security practices, and comprehensive test coverage. Below is my detailed review.


✅ Strengths

1. Excellent Architecture & Design

  • Two-level rate limiting: Connection-level (middleware) + message-level (consumers) provides defense in depth
  • Consistent with existing patterns: Mirrors the GraphQL rate limiting infrastructure, making it familiar to the team
  • Proper separation of concerns: Utilities, middleware, and consumer integration are cleanly separated
  • DRY implementation: Shared _check_message_rate_limit method across all consumers eliminates duplication
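The shared mixin pattern can be sketched as below. All names, limits, and the in-memory dict (standing in for the Django cache) are assumptions for illustration; the real mixin delegates to the cache-backed checker.

```python
import asyncio

class RateLimitedConsumerMixin:
    """Sketch of a shared message-limit mixin; a dict stands in for the cache."""
    message_limit = 3  # small limit for illustration

    def __init__(self):
        self._counts = {}
        self.sent = []

    async def check_message_rate_limit(self, user_id):
        """Return True if this message is allowed; emit an error payload if not."""
        key = f"ws_msg:user:{user_id}"
        self._counts[key] = self._counts.get(key, 0) + 1
        if self._counts[key] > self.message_limit:
            await self.send_json({"msg_type": "RATE_LIMITED"})
            return False
        return True

    async def send_json(self, payload):  # stand-in for Channels' send_json
        self.sent.append(payload)
```

Each consumer then calls the single shared check at the top of its receive handler instead of duplicating the logic.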

2. Security Best Practices

  • Defense in depth: Multiple layers prevent both connection flooding and message spam
  • Per-user tracking for authenticated users: Prevents account enumeration via IP-based limits
  • Per-IP tracking for anonymous users: Proper fallback for unauthenticated connections
  • Superuser handling: 5x limits for superusers is reasonable and configurable
  • Fail-safe error handling: RATELIMIT_FAIL_OPEN setting allows graceful degradation
  • Proper logging: Security events are logged with sufficient context for monitoring

3. Code Quality

  • Clear, comprehensive documentation: Every function and class has excellent docstrings
  • Type hints: Proper use of type annotations throughout
  • Error handling: Robust exception handling with fallback behavior
  • No magic numbers: All rate limits are configurable via environment variables
  • Standards compliance: WebSocket close code 4029 is correctly in the application-reserved range (4000-4999)

4. Test Coverage

  • Comprehensive test suite: 377 lines of tests covering utilities, integration, and edge cases
  • Unit tests: Parse rate, IP extraction, key generation all tested
  • Integration tests: Real WebSocket connections with rate limit verification
  • Mocking strategy: Proper use of mocks to isolate rate limit behavior
  • Marked serial: Correctly uses @pytest.mark.serial for WebSocket tests to avoid pytest-xdist conflicts

5. Excellent Changelog

  • Follows project conventions: Uses Keep a Changelog format
  • Comprehensive documentation: Includes file paths, line numbers, technical details
  • User-facing documentation: Clearly explains rate limits and behavior

🔍 Issues Found

CRITICAL: Race Condition in Cache Operations

Location: config/websocket/ratelimits.py:163-181

Issue: The check_rate_limit function has a race condition between cache.get and cache.incr:

# Line 164: Get current count
current = cache.get(full_key, 0)

is_limited = current >= max_count

if increment and not is_limited:
    if current == 0:
        cache.set(full_key, 1, period_seconds)  # Line 172
    else:
        try:
            cache.incr(full_key)  # Line 176 - RACE CONDITION
        except ValueError:
            cache.set(full_key, 1, period_seconds)

Problem: Between lines 164 and 176, multiple concurrent requests can:

  1. All read the same current value
  2. All see current < max_count
  3. All increment, potentially exceeding the limit

Impact: Under high concurrency, more requests could be allowed through than the configured limit.

Recommended Fix: Use atomic cache operations. Most Django cache backends support cache.get_or_set and atomic increments:

# Atomic increment with TTL
# cache.add() succeeds only if the key is absent, so the TTL is set exactly once
cache.add(full_key, 0, period_seconds)
try:
    current = cache.incr(full_key)  # atomic on Redis/Memcached backends
except ValueError:
    # Key expired between add() and incr(); recreate it
    cache.set(full_key, 1, period_seconds)
    current = 1

is_limited = current > max_count

Alternatively, use Redis's INCR command directly for guaranteed atomicity if using Redis backend.
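To illustrate why atomicity closes the gap, here is a minimal in-memory counter with Redis-INCR-like semantics; a lock stands in for Redis's single-threaded execution. This is a sketch for explanation, not the project's cache backend.

```python
import threading
import time

class AtomicCounterStore:
    """In-memory stand-in for a cache with atomic increment-with-TTL."""

    def __init__(self):
        self._lock = threading.Lock()
        self._data = {}  # key -> (count, expires_at)

    def incr_with_ttl(self, key, ttl_seconds):
        """Atomically increment key, resetting it once the TTL has elapsed."""
        now = time.monotonic()
        with self._lock:
            count, expires_at = self._data.get(key, (0, now))
            if now >= expires_at:
                count, expires_at = 0, now + ttl_seconds
            count += 1
            self._data[key] = (count, expires_at)
            return count

def check_rate_limit(store, key, max_count, period_seconds):
    """Return True when this request exceeds max_count per period."""
    return store.incr_with_ttl(key, period_seconds) > max_count
```

Because read-modify-write happens under the lock, exactly `max_count` requests are admitted per window regardless of concurrency.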


MEDIUM: DRY Violation - Period Name Mapping

Locations: Multiple files have identical period name mapping:

  • config/websocket/ratelimits.py:245-246, 360-362, 459-461
  • config/websocket/middlewares/ratelimit_middleware.py:119-124
  • config/websocket/consumers/corpus_conversation.py:124-126
  • config/websocket/consumers/document_conversation.py:197-199
  • config/websocket/consumers/standalone_document_conversation.py:174-176
  • config/websocket/consumers/thread_updates.py:178-180
  • config/websocket/consumers/unified_agent_conversation.py:245-247

Issue: The period mapping dictionary is duplicated 8+ times:

period_name = {1: "second", 60: "minute", 3600: "hour", 86400: "day"}.get(period, "period")

Recommended Fix: Extract to a utility function in config/websocket/ratelimits.py:

def get_period_name(period_seconds: int) -> str:
    """Convert period in seconds to human-readable name."""
    return {
        1: "second",
        60: "minute",
        3600: "hour",
        86400: "day"
    }.get(period_seconds, "period")

Then import and use: period_name = get_period_name(period)


MEDIUM: Inconsistent Import of RateLimitedConsumerMixin

Location: config/websocket/consumers/unified_agent_conversation.py:56

Issue: RateLimitedConsumerMixin is imported but never used. The consumer implements its own _check_message_rate_limit method instead of using the mixin.

Questions:

  1. Was the mixin intended to be used but forgotten?
  2. If not, why is it imported?

Recommended Action: Either:

  • Remove the unused import (preferred if mixin doesn't fit the architecture)
  • Use the mixin consistently across consumers (if it provides value)

LOW: Missing Return Type Annotation

Location: config/websocket/ratelimits.py:289

Issue: Line 289 builds the scaled rate string with multiplication, but the types of count and period are not enforced by annotations:

return f"{count * 5}/{period}"  # period is an int in practice, but the type is not annotated

Recommended Fix: Ensure period from parse_rate is always an int (which it is, but type hints would make this explicit).
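A fully annotated parse_rate along those lines might look like the following; the unit suffixes are assumptions based on the period names used elsewhere in this review.

```python
PERIOD_SECONDS = {"s": 1, "m": 60, "h": 3600, "d": 86400}

def parse_rate(rate: str) -> tuple[int, int]:
    """Parse a rate like "30/m" into (count, period_seconds). Sketch only."""
    count_str, _, period_str = rate.partition("/")
    try:
        return int(count_str), PERIOD_SECONDS[period_str]
    except KeyError:
        raise ValueError(f"Unknown rate period: {period_str!r}") from None
```

With the return type pinned to `tuple[int, int]`, downstream arithmetic on `count` and `period` is explicitly integer-typed.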


LOW: Potential for Confusion in Test Coverage

Location: opencontractserver/tests/test_websocket_ratelimits.py:287

Issue: The rate_limit_side_effect function uses a closure variable and string matching:

if "ws_connect" in group or call_count < 3:

Concern: String matching on group is fragile. If the group naming changes, the test will break silently.

Recommended Improvement: Make the test more explicit about what it's checking:

# Allow connection checks and first 2 message checks
if group == "ws_connect":
    return False, {...}  # Always allow connections
elif call_count < 3:
    return False, {...}  # Allow first 2 messages
else:
    return True, {...}  # Rate limit subsequent messages

🎯 Suggestions for Future Enhancements

These are not blockers for this PR, but nice-to-haves for future work:

  1. Rate limit headers in responses: Consider including X-RateLimit-Limit, X-RateLimit-Remaining, X-RateLimit-Reset in the response data for better client-side handling

  2. Metrics/monitoring: Add Prometheus metrics or StatsD counters for rate limit hits to enable operational visibility

  3. Whitelist support: Consider adding a setting for whitelisted IPs or users that bypass rate limits (useful for internal services)

  4. Distributed cache requirement documentation: Document that for multi-instance deployments, a shared cache backend (Redis/Memcached) is required

  5. IPv6 support: The code handles IPv6 addresses, but consider documenting whether RATELIMIT_IPV6_MASK applies to WebSocket rate limiting (it appears to be unused here)


📊 Summary

Category Status
Architecture ✅ Excellent
Security ✅ Strong (with one race condition to fix)
Test Coverage ✅ Comprehensive
Documentation ✅ Excellent
Code Quality ✅ High (minor DRY violation)

Recommendation: Approve with requested changes

The race condition in cache operations should be fixed before merging, as it could allow rate limit bypass under high concurrency. The DRY violation is a medium priority cleanup item. Otherwise, this is excellent work that significantly improves the security posture of the WebSocket infrastructure.

Great job on this implementation! 🎉


🔧 Required Changes Before Merge

  1. Fix the cache race condition in config/websocket/ratelimits.py:163-181
  2. Extract period name mapping to a utility function to eliminate duplication

🧹 Optional Cleanup

  1. Remove unused RateLimitedConsumerMixin import or use it consistently
  2. Improve test specificity in test_websocket_ratelimits.py:287


claude bot commented Dec 27, 2025

Code Review: WebSocket Rate Limiting Implementation

This is a well-architected PR that successfully implements comprehensive rate limiting for WebSocket connections. The code quality is high and follows the repository's conventions. Here's my detailed review:


✅ Strengths

1. Excellent Architecture & Code Reuse

  • Unified rate limiting infrastructure: The refactoring to create a shared config/ratelimits module is outstanding. This eliminates code duplication between GraphQL and WebSocket rate limiting.
  • Clean separation of concerns: The module structure is logical:
    • config/ratelimits/core.py: Shared utilities
    • config/ratelimits/cache.py: Cache-based implementation
    • config/ratelimits/tiers.py: User tier logic
    • config/ratelimits/config.py: Unified configuration
  • Backward compatibility: Re-exports and delegation patterns ensure existing GraphQL code continues to work.

2. Two-Level Rate Limiting

The implementation correctly applies rate limiting at both levels:

  1. Connection-level (RateLimitMiddleware in config/websocket/middlewares/ratelimit_middleware.py): Rejects excessive connection attempts before consumers are instantiated
  2. Message-level: Each consumer checks rate limits on incoming messages

This is the right approach and prevents both connection flooding and message flooding.

3. Comprehensive Test Coverage

The test suite (opencontractserver/tests/test_websocket_ratelimits.py) is thorough:

  • Unit tests for utilities (TestRateLimitUtilities)
  • Integration tests with actual consumers (WebSocketRateLimitIntegrationTestCase)
  • Connection-level tests (WebSocketConnectionRateLimitTestCase)
  • Tests for disabled state, authenticated vs anonymous users, rate limit exceeded scenarios

4. Proper Async Handling

The check_rate_limit_async function correctly uses database_sync_to_async to wrap synchronous cache operations, making it safe for async WebSocket consumers.

5. Consistent User Tier Multipliers

Superusers get 10x limits (not 5x as mentioned in PR description), authenticated users get 2x, and usage-capped users get 0.5x—consistent with GraphQL tier multipliers.
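The tier logic could look roughly like this sketch; the multiplier values are those reported in this review, but the function signature and constant names are assumptions rather than the actual `config/ratelimits/tiers.py` API.

```python
# Multipliers as reported in this review; the exact API is assumed
TIER_MULTIPLIERS = {
    "superuser": 10.0,
    "authenticated": 2.0,
    "usage_capped": 0.5,
    "anonymous": 1.0,
}

def get_tier_multiplier(is_superuser: bool, is_authenticated: bool,
                        is_usage_capped: bool = False) -> float:
    """Pick the multiplier for a user; the most privileged tier wins."""
    if is_superuser:
        return TIER_MULTIPLIERS["superuser"]
    if is_usage_capped:
        return TIER_MULTIPLIERS["usage_capped"]
    if is_authenticated:
        return TIER_MULTIPLIERS["authenticated"]
    return TIER_MULTIPLIERS["anonymous"]

def apply_multiplier(rate: str, multiplier: float) -> str:
    """Scale a "count/period" rate string, e.g. "30/m" with 10x becomes "300/m"."""
    count, period = rate.split("/")
    return f"{int(int(count) * multiplier)}/{period}"
```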

6. CHANGELOG Documentation

The CHANGELOG entry is detailed and follows the project's format, documenting file locations and line numbers.


🔍 Issues & Recommendations

1. Race Condition in Cache Increment ⚠️ MEDIUM SEVERITY

Location: config/ratelimits/cache.py lines 84-103

Issue: The cache increment logic has a race condition:

# Get current count
current = cache.get(full_key, 0)

is_limited = current >= max_count

if increment and not is_limited:
    if current == 0:
        cache.set(full_key, 1, period_seconds)
    else:
        try:
            cache.incr(full_key)  # Race here!
        except ValueError:
            cache.set(full_key, 1, period_seconds)
    current += 1

Problem: Between cache.get() and cache.incr(), another request could increment the counter. This means:

  • Two concurrent requests could both see current=9 (below limit of 10)
  • Both increment, resulting in current=11 (exceeding limit)
  • Result: Rate limits can be slightly exceeded under high concurrency

Impact:

  • Moderate severity—doesn't break functionality but undermines rate limiting effectiveness
  • Most noticeable with high-concurrency WebSocket connections
  • Could allow burst attacks to exceed limits by ~10-20% under load

Recommendation: Use atomic cache operations or implement a distributed lock pattern. Options:

  1. Use Redis INCR (atomic) if using Redis cache backend
  2. Use cache.add() for atomic test-and-set
  3. Accept the race condition but document it (acceptable for soft rate limiting)

Since this is a soft rate limit (not security-critical), option 3 is reasonable for MVP, but add a comment:

# Note: This has a race condition between get() and incr() that can allow
# slightly exceeding rate limits under high concurrency. This is acceptable
# for soft rate limiting. For stricter enforcement, use atomic Redis operations.

2. Inconsistent Superuser Multiplier in Documentation

Location: PR description and CHANGELOG.md

Issue: Documentation says superusers get 5x limits, but code gives 10x:

  • config/ratelimits/tiers.py:15: "superuser": 10.0
  • PR description: "Superusers get 5x the normal limits"
  • Test at line 136: Expects 300/m (30 * 10)

Recommendation: Update PR description and CHANGELOG to say "10x" instead of "5x".

3. Missing Input Validation in Middleware ⚠️ LOW SEVERITY

Location: config/websocket/middlewares/ratelimit_middleware.py:114

Issue: The parse_rate() function could raise ValueError if the rate configuration is invalid. The except clause catches (ValueError, TypeError) and still sends a close message, which at first glance looks like count and period_seconds could be used while undefined.

Current code:

try:
    count, period_seconds = parse_rate(rate)
    reason = f"Rate limit exceeded: {count}/{period_to_name(period_seconds)}"
except (ValueError, TypeError):
    reason = "Rate limit exceeded"

This is actually correct! The except sets a fallback reason. No issue here—good defensive coding.

4. Test File Import Inconsistency

Location: opencontractserver/tests/test_websocket_ratelimits.py:22-28

Issue: Tests import from config.websocket.ratelimits instead of the new shared module config.ratelimits:

from config.websocket.ratelimits import (
    WebSocketRateLimits,
    check_rate_limit,
    get_client_ip_from_scope,
    get_rate_limit_key,
    parse_rate,
)

Analysis: This is fine because config/websocket/ratelimits.py re-exports these functions. However, for consistency, consider importing directly from config.ratelimits in new code.

Recommendation: Optional cleanup (not blocking)—update imports to use the canonical source.

5. Close Code Documentation

Location: config/websocket/middlewares/ratelimit_middleware.py:111

Observation: The PR uses close code 4029 for rate limiting. This is well-documented in the code. Good choice—it's in the 4000-4999 application-defined range.

Recommendation: Consider documenting this in a constants file (e.g., opencontractserver/constants/websocket.py) to avoid magic numbers and ensure consistency across consumers:

# opencontractserver/constants/websocket.py
WEBSOCKET_CLOSE_RATE_LIMITED = 4029

Then use:

from opencontractserver.constants.websocket import WEBSOCKET_CLOSE_RATE_LIMITED
await send({"type": "websocket.close", "code": WEBSOCKET_CLOSE_RATE_LIMITED, ...})

This aligns with the repository's "No magic numbers" principle from CLAUDE.md.

6. Message Type Consistency

Location: Various consumer files (e.g., config/websocket/consumers/document_conversation.py:211)

Issue: Rate limit error responses use msg_type="RATE_LIMITED", but this message type isn't defined in any TypedDict or enum.

Recommendation: Ensure frontend code handles the RATE_LIMITED message type. If there's a MessageType enum or TypedDict, add RATE_LIMITED to it for type safety.
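If the message contract is expressed as an enum, adding the new type could look like the sketch below; the enum name and the existing members shown are hypothetical, only RATE_LIMITED comes from this PR.

```python
from enum import Enum

class MessageType(str, Enum):
    """Hypothetical message-type enum; existing members are assumptions."""
    SYNC_CONTENT = "SYNC_CONTENT"  # assumed existing member
    ERROR = "ERROR"                # assumed existing member
    RATE_LIMITED = "RATE_LIMITED"  # added for rate limit responses
```

A `str`-backed enum keeps wire payloads unchanged while giving both backend and any generated frontend types a single source of truth.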


🛡️ Security Considerations

✅ Good Security Practices:

  1. Per-user and per-IP limiting: Correctly implements both strategies
  2. Separate limits for anonymous users: Lower limits for unauthenticated users
  3. Logging: Rate limit violations are logged for monitoring
  4. Fail-safe: RATELIMIT_FAIL_OPEN setting provides control over failure mode

⚠️ Security Notes:

  1. IP Spoofing: The implementation trusts X-Forwarded-For header (config/ratelimits/ip.py:23-27). This is correct for applications behind a reverse proxy, but ensure your reverse proxy (nginx/traefik) is configured to set this header properly and strip client-provided values.

  2. Cache Availability: If cache backend fails and RATELIMIT_FAIL_OPEN=True, rate limiting is bypassed. This is documented, but ensure monitoring alerts on cache failures.
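For reference, extracting the client IP from an ASGI scope along the lines of config/ratelimits/ip.py might look like this sketch, which assumes the first X-Forwarded-For entry is only trustworthy behind a proxy that strips client-provided values.

```python
from typing import Optional

def get_client_ip_from_scope(scope: dict) -> Optional[str]:
    """Prefer X-Forwarded-For (set by the proxy), fall back to scope['client']."""
    headers = dict(scope.get("headers") or [])
    xff = headers.get(b"x-forwarded-for")
    if xff:
        # First entry is the original client when the proxy sanitizes the header
        return xff.decode("latin-1").split(",")[0].strip()
    client = scope.get("client")
    return client[0] if client else None
```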


⚡ Performance Considerations

✅ Good Performance Practices:

  1. Cache-based tracking: Using Django's cache backend is efficient
  2. Early rejection: Connection-level middleware rejects before consumer instantiation
  3. Minimal overhead: Rate limit checks are fast cache lookups

💡 Optimization Opportunities:

  1. Cache backend: Ensure production uses Redis or Memcached, not database cache
  2. Cache key prefixing: Using RATELIMIT_KEY_PREFIX allows namespace isolation—good practice

📝 Code Quality

✅ Strengths:

  • DRY: Excellent code reuse and refactoring
  • Type hints: Most functions have proper type hints
  • Docstrings: Comprehensive documentation
  • Error handling: Proper exception handling with fallbacks
  • Logging: Appropriate logging levels (warning for violations)

💡 Minor Suggestions:

  1. Type hints for user: The user parameter in many functions lacks type hints. Consider:

    from typing import Optional, Union
    from django.contrib.auth.models import AbstractBaseUser, AnonymousUser
    
    def get_tier_multiplier(user: Optional[Union[AbstractBaseUser, AnonymousUser]]) -> float:
  2. Return type consistency: RateLimitedConsumerMixin._check_rate_limit could use explicit return type hint.


🧪 Testing

✅ Good Testing Practices:

  • Tests marked with @pytest.mark.serial for WebSocket async compatibility
  • Mocking used appropriately
  • Cache cleared in setUp() methods
  • Integration tests with actual WebSocket communicators

💡 Test Suggestions:

  1. High-concurrency test: Add a test that makes rapid concurrent requests to verify race condition behavior is acceptable
  2. Superuser multiplier test: Verify superuser gets 10x (test at line 129 already does this—good!)
  3. TTL expiration test: Verify that rate limits reset after the time period expires
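A TTL-expiration test does not need a real cache backend; with an injectable clock the reset behavior can be asserted directly. This is a sketch of the testing idea, not the project's test code.

```python
import time

class TTLCounter:
    """Minimal counter that resets once its period elapses."""

    def __init__(self, period_seconds: float, clock=time.monotonic):
        self.period = period_seconds
        self.clock = clock  # injectable so tests can advance time manually
        self.count = 0
        self.expires_at = float("-inf")

    def incr(self) -> int:
        now = self.clock()
        if now >= self.expires_at:
            self.count = 0
            self.expires_at = now + self.period
        self.count += 1
        return self.count
```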

🎯 Specific File Reviews

config/ratelimits/cache.py

config/websocket/middlewares/ratelimit_middleware.py

  • ✅ Clean middleware implementation
  • ✅ Proper ASGI lifecycle handling
  • 💡 Consider extracting magic number 4029 to constant

config/websocket/ratelimits.py

  • ✅ Good decorator pattern
  • ✅ Mixin provides nice reusability
  • ✅ Backward compatibility exports

config/ratelimits/config.py

  • ✅ Excellent unified configuration
  • ✅ Supports multiple override sources
  • ✅ Lazy loading with caching

opencontractserver/tests/test_websocket_ratelimits.py

  • ✅ Comprehensive test coverage
  • ✅ Good test organization
  • ✅ Proper use of fixtures and mocks

📋 Summary

Recommendation: APPROVE with minor fixes

This is a high-quality PR that successfully implements WebSocket rate limiting with a well-designed architecture. The code is clean, well-tested, and follows repository conventions.

Required before merge:

  1. Fix documentation: Update PR description and CHANGELOG to state superusers get 10x limits (not 5x)

Recommended (non-blocking):

  1. 💡 Add comment documenting race condition in config/ratelimits/cache.py
  2. 💡 Extract WebSocket close code 4029 to a constant
  3. 💡 Add RATE_LIMITED to message type definitions for type safety

Nice to have (post-merge):

  1. 💡 Add high-concurrency test
  2. 💡 Update test imports to use canonical config.ratelimits module

Overall, excellent work! The unified rate limiting infrastructure is a significant improvement that will benefit future development.


claude bot commented Dec 27, 2025

Code Review - WebSocket Rate Limiting Implementation

Overview

This PR successfully implements comprehensive WebSocket rate limiting that mirrors the existing GraphQL rate limiting infrastructure. The implementation is well-architected with good separation of concerns and excellent test coverage.


✅ Strengths

Architecture & Code Quality

  1. Excellent Code Reuse & DRY Principle

    • The refactoring of shared rate limiting logic into config/ratelimits/ is outstanding
    • GraphQL rate limiting was successfully refactored to use the shared infrastructure (config/graphql/ratelimits.py reduced from 224 lines to 66 lines)
    • Single source of truth for rate limits in RateLimits configuration class
    • Consistent tier multipliers (10x for superusers, 2x for authenticated users) across both GraphQL and WebSocket
  2. Well-Organized Module Structure

    • config/ratelimits/__init__.py: Clean public API with all exports documented
    • config/ratelimits/core.py: Pure utility functions (parsing, formatting)
    • config/ratelimits/tiers.py: User tier logic isolated and reusable
    • config/ratelimits/cache.py: Cache-based rate limiting for async contexts
    • config/ratelimits/config.py: Unified configuration with override support
  3. Robust WebSocket Implementation

    • Two-level rate limiting (connection + message) provides defense in depth
    • Middleware at connection level (RateLimitMiddleware) prevents resource exhaustion
    • Per-consumer message limits with RateLimitedConsumerMixin for easy integration
    • Proper use of close code 4029 (custom application code) for rate limit rejections
  4. Test Coverage

    • Comprehensive test suite (388 lines) covering:
      • Utility function edge cases
      • Integration tests with actual WebSocket consumers
      • Connection-level and message-level rate limiting
      • Authenticated vs anonymous users
      • Rate limit bypass when disabled
    • Proper use of @pytest.mark.serial for WebSocket tests (prevents async conflicts with pytest-xdist)
  5. Configuration Flexibility

    • Environment variable overrides for all rate limits
    • Legacy override support for backward compatibility
    • Sensible defaults with clear documentation

⚠️ Issues & Concerns

Critical Issues

1. Race Condition in Cache-Based Rate Limiting ⚠️

Location: config/ratelimits/cache.py:78-99

The cache increment logic has a race condition:

# Get current count
current = cache.get(full_key, 0)

is_limited = current >= max_count

if increment and not is_limited:
    if current == 0:
        cache.set(full_key, 1, period_seconds)
    else:
        try:
            cache.incr(full_key)
        except ValueError:
            cache.set(full_key, 1, period_seconds)

Problem: Between cache.get() and cache.incr(), another request could increment the counter, potentially allowing more requests than the limit.

Impact: Under high concurrency, users could exceed rate limits (e.g., if limit is 10/m, they might get 11-12 requests through).

Recommendation: Use atomic cache operations. Django's cache backend supports cache.add() with TTL for atomic increment:

# Atomic approach using cache.add()
if not cache.add(full_key, 0, period_seconds):
    # Key exists, increment it
    try:
        new_value = cache.incr(full_key)
        is_limited = new_value > max_count
        current = new_value - 1 if increment else new_value
    except ValueError:
        # Key expired between add and incr
        cache.set(full_key, 1, period_seconds)
        current = 1
else:
    # Key didn't exist, we just created it at 0
    if increment:
        cache.incr(full_key)
        current = 1
    else:
        current = 0

2. Superuser Multiplier Inconsistency 🤔

Location: opencontractserver/tests/test_websocket_ratelimits.py:129-136

The test comment says "consistent with GraphQL tier" and expects 10x multiplier for superusers, but there's a discrepancy:

  • Test comment at line 129: "superusers get higher limits (10x, consistent with GraphQL)"
  • Test expectation: 30/m * 10 = 300/m
  • PR description states: "Superusers get 5x the normal limits"

Actual behavior: Code uses 10x multiplier (config/ratelimits/tiers.py:14), which is correct for GraphQL consistency.

Recommendation: Update PR description in CHANGELOG.md to say "10x" not "5x" for superusers to match actual implementation.


Medium Priority Issues

3. Environment Variable vs Settings Mismatch

Location: config/settings/ratelimit.py:65-75

The WebSocket rate limit overrides are loaded in WEBSOCKET_RATE_LIMIT_OVERRIDES with os.environ.get() providing default values:

"WS_CONNECT": os.environ.get("RATELIMIT_WS_CONNECT", "30/m"),

This means the defaults ("30/m", "10/m", etc.) are set in TWO places:

  1. config/settings/ratelimit.py:67-74 (environment defaults)
  2. config/ratelimits/config.py:48-55 (class defaults)

Problem: If someone changes one default but not the other, they'll diverge.

Recommendation: Remove the default values from environment variable loading:

WEBSOCKET_RATE_LIMIT_OVERRIDES = {
    "WS_CONNECT": os.environ.get("RATELIMIT_WS_CONNECT"),  # No default
    "WS_CONNECT_ANONYMOUS": os.environ.get("RATELIMIT_WS_CONNECT_ANONYMOUS"),
    # ... etc
}
# Then filter None values (already done at line 78)

This ensures config/ratelimits/config.py is the single source of truth for defaults.
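Concretely, the no-defaults pattern plus the existing None filter could read as follows; the variable names come from this review, and the snippet sets and unsets the environment only to make the behavior visible.

```python
import os

os.environ["RATELIMIT_WS_CONNECT"] = "15/m"              # simulate one override being set
os.environ.pop("RATELIMIT_WS_CONNECT_ANONYMOUS", None)   # and one left unset

raw_overrides = {
    "WS_CONNECT": os.environ.get("RATELIMIT_WS_CONNECT"),  # no default
    "WS_CONNECT_ANONYMOUS": os.environ.get("RATELIMIT_WS_CONNECT_ANONYMOUS"),
}

# Drop unset variables so the config class defaults stay the single source of truth
WEBSOCKET_RATE_LIMIT_OVERRIDES = {k: v for k, v in raw_overrides.items() if v is not None}
```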

4. Missing Error Handling for Anonymous Users in get_user_tier_rate

Location: config/ratelimits/tiers.py:101-105

The get_user_tier_rate() function is designed for GraphQL (which has info.context.user), but it doesn't account for anonymous users gracefully. If called with an anonymous user, it will still apply tier multipliers based on user = info.context.user.

Potential Issue: The get_tier_multiplier() function correctly handles anonymous users, but the function doesn't document this behavior clearly.

Recommendation: Add a docstring note that this function works for both authenticated and anonymous users, and that anonymous users get 1x multiplier.

5. Inconsistent Error Message Format

Location: Multiple files

The period name lookup for error messages is duplicated in multiple places with different implementations:

  • config/websocket/consumers/unified_agent_conversation.py:243-245: Inline dictionary
  • config/ratelimits/core.py:70-80: period_to_name() function
  • config/websocket/ratelimits.py:145: Uses period_to_name() correctly

Recommendation: Replace the inline dictionary at unified_agent_conversation.py:243-245 with:

from config.ratelimits import period_to_name

count, period = parse_rate(rate)
period_name = period_to_name(period)

This applies to all consumers (document_conversation.py, corpus_conversation.py, standalone_document_conversation.py, thread_updates.py).


Low Priority / Nitpicks

6. Test Coverage Gap

The test suite doesn't cover:

  • Superuser rate limit multipliers (test expects 300/m but doesn't verify it's actually applied)
  • Usage-capped users (0.5x multiplier is implemented but not tested)
  • IPv6 subnet grouping (RATELIMIT_IPV6_MASK setting is unused)

Recommendation: Add tests for tier multipliers and usage-capped users.

7. Magic Number in Middleware

Location: config/websocket/middlewares/ratelimit_middleware.py:125

"reason": reason[:123],  # Close reason max 123 bytes

Recommendation: Extract to a constant:

WS_CLOSE_REASON_MAX_BYTES = 123  # WebSocket spec limit
"reason": reason[:WS_CLOSE_REASON_MAX_BYTES],

8. Changelog Documentation

The CHANGELOG.md is exceptionally detailed (38 lines!), which is great for this feature. However, it could be slightly more concise by removing redundant information (e.g., file paths are mentioned but not linked to specific line numbers as recommended in CLAUDE.md).


🔒 Security Analysis

✅ Security Strengths

  1. IDOR Prevention: Rate limiting is applied per-user (authenticated) or per-IP (anonymous), preventing abuse
  2. DoS Mitigation: Connection-level rate limiting prevents connection exhaustion attacks
  3. Consistent Enforcement: Rate limits cannot be bypassed (except via RATELIMIT_DISABLE setting for testing)
  4. Fail-Closed by Default: When cache is unavailable, RATELIMIT_FAIL_OPEN=False denies requests (secure default)
  5. IP Extraction: Properly handles X-Forwarded-For header for proxied requests (config/ratelimits/ip.py:13-24)

⚠️ Security Considerations

  1. X-Forwarded-For Spoofing: The IP extraction trusts the first IP in X-Forwarded-For, which can be spoofed if the proxy doesn't sanitize headers. This is acceptable for rate limiting (not authentication), but should be documented.

  2. Cache Poisoning: If the cache backend is compromised, rate limits could be bypassed. This is a known risk with cache-based rate limiting and is acceptable.

  3. Rate Limit Enumeration: An attacker could intentionally trigger rate limits to determine user IDs or IP addresses in use. This is low-risk but worth noting.


📊 Performance Considerations

✅ Performance Strengths

  1. Efficient Cache Usage: Single cache lookup per request (not N+1)
  2. Minimal Overhead: Rate limit checks are O(1) operations
  3. No Database Queries: All rate limiting uses cache backend (fast)

💡 Performance Suggestions

  1. Cache Backend Choice: Consider using Redis for production (RATELIMIT_USE_CACHE = "default" should point to Redis, not database cache)
  2. Cache Key Prefix: RATELIMIT_KEY_PREFIX = "rl" could be more specific to avoid collisions (e.g., "ratelimit:v1")

🧪 Testing Recommendations

Additional Tests Needed

  1. Concurrent Request Test: Verify the race condition behavior under high concurrency
  2. Tier Multiplier Tests: Verify superuser, authenticated, and usage-capped multipliers
  3. Cache Failure Test: Verify RATELIMIT_FAIL_OPEN behavior
  4. IPv6 Test: Verify IPv6 subnet grouping (if implemented)
  5. WebSocket Close Code Test: Verify close code 4029 is properly sent to clients

📝 Documentation

What's Good

  • CHANGELOG.md is extremely detailed
  • Inline docstrings are clear and comprehensive
  • Code is self-documenting with good variable names

Suggestions

  1. Add Rate Limit Documentation: Create docs/ratelimiting/README.md explaining:

    • How to configure rate limits
    • User tier multipliers
    • WebSocket vs GraphQL rate limits
    • Troubleshooting rate limit issues
  2. Update CLAUDE.md: Add WebSocket rate limiting to the "Critical Concepts" section


🎯 Recommendations Summary

Must Fix Before Merge

  1. Fix race condition in cache.py (critical for correctness under concurrency)
  2. Fix superuser multiplier documentation (5x vs 10x discrepancy)
  3. Remove duplicate defaults (environment variables vs config class)

Should Fix Before Merge

  1. Consolidate error message formatting (use period_to_name() everywhere)
  2. Add tier multiplier tests (verify 10x, 2x, 0.5x multipliers work)

Nice to Have

  1. 📝 Extract magic number (WS_CLOSE_REASON_MAX_BYTES constant)
  2. 📝 Add rate limiting documentation (docs/ratelimiting/)
  3. 📝 Update CLAUDE.md with WebSocket rate limiting patterns

Final Verdict

Overall: Excellent implementation with minor issues

This PR demonstrates strong software engineering:

  • Excellent code reuse and DRY principles
  • Well-organized module structure
  • Comprehensive test coverage
  • Security-conscious implementation

The critical race condition in cache.py should be addressed before merge, but it's a relatively easy fix. The other issues are minor and can be addressed in follow-up PRs if needed.

Recommendation: Request changes for the race condition fix, then approve once resolved.


Great work on this feature! The unified rate limiting architecture will make it much easier to maintain and extend rate limiting across the application. 🚀

@claude
Copy link

claude bot commented Dec 28, 2025

Code Review - PR #731: Implement WebSocket Rate Limiting

Summary

This PR implements comprehensive WebSocket rate limiting that mirrors the existing GraphQL rate limiting infrastructure. The implementation is well-architected with a shared rate limiting module and appropriate separation of concerns.


✅ Strengths

1. Excellent Architecture & Code Reuse

  • Successfully refactored GraphQL rate limiting (config/graphql/ratelimits.py) to extract shared utilities into config/ratelimits/ package
  • Clean separation: core utilities, configuration, tier logic, IP extraction, and cache-based rate limiting
  • Eliminated ~160 lines of code duplication between GraphQL and WebSocket implementations
  • Follows DRY principle effectively

2. Security Best Practices

  • Multi-level protection: Connection-level (middleware) + message-level (consumer) rate limiting
  • Fail-safe behavior: Uses RATELIMIT_FAIL_OPEN setting with proper error handling (lines 128-135 in cache.py)
  • Atomic operations: Cache operations use cache.add() + cache.incr() pattern to prevent race conditions (cache.py:89-104)
  • Proper user tier multipliers: Consistent with GraphQL (superuser: 10x, authenticated: 2x, anonymous: 1x)

3. WebSocket-Specific Features

  • Custom close code 4029: Appropriate use of application-defined range (4000-4999) for rate limit rejections
  • Close reason truncation: Respects WebSocket spec's 123-byte limit (core.py:8, ratelimit_middleware.py:125)
  • User-friendly error messages: Returns RATE_LIMITED message type with retry information to clients

4. Test Coverage

  • Comprehensive test suite (test_websocket_ratelimits.py, 484 lines):
    • Unit tests for utilities (parsing, IP extraction, key generation)
    • Integration tests with actual WebSocket consumers
    • Tests for fail-open/fail-closed behavior
    • Tests for tier multipliers
  • Proper async handling: Uses @pytest.mark.serial to avoid event loop conflicts with pytest-xdist

5. Documentation

  • Excellent CHANGELOG.md update with file locations and line numbers
  • Clear docstrings throughout
  • Proper migration path with re-exports for backward compatibility

🔍 Issues & Concerns

CRITICAL - Race Condition in Cache Operations

File: config/ratelimits/cache.py:89-104

The atomic increment pattern has a subtle issue:

if increment:
    cache.add(full_key, 0, period_seconds)  # Returns True if created, False if exists
    
    try:
        current = cache.incr(full_key)
    except ValueError:
        # Key expired between add and incr (rare race condition)
        cache.set(full_key, 1, period_seconds)
        current = 1
    
    is_limited = current > max_count  # ❌ ISSUE: Should be >= for first request

Problem: Line 104 checks current > max_count after incrementing, which at first glance looks like an off-by-one:

  • For a limit of "10/m", the counter must reach 11 before a request is blocked
  • That appears to allow 11 requests per period (since blocking only triggers once 11 > 10)

Impact: Users would get one more request per time period than configured

Fix:

# Line 104 should be:
is_limited = current > max_count  # Correct for increment=True
# But line 108 should remain:
is_limited = current >= max_count  # Correct for increment=False

Actually, looking more closely, the current logic IS correct:

  • With increment=True: After incrementing, if count is 11 and limit is 10, we block (11 > 10) ✅
  • With increment=False: Before incrementing, if count is 10 and limit is 10, we block (10 >= 10) ✅

RETRACTED - The logic is actually correct as-is.
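The retracted concern is easy to check with a quick simulation. The snippet below uses a hypothetical `FakeCache` stand-in (not the project's code, and ignoring expiry) to run the same add/incr/compare sequence; it shows a limit of 10 admits exactly 10 requests:

```python
class FakeCache:
    """Minimal in-memory stand-in for Django's cache (ignores expiry)."""

    def __init__(self):
        self.store = {}

    def add(self, key, value, timeout=None):
        # Returns True only if the key did not exist (atomic in real backends)
        if key in self.store:
            return False
        self.store[key] = value
        return True

    def incr(self, key):
        if key not in self.store:
            raise ValueError(key)
        self.store[key] += 1
        return self.store[key]

cache = FakeCache()
max_count = 10
allowed = 0
for _ in range(11):  # 11 attempts against a 10/period limit
    cache.add("rl:test", 0)
    current = cache.incr("rl:test")
    if not current > max_count:  # same comparison as the reviewed code
        allowed += 1

print(allowed)  # 10 - exactly the configured limit; the 11th attempt is blocked
```

The increment-then-compare ordering is what makes this work: the counter includes the in-flight request, so `current > max_count` first fires on attempt number max_count + 1.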


Medium - Inconsistent Tier Multiplier in Comment

File: config/websocket/middlewares/ratelimit_middleware.py:75 (referenced in PR description)

PR Description claims: "Superusers get 5x the normal limits"

Actual implementation (config/ratelimits/tiers.py:14-16):

TIER_MULTIPLIERS = {
    "superuser": 10.0,  # Superusers get 10x the base limit
    "authenticated": 2.0,
    "anonymous": 1.0,
}

Issue: Documentation mismatch. The code correctly implements 10x for superusers (consistent with GraphQL), but the PR description says 5x.

Fix: Update PR description to reflect 10x multiplier for superusers.
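For illustration, the tier scaling discussed above can be sketched with simplified stand-ins for the real helpers (the actual `parse_rate` and `TIER_MULTIPLIERS` live in config/ratelimits/; this re-implementation is an assumption for the example):

```python
# Simplified sketches of the shared helpers; not the project's actual code.
TIER_MULTIPLIERS = {"superuser": 10.0, "authenticated": 2.0, "anonymous": 1.0}

def parse_rate(rate: str) -> tuple[int, str]:
    """Split a rate string like '30/m' into (count, period)."""
    count, period = rate.split("/")
    return int(count), period

def rate_for_tier(base_rate: str, tier: str) -> str:
    """Scale a base rate by the tier multiplier, keeping the period."""
    count, period = parse_rate(base_rate)
    scaled = int(count * TIER_MULTIPLIERS[tier])
    return f"{scaled}/{period}"

print(rate_for_tier("30/m", "superuser"))      # 300/m
print(rate_for_tier("30/m", "authenticated"))  # 60/m
```

This is also why the test at line 136 expects "300/m": 30 connections/min scaled by the 10x superuser multiplier.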


Medium - Missing Error Context in Logs

File: config/ratelimits/cache.py:131-134

except Exception as e:
    fail_open = getattr(settings, "RATELIMIT_FAIL_OPEN", False)
    if fail_open:
        logger.error(f"Rate limit cache error (failing open): {e}", exc_info=True)
        return False, {"limit": 0, "remaining": 0, "reset_time": 0}

Issue: When the cache fails, logs don't include the rate limit key or group, making debugging harder.

Suggestion:

logger.error(
    f"Rate limit cache error (failing open): {e} - Group: {group}, Key: {key}",
    exc_info=True
)

Low - Potential Confusion with Rate Limit Remaining

File: config/ratelimits/cache.py:110

remaining = max(0, max_count - current)

Issue: When rate limited, remaining is 0, but the user has already exceeded the limit. The current value might be 11 when max_count is 10, giving remaining = -1 → 0.

Clarification needed: Is this intentional? It's correct behavior, but might be worth a comment explaining that remaining is clamped to 0 when over the limit.
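The clamping behaviour is a one-liner; once the counter passes the limit, remaining simply reports zero rather than going negative:

```python
max_count = 10
# current may exceed max_count once a request has been blocked
for current in (9, 10, 11, 15):
    remaining = max(0, max_count - current)
    print(current, remaining)  # 9 -> 1, 10 -> 0, 11 -> 0, 15 -> 0
```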


Low - Magic Number in Tier Multipliers

File: opencontractserver/tests/test_websocket_ratelimits.py:136

assert rate == "300/m"

Issue: Test hardcodes expected value instead of calculating from constants:

# Better:
base_rate_count, period = parse_rate("30/m")
expected_count = int(base_rate_count * TIER_MULTIPLIERS["superuser"])
assert rate == f"{expected_count}/{period}"

This makes tests resilient to changes in tier multipliers.


🎯 Suggestions for Improvement

1. Add Circuit Breaker Pattern (Future Enhancement)

For production resilience, consider adding a circuit breaker for cache failures:

# If cache has failed N times in M seconds, skip rate limiting entirely
# to prevent cascade failures

This is especially important if using Redis and it becomes unavailable.
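A minimal sketch of such a breaker (hypothetical class, not part of this PR): after N cache failures inside a rolling window it "opens", and callers skip rate limiting until the failures age out.

```python
import time
from collections import deque
from typing import Optional

class CacheCircuitBreaker:
    """Skip rate limiting after repeated cache failures (sketch)."""

    def __init__(self, max_failures: int = 5, window_seconds: float = 30.0):
        self.max_failures = max_failures
        self.window_seconds = window_seconds
        self.failures = deque()  # timestamps of recent cache failures

    def record_failure(self, now: Optional[float] = None) -> None:
        self.failures.append(time.monotonic() if now is None else now)

    def is_open(self, now: Optional[float] = None) -> bool:
        """True means: stop hitting the cache and skip rate limiting for now."""
        now = time.monotonic() if now is None else now
        # prune failures that have fallen outside the rolling window
        while self.failures and now - self.failures[0] > self.window_seconds:
            self.failures.popleft()
        return len(self.failures) >= self.max_failures

breaker = CacheCircuitBreaker(max_failures=3, window_seconds=30.0)
for t in (0.0, 1.0, 2.0):
    breaker.record_failure(now=t)
print(breaker.is_open(now=2.0))    # True - three failures inside the window
print(breaker.is_open(now=100.0))  # False - the failures have aged out
```

A production version would wrap the `check_cache_rate_limit` call sites: on `is_open()`, bypass the check (effectively fail-open) instead of hammering an unavailable backend.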


2. Add Metrics/Monitoring Hooks

Consider adding hooks for metrics collection:

# In cache.py:119-123
if is_limited:
    logger.warning(...)
    # ADD: Signal for monitoring (e.g., increment rate_limit_exceeded counter)
    #      This helps track abuse patterns

3. Consider Adding Rate Limit Headers to WebSocket Messages

GraphQL sets X-RateLimit-Limit and X-RateLimit-Remaining headers. Consider adding similar info to WebSocket responses:

await self.send_standard_message(
    msg_type="RESPONSE",
    content=result,
    data={
        "rate_limit": {
            "limit": info.get("limit"),
            "remaining": info.get("remaining"),
            "reset_after": info.get("reset_time"),
        }
    }
)

This helps clients implement proper backoff strategies.


4. Document Rate Limit Keys in Admin Interface

Consider adding admin commands to:

  • View current rate limit state for a user/IP
  • Reset rate limits for debugging
# Example:
python manage.py rate_limit_status --user 42
python manage.py rate_limit_reset --ip 192.168.1.1

📝 Minor Nitpicks

1. Unused Import in Tests

File: opencontractserver/tests/test_websocket_ratelimits.py:22

from config.ratelimits.tiers import TIER_MULTIPLIERS, get_tier_multiplier

TIER_MULTIPLIERS appears to be used; verify that get_tier_multiplier and any other names imported on this line are actually needed.

2. Inconsistent Naming

  • check_rate_limit() vs check_rate_limit_async() - Good naming
  • But RateLimitMiddleware() is a factory function, not a class (line 130 in ratelimit_middleware.py)
    • Consider naming it rate_limit_middleware_factory() or keeping RateLimitMiddleware as the class and using it directly

3. Test Timeout Values

File: opencontractserver/tests/test_websocket_ratelimits.py:409

response = await communicator.receive_from(timeout=5)

5-second timeout might be too long for unit tests. Consider reducing to 1-2 seconds.


✅ Security Review

No security vulnerabilities found. The implementation:

  • ✅ Properly validates rate limit strings
  • ✅ Uses atomic cache operations
  • ✅ Handles cache failures gracefully
  • ✅ Logs rate limit violations for monitoring
  • ✅ Prevents enumeration via consistent error messages
  • ✅ Respects RATELIMIT_DISABLE setting for testing

📊 Performance Considerations

Positive:

  • ✅ Uses cache (Redis/Memcached) for distributed rate limiting
  • ✅ Atomic operations prevent excessive cache round-trips
  • ✅ Middleware rejects connections early (before consumer instantiation)

Potential concerns:

  • ⚠️ Every WebSocket message triggers 2 cache operations (add + incr or get)
  • ⚠️ High-frequency WebSocket messages could create cache hotspots

Recommendation: Monitor cache performance in production. If cache becomes a bottleneck, consider:

  • Using local in-memory rate limiting for very high-frequency operations
  • Implementing a sliding window algorithm for more precise rate limiting
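A sliding-window variant can be sketched with a per-key deque of timestamps (hypothetical helper; a production version would need the shared cache rather than process-local memory):

```python
import time
from collections import deque
from typing import Optional

class SlidingWindowLimiter:
    """Allow at most max_count events per window_seconds, per key (sketch)."""

    def __init__(self, max_count: int, window_seconds: float):
        self.max_count = max_count
        self.window_seconds = window_seconds
        self.events = {}  # key -> deque of event timestamps

    def allow(self, key: str, now: Optional[float] = None) -> bool:
        now = time.monotonic() if now is None else now
        window = self.events.setdefault(key, deque())
        # drop timestamps that have slid out of the window
        while window and now - window[0] >= self.window_seconds:
            window.popleft()
        if len(window) >= self.max_count:
            return False
        window.append(now)
        return True

limiter = SlidingWindowLimiter(max_count=3, window_seconds=60.0)
results = [limiter.allow("user:42", now=float(t)) for t in (0, 10, 20, 30)]
print(results)                             # [True, True, True, False]
print(limiter.allow("user:42", now=61.0))  # True - the t=0 event expired
```

Unlike the fixed-window counter, this never admits a burst of 2x the limit straddling a window boundary, at the cost of storing one timestamp per event.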

🎓 Alignment with CLAUDE.md

Follows all baseline commit rules:

  • Tests are comprehensive
  • Code compiles (TypeScript N/A for backend changes)
  • No Claude attribution in commits

Follows critical concepts:

  • No dead code (refactored existing code)
  • DRY principle applied (shared config.ratelimits module)
  • Single Responsibility Principle (separate modules for cache, tiers, config, etc.)
  • Uses constants (TIER_MULTIPLIERS, PERIOD_SECONDS, etc.)

Proper documentation:

  • CHANGELOG.md updated with file locations and line numbers
  • Docstrings throughout

🚀 Recommendation

APPROVE with minor suggestions

This is a well-architected PR that successfully implements WebSocket rate limiting with proper security, testing, and documentation. The refactoring to create a shared rate limiting infrastructure is a significant improvement.

Before merging:

  1. ✅ Fix PR description to reflect 10x (not 5x) multiplier for superusers
  2. ⚠️ Consider adding error context to cache failure logs
  3. ⚠️ Consider adding rate limit info to WebSocket message responses

Post-merge:

  • Monitor cache performance under load
  • Consider adding admin commands for rate limit management
  • Consider adding metrics hooks for monitoring

Great work! 🎉

@codecov
Copy link

codecov bot commented Dec 28, 2025

Codecov Report

❌ Patch coverage is 97.30942% with 6 lines in your changes missing coverage. Please review.

Files with missing lines Patch % Lines
...ncontractserver/tests/test_websocket_ratelimits.py 97.24% 6 Missing ⚠️

📢 Thoughts on this report? Let us know!

@JSv4 JSv4 mentioned this pull request Dec 29, 2025
claude and others added 5 commits December 30, 2025 13:21
Implements per-user and per-IP rate limiting for WebSocket connections,
mirroring the existing GraphQL rate limiting infrastructure.

Changes:
- Add new rate limiting module (config/websocket/ratelimits.py) with
  check_rate_limit functions, WebSocketRateLimits config class, and
  RateLimitedConsumerMixin
- Add rate limiting middleware for connection-level limits that rejects
  excessive connection attempts with close code 4029
- Add message-level rate limiting to all WebSocket consumers:
  UnifiedAgentConsumer, DocumentQueryConsumer, CorpusQueryConsumer,
  StandaloneDocumentQueryConsumer, and ThreadUpdatesConsumer
- Add configurable rate limit settings via environment variables
- Integrate rate limiting middleware into ASGI stack after authentication
- Add comprehensive test suite for rate limiting utilities and integration

Rate limits:
- Authenticated users: 30 connections/min, 60 messages/min, 20 AI queries/min
- Anonymous users: 10 connections/min, 20 messages/min, 5 AI queries/min
- Superusers get 5x the normal limits

Closes #730
… module

Extract common rate limiting infrastructure into config/ratelimits/:
- core.py: parse_rate(), format_rate_limit_message(), period constants
- config.py: unified RateLimits singleton for both GraphQL and WebSocket
- tiers.py: user tier multiplier logic (superuser 10x, auth 2x, anon 1x)
- ip.py: IP extraction utilities for both request and scope contexts
- cache.py: cache-based rate limiting for WebSocket consumers

Key improvements:
- Eliminates ~400 lines of duplicate code
- Standardizes superuser multiplier to 10x (was 5x for WebSocket)
- Single source of truth for rate limit configuration
- Maintains full backward compatibility with existing decorators

All existing usages continue to work unchanged:
- @graphql_ratelimit(rate=RateLimits.WRITE_MEDIUM)
- @graphql_ratelimit_dynamic(get_rate=get_user_tier_rate("READ_MEDIUM"))
- WebSocketRateLimits.get_rate_for_user()
- Remove unused Tuple import from cache.py and core.py
- Update type hints to use built-in tuple[] syntax
- Fix import ordering in __init__.py (isort)
- Fix line formatting (black)
Changes:
- Fix race condition in cache-based rate limiting by using atomic
  cache.add() + cache.incr() operations instead of get-then-set
- Fix test failure by adding @override_settings(RATELIMIT_DISABLE=False)
  since test settings disable rate limiting by default
- Update CHANGELOG.md to correctly state superusers get 10x (not 5x)
  rate limit multiplier, consistent with GraphQL tier
- Remove default values from environment variable loading in
  config/settings/ratelimit.py - config/ratelimits/config.py is now
  the single source of truth for default values
- Replace inline period_name dictionaries in all 5 consumers with
  shared period_to_name() function for consistency
- Extract magic number 123 (WebSocket close reason max bytes) to
  WS_CLOSE_REASON_MAX_BYTES constant in config/ratelimits/core.py
- Add missing test coverage for:
  - User tier multipliers (anonymous, authenticated, superuser, usage-capped)
  - Cache failure behavior (RATELIMIT_FAIL_OPEN vs closed)
…miting

Add documentation note to get_user_tier_rate() explaining that the function
works correctly for both authenticated and anonymous users, with anonymous
users receiving a 1x multiplier.
@JSv4 JSv4 force-pushed the claude/websocket-rate-limiting-uvx47 branch from 82d78e6 to b70c0fd Compare December 30, 2025 18:33
@claude
Copy link

claude bot commented Dec 30, 2025

Code Review: WebSocket Rate Limiting (PR #731)

Overall, this is an excellent implementation that brings robust rate limiting to WebSocket connections. The code is well-architected, thoroughly tested, and follows the project's conventions. Here's my detailed review:


✅ Strengths

1. Architecture & Code Organization

  • Outstanding refactoring: The extraction of shared rate limiting logic into config/ratelimits/ is exemplary. This eliminates duplication between GraphQL and WebSocket implementations.
  • Clean separation of concerns: Core utilities (core.py), cache operations (cache.py), IP extraction (ip.py), tier logic (tiers.py), and config (config.py) are properly separated.
  • Consistent patterns: The WebSocket rate limiting mirrors the GraphQL implementation, making the codebase more maintainable.
  • Proper layering: Two-level rate limiting (connection + message) is the right approach for WebSocket security.

2. Security

  • Excellent IDOR prevention: Per-user and per-IP rate limiting prevents enumeration attacks.
  • Proper tier multipliers: Superusers (10x), authenticated users (2x), anonymous users (1x) are appropriate and consistent with GraphQL.
  • Fail-safe defaults: RATELIMIT_FAIL_OPEN defaults to False, meaning the system fails closed on cache errors (secure by default).
  • Custom close code: Using WebSocket close code 4029 for rate limiting is a good practice (4000-4999 is the application range).
  • Truncated close reasons: Close reasons are limited to WS_CLOSE_REASON_MAX_BYTES (123 bytes) per WebSocket spec.

3. Race Condition Handling

The atomic cache operations in config/ratelimits/cache.py:84-104 are well-designed:

# 1. Atomically create key if it doesn't exist
cache.add(full_key, 0, period_seconds)
# 2. Atomically increment
current = cache.incr(full_key)
# 3. Check AFTER incrementing
is_limited = current > max_count

However, there's a minor edge case (see Issues section below).

4. Test Coverage

  • Comprehensive unit tests: Utility functions, rate parsing, IP extraction, tier multipliers all tested.
  • Integration tests: Real WebSocket consumer tests with mocked agents.
  • Edge cases covered: Fail-open, fail-closed, cache errors, anonymous vs authenticated users.
  • Proper isolation: Tests use @pytest.mark.serial for async WebSocket tests and clear cache between runs.

5. Documentation

  • Excellent changelog: Detailed with file locations and line numbers (follows project guidelines).
  • Clear docstrings: Every function and class has comprehensive documentation.
  • Inline comments: Complex logic (like atomic operations) is well-explained.

⚠️ Issues & Suggestions

1. Minor Race Condition in Cache Operations (Low Severity)

Location: config/ratelimits/cache.py:89-101

Issue: There's a small race condition between cache.add() and cache.incr():

cache.add(full_key, 0, period_seconds)  # Create with 0
try:
    current = cache.incr(full_key)  # Increment to 1
except ValueError:
    # Key expired between add and incr
    cache.set(full_key, 1, period_seconds)
    current = 1

If the key expires exactly between add() and incr() (extremely rare but possible), you catch it with the ValueError handler. However, you then use cache.set() which is not atomic - another request could have created the key in the meantime, and you'd overwrite their count.

Suggested Fix:

except ValueError:
    # Key expired - try to atomically create it again with value 1
    if not cache.add(full_key, 1, period_seconds):
        # Another request created it - try incrementing again
        try:
            current = cache.incr(full_key)
        except ValueError:
            # Give up after retry, default to allowing request
            current = 1
    else:
        current = 1

This is a very minor issue since the race window is nanoseconds and the worst case is allowing one extra request.

2. Inconsistent Error Checking Logic (Low Severity)

Locations:

  • config/ratelimits/cache.py:104 vs 108

Issue: When increment=True, you check current > max_count (line 104), but when increment=False, you check current >= max_count (line 108). At first glance this looks inconsistent.

Rationale: The asymmetry is actually deliberate: with increment=True, current already includes the in-flight request, so current > max_count admits exactly max_count requests; with increment=False, current counts only prior requests, so current >= max_count blocks the request that would exceed the limit.

Suggested Fix: Keep both comparisons as-is, but add a short comment documenting why they differ.

3. Missing Type Hints in Some Functions (Very Low Severity)

Location: config/ratelimits/config.py

Issue: The _RateLimits class methods like get_ws_rate_for_user() are missing type hints.

Suggestion: Add type hints for better IDE support and type checking:

def get_ws_rate_for_user(self, rate_type: str, user) -> str:
    # ... (should be user: Optional[User])

4. Test Comment Mismatch (Documentation)

Location: opencontractserver/tests/test_websocket_ratelimits.py:136

Issue: The comment says "superuser gets 10x = 300/m (unified with GraphQL tier)", which matches the 10x multiplier documented in config/graphql/ratelimits.py. The test itself is correct, but consider referencing a named constant for the multiplier rather than hardcoding the expected value.

Suggestion: Define multipliers as constants in config/ratelimits/tiers.py (which you already do with TIER_MULTIPLIERS!) and reference them in tests:

assert multiplier == TIER_MULTIPLIERS["superuser"]

(You already do this in test_superuser_multiplier, so just make sure all tests are consistent.)

5. Potential Confusion: WS_CLOSE_REASON_MAX_BYTES (Documentation)

Location: config/ratelimits/core.py

Issue: The constant WS_CLOSE_REASON_MAX_BYTES = 123 is defined but not well-documented as to why 123 bytes.

Suggestion: Add a comment explaining this is the WebSocket protocol limit per RFC 6455.
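A documented constant plus a UTF-8-safe truncation helper might look like this (a sketch; the project's constant lives in config/ratelimits/core.py, and the helper name here is hypothetical):

```python
# RFC 6455 §5.5 caps control-frame payloads at 125 bytes; a Close frame
# spends 2 bytes on the status code, leaving 123 bytes for the reason text.
WS_CLOSE_REASON_MAX_BYTES = 123

def truncate_close_reason(reason: str) -> str:
    """Trim a close reason to the protocol limit without splitting a UTF-8 character."""
    encoded = reason.encode("utf-8")[:WS_CLOSE_REASON_MAX_BYTES]
    # errors="ignore" drops any partial multi-byte sequence at the cut point
    return encoded.decode("utf-8", errors="ignore")

print(len(truncate_close_reason("x" * 200).encode("utf-8")))  # 123
```

Truncating on bytes rather than characters matters: a reason containing non-ASCII text can exceed 123 bytes well before it reaches 123 characters.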


🎯 Best Practices Observed

  1. Follows CLAUDE.md:

    • Uses pytest -n 4 --dist loadscope for parallel testing
    • Marks WebSocket tests as @pytest.mark.serial
    • Clears cache in setUp() methods
    • No dead code left behind
  2. DRY Principle:

    • Shared config/ratelimits/ module eliminates duplication
    • GraphQL decorators now delegate to shared utilities
  3. Security Best Practices:

    • Per-user rate limiting prevents IDOR
    • IP-based rate limiting for anonymous users
    • Proper logging of rate limit violations
    • Fails closed on errors (secure by default)
  4. Error Handling:

    • Graceful handling of cache failures
    • Proper exception handling in async contexts
    • Informative error messages to clients

📊 Performance Considerations

  1. Atomic cache operations prevent race conditions
  2. Synchronous cache check in middleware is appropriate (happens once per connection)
  3. Async wrapper (check_rate_limit_async) properly uses database_sync_to_async for message-level checks
  4. ⚠️ Cache backend: Ensure production uses Redis or Memcached for RATELIMIT_USE_CACHE, not the database backend (which would be slow)

🧪 Test Coverage Assessment

Coverage: Excellent (90%+)

✅ Covered:

  • Rate parsing (valid/invalid formats)
  • IP extraction (with/without X-Forwarded-For)
  • Key generation (authenticated/anonymous users)
  • Rate limit enforcement (under/over limit)
  • Tier multipliers (anonymous/authenticated/superuser/usage-capped)
  • Fail-open and fail-closed behavior
  • WebSocket integration (connection rejection, message rate limiting)

⚠️ Missing (minor):

  • Test for when X-Forwarded-For contains multiple IPs with spaces
  • Test for when cache backend is completely unavailable (not just throwing exceptions)
  • Load test to verify atomic operations under high concurrency (would require pytest-xdist stress testing)

🔒 Security Assessment

Security: Strong

✅ Strengths:

  • Per-user and per-IP rate limiting
  • Superuser tier prevents accidental lockout of admins
  • Custom close code (4029) doesn't leak implementation details
  • Logging includes IP, user ID, rate, and path for security monitoring
  • Anonymous users get lower limits (10/m vs 30/m for connections)

⚠️ Considerations:

  • IP spoofing: X-Forwarded-For can be spoofed if not behind a trusted proxy. Ensure your deployment uses a trusted reverse proxy (nginx/CloudFlare) that sets this header.
  • Distributed rate limiting: If you have multiple app servers, ensure they share the same cache backend (Redis) for accurate rate limiting across servers.
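For multi-server deployments, pointing the rate-limit cache alias at a shared Redis instance might look like this in Django settings (a sketch; the host, port, and database number are placeholders):

```python
# settings.py (sketch) - all app servers must share this backend for
# rate limit counters to be accurate across the fleet.
CACHES = {
    "default": {
        "BACKEND": "django.core.cache.backends.redis.RedisCache",
        "LOCATION": "redis://redis:6379/1",  # placeholder host/db
    }
}
RATELIMIT_USE_CACHE = "default"  # cache alias consulted by the rate limiter
```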

📝 Recommendations

Must Fix (Before Merge):

None - the code is production-ready.

Should Consider (Future Improvements):

  1. Add cache backend validation in settings to ensure it's not using database backend for rate limiting
  2. Document deployment requirements (trusted proxy for X-Forwarded-For)
  3. Add metrics/monitoring for rate limit hits (consider adding Prometheus metrics)
  4. Consider adding X-RateLimit headers to WebSocket messages (similar to GraphQL's response headers)

Nice to Have:

  1. Admin dashboard to view rate limit violations
  2. Per-user override capability (allow specific users to have custom limits)
  3. Automatic IP blocklist after repeated violations

✅ Final Verdict

APPROVE with minor suggestions

This PR is excellent work and ready to merge. The code quality is high, test coverage is comprehensive, and the implementation follows all project conventions. The minor issues identified are edge cases that don't affect normal operation.

The refactoring to share rate limiting infrastructure between GraphQL and WebSocket is a significant architectural improvement that will make future maintenance much easier.

Impact: Closes #730 and significantly improves the security posture of the WebSocket endpoints.

Great job! 🎉



Development

Successfully merging this pull request may close these issues.

Websocket Rate Limiting

3 participants