
feat/rate limit implementation #742

Open
sujal12344 wants to merge 7 commits into VoltAgent:main from sujal12344:feat/rate-limit-implementation

Conversation

Contributor

@sujal12344 sujal12344 commented Oct 25, 2025

PR Checklist

Please check if your PR fulfills the following requirements:

Bugs / Features

What is the current behavior?

Currently, VoltAgent has no built-in mechanism to control the frequency of LLM calls and tool executions. This can lead to:

  • Exceeding API rate limits from LLM providers (OpenAI, Anthropic, Google, etc.)
  • Unexpected API throttling or account suspension
  • Uncontrolled costs in production environments
  • Poor resource management in multi-tenant scenarios
  • No way to implement custom usage quotas

Users must implement rate limiting manually in their application code, which is:

  • Error-prone and inconsistent across different agents
  • Difficult to test and maintain
  • Not integrated with VoltAgent's observability features

What is the new behavior?

  1. Configurable Rate Limiting

    • Control LLM calls globally or per provider
    • Limit tool executions individually
    • Support for requests-per-minute constraints
  2. Flexible Strategies

    • Fixed Window Counter (MVP)
    • Option to throw error immediately or delay until quota resets
  3. Multi-Scope Support

    • Global LLM limits
    • Provider-specific limits (OpenAI, Anthropic, etc.)
    • Tool-specific limits
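A sketch of what such a configuration might look like, with field names inferred from the PR description and the CodeRabbit summary (the final `@voltagent/core` API may differ):

```typescript
// Hypothetical configuration shape; exact field names are assumptions.
interface RateLimitRule {
  maxRequestsPerMinute: number;
  onExceeded: "throw" | "delay"; // error immediately, or wait for the window to reset
}

interface AgentRateLimitConfig {
  llm?: RateLimitRule; // global limit on LLM calls
  tools?: Record<string, RateLimitRule>; // per-tool limits, keyed by tool name
}

const rateLimits: AgentRateLimitConfig = {
  llm: { maxRequestsPerMinute: 5, onExceeded: "throw" },
  tools: {
    weatherTool: { maxRequestsPerMinute: 3, onExceeded: "delay" },
  },
};
```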

Example Project (examples/with-rate-limiting/)

  • 6 working examples demonstrating all features:
    1. Basic LLM rate limiting (throw strategy)
    2. Delay strategy (auto-wait behavior)
    3. Provider-specific limits
    4. Tool-specific limits
    5. Combined LLM + tool limits
    6. Monitoring and statistics
  • Ready-to-run with Google Gemini API

fixes (issue) #5

Screenshots of the running project with rate limiting


Summary by CodeRabbit

  • New Features

    • Configurable rate limiting for LLM requests with optional per-tool quotas, enforcement modes (delay or throw), runtime stats, reset support, and a new RateLimitExceeded error type.
  • Documentation

    • Added a complete, runnable rate-limiting example and README, plus an environment example for the API key.
  • Chores

    • New example package scaffold with scripts and ignore rules for the demo.
  • Tests

    • Comprehensive tests covering limiter behavior, manager logic, and edge cases.


changeset-bot bot commented Oct 25, 2025

🦋 Changeset detected

Latest commit: 1edd8d3

The changes in this PR will be included in the next version bump.

This PR includes changesets to release 2 packages
Name Type
with-rate-limiting Minor
@voltagent/core Minor


@sujal12344 sujal12344 changed the title from Feat/rate limit implementation to feat/rate limit implementation Oct 25, 2025
Contributor Author

sujal12344 commented Oct 25, 2025

Hey @omeraplak @marinoska @necatiozmen , I’ve submitted a PR — #742
Could you please review it and share your feedback when you get a chance?

@sujal12344 sujal12344 mentioned this pull request Oct 25, 2025
@omeraplak
Member

Hey @sujal12344 ,
Thank you! I'll review it soon

@omeraplak
Member

Hey @sujal12344 , thanks a lot for the PR! 🔥

I have a couple of questions:

  1. Since we define rateLimits on the Agents and each agent has only one model, how does the providers field under rateLimits work?
rateLimits: {
  providers: {
    openai: {
      maxRequestsPerMinute: 5,
      onExceeded: "throw"
    },
    anthropic: {
      maxRequestsPerMinute: 3,
      onExceeded: "delay"
    }
  }
}

Is this meant for future support when dynamic providers are added?

  2. When defining rate limits for tools, is there a way to add IntelliSense support? That way, tool names could be resolved dynamically.
  3. It would be nice if the user could be notified via a hook when a rate limit is reached.
    Right now, onExceeded returns a decision, but maybe there could also be a hook with the same name?
  4. What do you think about defining a global rate limit through the new VoltAgent() instance itself?
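Regarding the IntelliSense question, one possible shape is a mapped type over the agent's tools record, so tool names in the rate-limit config are checked at compile time. This is purely illustrative; none of these names come from the actual @voltagent/core API:

```typescript
// Hypothetical sketch: constrain rate-limit keys to the agent's tool names.
type RateLimitRule = { maxRequestsPerMinute: number; onExceeded: "throw" | "delay" };

type ToolRateLimits<TTools extends Record<string, unknown>> = {
  [K in keyof TTools]?: RateLimitRule;
};

// Illustrative helper so TTools is inferred from the tools argument.
function withToolLimits<TTools extends Record<string, unknown>>(
  tools: TTools,
  limits: ToolRateLimits<TTools>,
) {
  return { tools, limits };
}

const agent = withToolLimits(
  { weather: () => "sunny", search: () => [] as string[] },
  {
    weather: { maxRequestsPerMinute: 3, onExceeded: "throw" },
    // wether: { ... }  // a typo here would be a compile-time error
  },
);
```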

Copilot AI review requested due to automatic review settings April 1, 2026 14:28
Contributor

coderabbitai bot commented Apr 1, 2026

No actionable comments were generated in the recent review. 🎉

ℹ️ Recent review info
⚙️ Run configuration

Configuration used: defaults

Review profile: CHILL

Plan: Pro

Run ID: b6535356-093b-43ad-b86c-7724079a9a33

📥 Commits

Reviewing files that changed from the base of the PR and between a2682f9 and 1edd8d3.

📒 Files selected for processing (1)
  • .changeset/young-rice-study.md
✅ Files skipped from review due to trivial changes (1)
  • .changeset/young-rice-study.md

📝 Walkthrough

Adds a new rate-limiting subsystem (types, errors, fixed-window limiter, manager); integrates LLM and per-tool runtime checks into Agent generate/stream and tool execution; and adds tests plus a runnable example demonstrating configurations and behaviors.

Changes

Cohort / File(s) Summary
Rate limit public API & entrypoints
packages/core/src/rate-limit/index.ts, packages/core/src/index.ts
New rate-limit module entrypoint and re-exports; core package exposes limiter, manager, error, and rate-limit types.
Types & Error
packages/core/src/rate-limit/types.ts, packages/core/src/rate-limit/errors.ts
Adds public rate-limit types (strategies, actions, scopes, stats, configs) and RateLimitExceededError with stats, scope, reset, retryAfter, and JSON serialization.
Limiter implementation & tests
packages/core/src/rate-limit/limiters/fixed-window.ts, packages/core/src/rate-limit/limiters/fixed-window.spec.ts, packages/core/src/rate-limit/limiters/index.ts
Adds FixedWindowCounterLimiter (acquire/check/getStats/reset) with throw/delay behaviors, delay handling, and comprehensive Vitest coverage; re-exports limiter.
Manager implementation & tests
packages/core/src/rate-limit/manager.ts, packages/core/src/rate-limit/manager.spec.ts
Adds RateLimitManager for lazy creation/caching of limiters, LLM/tool checks, stats aggregation and reset; tests cover enforcement, delay-mode, caching, and reset.
Agent integration & types
packages/core/src/agent/agent.ts, packages/core/src/agent/types.ts
Adds optional rateLimits?: AgentRateLimitConfig; Agent constructs rateLimitManager when configured and gates generateText/streamText and tool execution with LLM/tool checks; helper to extract provider info added.
Examples & packaging
examples/with-rate-limiting/.env.example, examples/with-rate-limiting/.gitignore, examples/with-rate-limiting/README.md, examples/with-rate-limiting/package.json, examples/with-rate-limiting/src/index.ts
New example project demonstrating LLM and tool rate-limiting scenarios (six examples), README, env example, package metadata, and run scripts.
Repository metadata
.changeset/young-rice-study.md
Adds changeset entry noting rate-limiting implementation and minor version bump for affected packages.

Sequence Diagram(s)

sequenceDiagram
    participant Client as Client
    participant Agent as Agent
    participant RateLimitManager as RateLimitManager
    participant FixedWindowLimiter as FixedWindowLimiter
    participant LLMSDK as LLM SDK

    Client->>Agent: generateText(prompt)
    Agent->>RateLimitManager: checkLLMRateLimit()
    RateLimitManager->>FixedWindowLimiter: acquire()

    alt Limit Not Exceeded
        FixedWindowLimiter-->>RateLimitManager: acquired
        RateLimitManager-->>Agent: ok
        Agent->>LLMSDK: call LLM
        LLMSDK-->>Agent: response
        Agent-->>Client: result
    else Limit Exceeded (throw)
        FixedWindowLimiter-->>RateLimitManager: throw RateLimitExceededError
        RateLimitManager-->>Agent: propagate error
        Agent-->>Client: error (stats, resetAt, retryAfter)
    else Limit Exceeded (delay)
        FixedWindowLimiter->>FixedWindowLimiter: await window reset
        FixedWindowLimiter-->>RateLimitManager: acquired after reset
        RateLimitManager-->>Agent: ok
        Agent->>LLMSDK: call LLM
        LLMSDK-->>Agent: response
        Agent-->>Client: result
    end
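The throw and delay branches in the diagram can be illustrated with a minimal fixed-window counter. This is a simplified, single-caller sketch and is not concurrency-safe under the delay path (an issue the reviews also flag); the actual implementation in the PR may differ:

```typescript
// Simplified fixed-window counter limiter. Names follow the walkthrough
// (acquire/getStats/reset) but details are assumptions, not the PR's code.
class FixedWindowCounterLimiter {
  private count = 0;
  private windowStart = Date.now();

  constructor(
    private readonly limit: number,
    private readonly windowMs: number = 60_000,
    private readonly onExceeded: "throw" | "delay" = "throw",
  ) {}

  private rollWindowIfElapsed(now: number): void {
    // Fixed window: once windowMs has passed, the counter starts over.
    if (now - this.windowStart >= this.windowMs) {
      this.count = 0;
      this.windowStart = now;
    }
  }

  async acquire(): Promise<void> {
    const now = Date.now();
    this.rollWindowIfElapsed(now);
    if (this.count < this.limit) {
      this.count += 1;
      return;
    }
    if (this.onExceeded === "throw") {
      // The PR uses a dedicated RateLimitExceededError with stats/retryAfter.
      throw new Error("Rate limit exceeded");
    }
    // delay mode: wait until the current window expires, then count the request.
    const waitMs = this.windowMs - (now - this.windowStart);
    await new Promise((resolve) => setTimeout(resolve, waitMs));
    this.rollWindowIfElapsed(Date.now());
    this.count += 1;
  }

  getStats() {
    return {
      current: this.count,
      limit: this.limit,
      resetAt: new Date(this.windowStart + this.windowMs),
    };
  }

  reset(): void {
    this.count = 0;
    this.windowStart = Date.now();
  }
}
```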

Estimated code review effort

🎯 4 (Complex) | ⏱️ ~50 minutes

Possibly related PRs

Poem

🐰
I nibble counters, watch the clock,
Windows tick, then softly lock.
Delay or throw, I keep the pace,
Tools and LLMs share the space.
Hop along — a measured race!

🚥 Pre-merge checks | ✅ 3
✅ Passed checks (3 passed)
Check name Status Explanation
Title check ✅ Passed The title 'feat/rate limit implementation' directly matches the main changeset purpose—implementing a rate limiting system for VoltAgent with configurable LLM and tool-level limits.
Description check ✅ Passed The PR description covers all key template sections: checklist items completed (message, issue link, tests, docs), current and new behavior documented, examples provided, and addresses the feature's purpose comprehensively.
Docstring Coverage ✅ Passed Docstring coverage is 100.00% which is sufficient. The required threshold is 80.00%.


Contributor

Copilot AI left a comment


Pull request overview

Adds a new rate-limiting module to VoltAgent core and wires it into the agent execution flow to control LLM call frequency and tool execution frequency, with accompanying tests and a runnable example project.

Changes:

  • Introduces RateLimitManager, FixedWindowCounterLimiter, and RateLimitExceededError plus public types/exports.
  • Integrates rate-limit checks into Agent.generateText, streaming, and tool execution paths via AgentOptions.rateLimits.
  • Adds a new examples/with-rate-limiting project plus unit tests for the manager and limiter.

Reviewed changes

Copilot reviewed 16 out of 16 changed files in this pull request and generated 10 comments.

Show a summary per file
File Description
packages/core/src/rate-limit/types.ts Defines rate-limiting configuration/types and core RateLimiter interface.
packages/core/src/rate-limit/manager.ts Implements RateLimitManager to create/cache limiters and enforce LLM/tool limits.
packages/core/src/rate-limit/manager.spec.ts Unit tests for manager behavior (LLM/tool scopes, delay/throw, reset/stats).
packages/core/src/rate-limit/limiters/index.ts Barrel export for limiter implementations.
packages/core/src/rate-limit/limiters/fixed-window.ts Implements fixed-window counter limiter with delay/throw behaviors.
packages/core/src/rate-limit/limiters/fixed-window.spec.ts Unit tests for the fixed-window limiter.
packages/core/src/rate-limit/index.ts Public exports for the rate-limit submodule.
packages/core/src/rate-limit/errors.ts Defines RateLimitExceededError including stats and retry-after metadata.
packages/core/src/index.ts Re-exports rate-limiting APIs from the package root.
packages/core/src/agent/types.ts Adds rateLimits?: AgentRateLimitConfig to AgentOptions.
packages/core/src/agent/agent.ts Enforces rate limits before LLM calls and tool executions.
examples/with-rate-limiting/src/index.ts Demonstrates LLM/tool limits and delay/throw strategies.
examples/with-rate-limiting/README.md Documents how to run/configure the example and rate limiting.
examples/with-rate-limiting/package.json Adds example app package definition and dependencies.
examples/with-rate-limiting/.gitignore Ignores build artifacts and local state for the example.
examples/with-rate-limiting/.env.example Provides required environment variable for Gemini API.


Contributor

@cubic-dev-ai cubic-dev-ai bot left a comment


8 issues found across 16 files

Prompt for AI agents (unresolved issues)

Check if these issues are valid — if so, understand the root cause of each and fix them. If appropriate, use sub-agents to investigate and fix each issue separately.


<file name="examples/with-rate-limiting/src/index.ts">

<violation number="1" location="examples/with-rate-limiting/src/index.ts:67">
P2: `retryAfter` is computed in milliseconds (`resetAt.getTime() - Date.now()`), but this line labels it as "seconds". Divide by 1000 or change the label to "milliseconds".</violation>

<violation number="2" location="examples/with-rate-limiting/src/index.ts:255">
P2: The main() function says to uncomment examples, but all example calls are active, so running this file executes every example by default (including rate-limited/delay ones). This is misleading and can trigger unexpected API usage/delays.</violation>
</file>

<file name="examples/with-rate-limiting/README.md">

<violation number="1" location="examples/with-rate-limiting/README.md:22">
P2: README provider setup text is inconsistent with the actual provider/env var, which can cause incorrect credential configuration.</violation>

<violation number="2" location="examples/with-rate-limiting/README.md:51">
P2: The `rateLimits.providers` configuration shown here is not implemented — `AgentRateLimitConfig` has no `providers` field and `types.ts` explicitly notes provider-specific limits are unimplemented. This snippet will mislead users into thinking provider-scoped limits work. Remove it or clearly mark it as a future/unimplemented feature.</violation>
</file>

<file name="packages/core/src/rate-limit/manager.spec.ts">

<violation number="1" location="packages/core/src/rate-limit/manager.spec.ts:50">
P2: The resetAll test expects a RateLimitExceededError after 5 requests, but the configured LLM limit is 10. The rejection assertion is unreachable with this setup, so the test doesn’t actually validate reset behavior and may fail incorrectly.</violation>

<violation number="2" location="packages/core/src/rate-limit/manager.spec.ts:175">
P2: Test description says missing `maxRequestsPerMinute` should result in "no rate limiting applied", but the implementation defaults to 60 rpm when `config.llm` exists (see `createLimiter`: `config.maxRequestsPerMinute || 60`). This test passes only because 10 < 60. Either gate limiter creation on `maxRequestsPerMinute` being explicitly set (to truly skip rate limiting), or update the test to document the defaulting behavior.</violation>
</file>

<file name="packages/core/src/rate-limit/limiters/fixed-window.ts">

<violation number="1" location="packages/core/src/rate-limit/limiters/fixed-window.ts:53">
P2: The delay path in acquire() is not concurrency-safe: multiple callers that exceed the limit can await concurrently and then each reset `count/windowStart`, causing lost increments and allowing more than `limit` requests in a window.</violation>
</file>

<file name="packages/core/src/rate-limit/manager.ts">

<violation number="1" location="packages/core/src/rate-limit/manager.ts:109">
P2: Using `||` for `maxRequestsPerMinute` treats an explicit `0` as “unset” and silently applies the default limit; there’s also no validation to reject invalid values. This can weaken rate limiting if a misconfigured zero is provided.</violation>
</file>

Reply with feedback, questions, or to request a fix. Tag @cubic-dev-ai to re-run a review.

Contributor

@coderabbitai coderabbitai bot left a comment


Actionable comments posted: 8

🧹 Nitpick comments (4)
packages/core/src/rate-limit/manager.ts (2)

131-140: The default case returning "unknown" may mask configuration errors.

If an unexpected scopeId.type is passed, the function silently returns "unknown", which could cause multiple unrelated scopes to share the same limiter. Consider logging a warning or throwing an error for unexpected types.

Suggested improvement
   private getScopeKey(scopeId: RateLimitScopeId): string {
     switch (scopeId.type) {
       case "global":
         return "global:llm";
       case "tool":
         return `tool:${scopeId.agentId}:${scopeId.toolName}`;
       default:
+        this.logger?.warn(`Unknown rate limit scope type: ${(scopeId as RateLimitScopeId).type}`);
         return "unknown";
     }
   }
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@packages/core/src/rate-limit/manager.ts` around lines 131 - 140, The
getScopeKey function currently returns "unknown" for unexpected
RateLimitScopeId.type values which can mask config errors; update getScopeKey to
explicitly handle invalid types by throwing an error (or at minimum logging a
warning) that includes the unexpected scopeId.type and the full scopeId object
so callers can identify the source, and keep the existing cases for "global" and
"tool" intact; reference the getScopeKey method and the RateLimitScopeId type
when making this change.

31-42: Consider documenting the implicit rate limiting when llm config is present but incomplete.

When config.llm exists (even without maxRequestsPerMinute), a limiter is created with a default of 60 requests/min. This could be surprising - an empty llm: {} config silently enables rate limiting.

Consider either:

  1. Requiring maxRequestsPerMinute to be set for the limiter to be created
  2. Documenting this behavior clearly
Option 1: Only create limiter when maxRequestsPerMinute is explicitly set
   async checkLLMRateLimit(): Promise<void> {
     // Check global LLM limit (if configured)
-    if (this.config.llm) {
+    if (this.config.llm?.maxRequestsPerMinute) {
       const scopeId: RateLimitScopeId = {
         type: "global",
       };
       const limiter = this.getLimiter(scopeId, this.config.llm);
       await limiter.acquire();
     }
 
     // No rate limit configured - allow through
   }
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@packages/core/src/rate-limit/manager.ts` around lines 31 - 42, The code in
checkLLMRateLimit currently creates a limiter whenever config.llm is truthy,
which silently enables a default 60rpm limiter; update checkLLMRateLimit to only
create/await a limiter when config.llm.maxRequestsPerMinute is explicitly set
(or non-null) to avoid implicit enforcement, i.e. check config.llm && typeof
config.llm.maxRequestsPerMinute === "number" before constructing
RateLimitScopeId and calling this.getLimiter/acquire; reference the
checkLLMRateLimit method, config.llm, maxRequestsPerMinute, RateLimitScopeId and
getLimiter to locate and change the logic accordingly.
packages/core/src/rate-limit/types.ts (1)

43-64: Asymmetric optionality between LLM and tool rate limit configs.

LLMRateLimitConfig.maxRequestsPerMinute is optional (line 45), but ToolRateLimitConfig.maxRequestsPerMinute is required (line 59). This asymmetry may be intentional, but it means:

  • LLM rate limiting can be configured without an explicit limit (defaulting to 60/min in the manager)
  • Tool rate limiting requires an explicit limit

If this is intentional, consider adding a JSDoc comment explaining why. If not, consider making them consistent.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@packages/core/src/rate-limit/types.ts` around lines 43 - 64, The two
interfaces LLMRateLimitConfig and ToolRateLimitConfig have inconsistent
optionality for maxRequestsPerMinute (LLMRateLimitConfig.maxRequestsPerMinute is
optional, ToolRateLimitConfig.maxRequestsPerMinute is required); decide which
behavior is intended and make them consistent: either change
ToolRateLimitConfig.maxRequestsPerMinute to optional to match
LLMRateLimitConfig, or make LLMRateLimitConfig.maxRequestsPerMinute required to
match ToolRateLimitConfig, and update any callers/tests accordingly; if the
asymmetry is intentional instead, add a clarifying JSDoc above the affected
symbols (LLMRateLimitConfig and/or ToolRateLimitConfig) explaining why one
requires an explicit limit while the other can rely on a manager default.
packages/core/src/rate-limit/manager.spec.ts (1)

163-166: Misleading variable name.

The variable openaiStats suggests provider-specific (OpenAI) stats, but per the PR summary, provider-specific configurations were removed. This is global LLM stats.

Suggested fix
       const stats = manager.getAllStats();
       expect(Object.keys(stats).length).toBe(1); // Only one limiter should exist

-      const openaiStats = Object.values(stats)[0];
-      expect(openaiStats.current).toBe(2); // Both requests counted
+      const llmStats = Object.values(stats)[0];
+      expect(llmStats.current).toBe(2); // Both requests counted
     });
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@packages/core/src/rate-limit/manager.spec.ts` around lines 163 - 166, The
test uses a misleading variable name openaiStats in
packages/core/src/rate-limit/manager.spec.ts; rename that variable to something
reflecting global LLM stats (e.g., llmStats or globalStats) and update the
inline comment that asserts its value (expect(...).toBe(2)) to reference
"LLM/global stats" instead of "Both requests counted" being tied to OpenAI;
locate the variable by the identifier openaiStats and the surrounding stats
usage and change all occurrences to the new name to keep the test clear and
consistent with the provider-agnostic design.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Inline comments:
In `@examples/with-rate-limiting/.gitignore`:
- Around line 1-4: The example's .gitignore is missing patterns to exclude local
environment files, so add entries for ".env" and common variants like
".env.local" (and optionally ".env.*" if you want to exclude all env files) to
the package's .gitignore to prevent accidental commits of real API keys; update
the existing .gitignore file to include these patterns alongside the current
entries (node_modules, dist, .DS_Store, .voltagent).

In `@examples/with-rate-limiting/README.md`:
- Around line 22-26: The README incorrectly tells users to "Set your OpenAI API
key" while showing the environment variable GOOGLE_GENERATIVE_AI_API_KEY; update
the setup instructions so the provider name and env var match: either change the
descriptive text to "Set your Google Generative AI API key" to match
GOOGLE_GENERATIVE_AI_API_KEY, or change the shown variable to the correct OpenAI
variable (e.g., OPENAI_API_KEY) to match the OpenAI wording; ensure the text and
the environment variable (GOOGLE_GENERATIVE_AI_API_KEY or OPENAI_API_KEY) are
consistent.

In `@examples/with-rate-limiting/src/index.ts`:
- Around line 262-265: The console message after examples finishes is
misleading; update the console.log call in the try block (the message that
currently reads "Note: Uncomment examples in main() to run them.") to reflect
reality (e.g., "Examples executed" or remove the note) so it no longer instructs
to uncomment examples in main(); locate the final console.log in main() / the
try block and replace the outdated string with an accurate status message.
- Around line 65-67: The log incorrectly labels
RateLimitExceededError.retryAfter as seconds while retryAfter is in
milliseconds; update the logging in examples/with-rate-limiting/src/index.ts to
convert error.retryAfter from milliseconds to seconds (e.g.,
Math.ceil(error.retryAfter / 1000)) or change the label to "milliseconds" so it
matches the value returned by RateLimitExceededError.retryAfter in errors.ts;
use the symbol RateLimitExceededError.retryAfter when locating the source of the
mismatch.

In `@packages/core/src/agent/agent.ts`:
- Around line 542-555: The provider extracted by extractProviderFromModel(model)
is only used for logging and never passed to RateLimitManager, so
provider-specific limits (e.g., rateLimits.providers.openai) are ignored; update
all places where rateLimitManager.checkLLMRateLimit() is called (including the
blocks around extractProviderFromModel at the top-level LLM call, the other
similar block later, and the long-running generation block) to pass the provider
(or provider id) into checkLLMRateLimit(provider) and adjust the
RateLimitManager.checkLLMRateLimit signature/implementation accordingly to
accept and use the provider to select per-provider configs, keeping the existing
debug log calls (methodLogger.debug) intact but now relying on the provider
argument to enforce provider-specific limits.
- Around line 542-549: generateObject() and streamObject() are bypassing the LLM
quota because they don't call the rate limiter; add the same rate-limit check
used in generateText()/streamText() to both generateObject and streamObject (and
their streaming counterparts referenced around the other block), i.e., extract
provider via extractProviderFromModel(model), set modelId = modelName, then
await this.rateLimitManager.checkLLMRateLimit(provider, modelId) (or the same
call signature used in generateText/streamText) before invoking the provider
directly; update both functions to early-await the rateLimitManager when
this.rateLimitManager exists so object-generation cannot skip the llm quota.

In `@packages/core/src/rate-limit/limiters/fixed-window.ts`:
- Around line 66-79: The delay branch in the acquire logic of fixed-window
limiter (the method that checks this.count, this.windowStart and calls
this.delay) can race because multiple waiters reset this.count and
this.windowStart concurrently; modify the acquire() implementation in
fixed-window.ts to serialize delayed waiters by introducing a per-window pending
promise or FIFO queue (e.g., a shared Promise/resolve pair or waiter queue on
the limiter instance) so only one waiter resets the window and allows the next
waiter to proceed and update count; ensure the code that currently does "await
this.delay(waitTime); this.count = 0; this.windowStart = Date.now();" is
replaced with enqueuing/waiting on the shared waiter mechanism and that
increments to this.count happen under that serialized path to enforce the limit.

In `@packages/core/src/rate-limit/manager.spec.ts`:
- Around line 132-143: The test incorrectly expects a rejection after 5 requests
even though the test setup (beforeEach) sets maxRequestsPerMinute to 10; update
the test that uses manager.checkLLMRateLimit() so it consumes all 10 allowed
requests (for loop i < 10) and then assert that the 11th call throws
RateLimitExceededError, keeping the manager.resetAll() call and final assertion
that a post-reset checkLLMRateLimit() resolves; reference the methods
manager.checkLLMRateLimit, manager.resetAll and the RateLimitExceededError when
making the change.

---

Nitpick comments:
In `@packages/core/src/rate-limit/manager.spec.ts`:
- Around line 163-166: The test uses a misleading variable name openaiStats in
packages/core/src/rate-limit/manager.spec.ts; rename that variable to something
reflecting global LLM stats (e.g., llmStats or globalStats) and update the
inline comment that asserts its value (expect(...).toBe(2)) to reference
"LLM/global stats" instead of "Both requests counted" being tied to OpenAI;
locate the variable by the identifier openaiStats and the surrounding stats
usage and change all occurrences to the new name to keep the test clear and
consistent with the provider-agnostic design.

In `@packages/core/src/rate-limit/manager.ts`:
- Around line 131-140: The getScopeKey function currently returns "unknown" for
unexpected RateLimitScopeId.type values which can mask config errors; update
getScopeKey to explicitly handle invalid types by throwing an error (or at
minimum logging a warning) that includes the unexpected scopeId.type and the
full scopeId object so callers can identify the source, and keep the existing
cases for "global" and "tool" intact; reference the getScopeKey method and the
RateLimitScopeId type when making this change.
- Around line 31-42: The code in checkLLMRateLimit currently creates a limiter
whenever config.llm is truthy, which silently enables a default 60rpm limiter;
update checkLLMRateLimit to only create/await a limiter when
config.llm.maxRequestsPerMinute is explicitly set (or non-null) to avoid
implicit enforcement, i.e. check config.llm && typeof
config.llm.maxRequestsPerMinute === "number" before constructing
RateLimitScopeId and calling this.getLimiter/acquire; reference the
checkLLMRateLimit method, config.llm, maxRequestsPerMinute, RateLimitScopeId and
getLimiter to locate and change the logic accordingly.

In `@packages/core/src/rate-limit/types.ts`:
- Around line 43-64: The two interfaces LLMRateLimitConfig and
ToolRateLimitConfig have inconsistent optionality for maxRequestsPerMinute
(LLMRateLimitConfig.maxRequestsPerMinute is optional,
ToolRateLimitConfig.maxRequestsPerMinute is required); decide which behavior is
intended and make them consistent: either change
ToolRateLimitConfig.maxRequestsPerMinute to optional to match
LLMRateLimitConfig, or make LLMRateLimitConfig.maxRequestsPerMinute required to
match ToolRateLimitConfig, and update any callers/tests accordingly; if the
asymmetry is intentional instead, add a clarifying JSDoc above the affected
symbols (LLMRateLimitConfig and/or ToolRateLimitConfig) explaining why one
requires an explicit limit while the other can rely on a manager default.

ℹ️ Review info
⚙️ Run configuration

Configuration used: defaults

Review profile: CHILL

Plan: Pro

Run ID: 0894e402-5803-40df-9f49-99ab3ee6f2df

📥 Commits

Reviewing files that changed from the base of the PR and between bac1f49 and 59575ae.

📒 Files selected for processing (16)
  • examples/with-rate-limiting/.env.example
  • examples/with-rate-limiting/.gitignore
  • examples/with-rate-limiting/README.md
  • examples/with-rate-limiting/package.json
  • examples/with-rate-limiting/src/index.ts
  • packages/core/src/agent/agent.ts
  • packages/core/src/agent/types.ts
  • packages/core/src/index.ts
  • packages/core/src/rate-limit/errors.ts
  • packages/core/src/rate-limit/index.ts
  • packages/core/src/rate-limit/limiters/fixed-window.spec.ts
  • packages/core/src/rate-limit/limiters/fixed-window.ts
  • packages/core/src/rate-limit/limiters/index.ts
  • packages/core/src/rate-limit/manager.spec.ts
  • packages/core/src/rate-limit/manager.ts
  • packages/core/src/rate-limit/types.ts

Contributor

@coderabbitai coderabbitai bot left a comment


Actionable comments posted: 3

🧹 Nitpick comments (1)
examples/with-rate-limiting/README.md (1)

9-10: Feature bullets 3 and 4 are redundant and can be merged for clarity.

This reads like the same concept twice in the top-level list. Consider renaming one to match a distinct scenario (e.g., provider-specific).

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@examples/with-rate-limiting/README.md` around lines 9 - 10, The two README
bullets "Tool-Specific Rate Limiting" and "Tool Rate Limiting" are redundant;
update the list to merge them into a single clear bullet (e.g., "Tool-Specific
Rate Limiting — set different limits per tool/provider") or rename one to a
distinct scenario such as "Provider-Specific Rate Limiting" so the top-level
list has unique, non-overlapping items and improved clarity.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Inline comments:
In `@examples/with-rate-limiting/README.md`:
- Around line 112-126: The Example 3 section titled "Tool-Specific Rate
Limiting" is inconsistent with the PR (it should show provider-specific limits);
update the Example 3 content to describe provider-specific rate limits instead
of tool-specific ones — rename the heading to "Provider-Specific Rate Limiting"
(or similar), list example limits for providers (e.g., `google_api`: 100
requests/minute, `global`: 1000 requests/minute), state that each provider has
independent rate limit counters, and keep the Key Features bullet points aligned
(multiple providers with different limits, per-provider tracking, `onExceeded:
"throw"` for strict enforcement) so the documentation matches the
implementation.
- Around line 34-35: The README snippet advertises unsupported strategy values;
update the example config so the strategy field only lists the implemented
option ("fixed_window") and remove the misleading mentions of "sliding_window",
"token_bucket", and "leaky_bucket" (or add a clear TODO note that other
strategies are not yet implemented) so readers won’t pass non-functional
options; update the lines referencing strategy and keep the onExceeded example
as-is (refer to the "strategy" and "onExceeded" fields in the snippet).
- Around line 185-189: The main() snippet wrongly calls
example4_toolRateLimiting() duplicating the tool-rate-limiting example; replace
that duplicated call so the Example 3 slot invokes the provider-specific example
(i.e., call example3_providerRateLimiting() instead of
example4_toolRateLimiting()) so the sequence runs example1_basicLLMRateLimit(),
example2_delayStrategy(), example3_providerRateLimiting(),
example4_toolRateLimiting(), example5_combinedLimits().

---

Nitpick comments:
In `@examples/with-rate-limiting/README.md`:
- Around line 9-10: The two README bullets "Tool-Specific Rate Limiting" and
"Tool Rate Limiting" are redundant; update the list to merge them into a single
clear bullet (e.g., "Tool-Specific Rate Limiting — set different limits per
tool/provider") or rename one to a distinct scenario such as "Provider-Specific
Rate Limiting" so the top-level list has unique, non-overlapping items and
improved clarity.
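Two of the inline fixes above (the `strategy` field and the `main()` sequence) might look like this once applied; the config shape and the stub bodies are assumptions, since the real examples live in `examples/with-rate-limiting/src/index.ts`:

```typescript
// Sketch of the corrected snippet; only the implemented strategy is advertised.
const rateLimit = {
  llm: {
    maxRequestsPerMinute: 10,
    strategy: "fixed_window", // the only implemented strategy; others are not yet available
    onExceeded: "throw", // or "delay" to wait until the window resets
  },
};

// Stubs standing in for the real examples; they record execution order.
const ran: string[] = [];
async function example1_basicLLMRateLimit() { ran.push("1"); }
async function example2_delayStrategy() { ran.push("2"); }
async function example3_providerRateLimiting() { ran.push("3"); }
async function example4_toolRateLimiting() { ran.push("4"); }
async function example5_combinedLimits() { ran.push("5"); }

// Corrected sequence: the Example 3 slot now invokes the provider-specific
// example instead of duplicating the tool one.
async function main() {
  await example1_basicLLMRateLimit();
  await example2_delayStrategy();
  await example3_providerRateLimiting();
  await example4_toolRateLimiting();
  await example5_combinedLimits();
}
```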
ℹ️ Review info
⚙️ Run configuration

Configuration used: defaults

Review profile: CHILL

Plan: Pro

Run ID: 3b8f61b4-bd33-4b6a-b837-d1ebc5792fc1

📥 Commits

Reviewing files that changed from the base of the PR and between 59575ae and 486f11b.

📒 Files selected for processing (1)
  • examples/with-rate-limiting/README.md


@cubic-dev-ai cubic-dev-ai bot left a comment


2 issues found across 2 files (changes from recent commits).

Prompt for AI agents (unresolved issues)

Check if these issues are valid — if so, understand the root cause of each and fix them. If appropriate, use sub-agents to investigate and fix each issue separately.


<file name="examples/with-rate-limiting/README.md">

<violation number="1" location="examples/with-rate-limiting/README.md:34">
P2: README documents unsupported rate limiting strategies (`sliding_window`, `token_bucket`, `leaky_bucket`) even though core implementation only supports `fixed_window`.</violation>

<violation number="2" location="examples/with-rate-limiting/README.md:61">
P2: README documents unsupported `onLimitExceeded` rate-limit hook and context fields, creating a false API contract for users.</violation>
</file>

Reply with feedback, questions, or to request a fix. Tag @cubic-dev-ai to re-run a review.
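For context on the one strategy the core does implement, a fixed-window counter can be sketched as follows; the class and method names are illustrative assumptions, not the actual API of `packages/core/src/rate-limit/limiters/fixed-window.ts`:

```typescript
// Minimal fixed-window counter; a sketch, not VoltAgent's implementation.
class FixedWindowLimiter {
  private count = 0;
  private windowStart: number | null = null;

  constructor(
    private readonly maxRequests: number,
    private readonly windowMs = 60_000, // one-minute window by default
  ) {}

  /** Returns true if the call fits in the current window, false if the limit is hit. */
  tryAcquire(now = Date.now()): boolean {
    // Start a fresh window on first use or once the current one has elapsed.
    if (this.windowStart === null || now - this.windowStart >= this.windowMs) {
      this.windowStart = now;
      this.count = 0;
    }
    if (this.count >= this.maxRequests) return false; // "throw" strategy would raise here
    this.count += 1;
    return true;
  }
}
```

Under this sketch, a "delay" strategy would sleep until `windowStart + windowMs` instead of returning false, and a per-provider setup would keep one limiter instance per provider key.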

@sujal12344

@omeraplak I addressed all the issues and also created a hook; please take a look and give me feedback.

