src/oss/javascript/integrations/middleware/anthropic.mdx (+5 −3)
@@ -10,7 +10,9 @@ Middleware specifically designed for Anthropic's Claude models. Learn more about
 ## Prompt caching

-Reduce costs and latency by caching static or repetitive prompt content (like system prompts, tool definitions, and conversation history) on Anthropic's servers. This middleware implements a **conversational caching strategy** that places cache breakpoints after the most recent message, allowing the entire conversation history (including the latest user message) to be cached and reused in subsequent API calls. Prompt caching is useful for the following:
+Reduce costs and latency by caching static or repetitive prompt content (like system prompts, tool definitions, and conversation history) on Anthropic's servers. This middleware implements a **conversational caching strategy** that places cache breakpoints after the most recent message, allowing the entire conversation history (including the latest user message) to be cached and reused in subsequent API calls.
+
+Prompt caching is useful for the following:

 - Applications with long, static system prompts that don't change between requests
 - Agents with many tool definitions that remain constant across invocations
@@ -44,8 +46,8 @@ const agent = createAgent({
 The middleware caches content up to and including the latest message in each request. On subsequent requests within the TTL window (5 minutes or 1 hour), previously seen content is retrieved from cache rather than reprocessed, significantly reducing costs and latency.

 **How it works:**

-1. First request: System prompt, tools, and the user message "Hi, my name is Bob" are sent to the API and cached
-2. Second request: The cached content (system prompt, tools, and first message) is retrieved from cache. Only the new message "What's my name?" needs to be processed, plus the model's response from the first request
+1. First request: System prompt, tools, and the user message *"Hi, my name is Bob"* are sent to the API and cached
+2. Second request: The cached content (system prompt, tools, and first message) is retrieved from cache. Only the new message *"What's my name?"* needs to be processed, plus the model's response from the first request
 3. This pattern continues for each turn, with each request reusing the cached conversation history
src/oss/python/integrations/middleware/anthropic.mdx (+17 −7)
@@ -14,7 +14,9 @@ Middleware specifically designed for Anthropic's Claude models. Learn more about
 ## Prompt caching

-Reduce costs and latency by caching static or repetitive prompt content (like system prompts, tool definitions, and conversation history) on Anthropic's servers. This middleware implements a **conversational caching strategy** that places cache breakpoints after the most recent message, allowing the entire conversation history (including the latest user message) to be cached and reused in subsequent API calls. Prompt caching is useful for the following:
+Reduce costs and latency by caching static or repetitive prompt content (like system prompts, tool definitions, and conversation history) on Anthropic's servers. This middleware implements a **conversational caching strategy** that places cache breakpoints after the most recent message, allowing the entire conversation history (including the latest user message) to be cached and reused in subsequent API calls.
+
+Prompt caching is useful for the following:

 - Applications with long, static system prompts that don't change between requests
 - Agents with many tool definitions that remain constant across invocations
@@ -64,8 +66,8 @@ agent = create_agent(
 The middleware caches content up to and including the latest message in each request. On subsequent requests within the TTL window (5 minutes or 1 hour), previously seen content is retrieved from cache rather than reprocessed, significantly reducing costs and latency.

 **How it works:**

-1. First request: System prompt, tools, and the user message "Hi, my name is Bob" are sent to the API and cached
-2. Second request: The cached content (system prompt, tools, and first message) is retrieved from cache. Only the new message "What's my name?" needs to be processed, plus the model's response from the first request
+1. First request: System prompt, tools, and the user message *"Hi, my name is Bob"* are sent to the API and cached
+2. Second request: The cached content (system prompt, tools, and first message) is retrieved from cache. Only the new message *"What's my name?"* needs to be processed, plus the model's response from the first request
 3. This pattern continues for each turn, with each request reusing the cached conversation history
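The conversational caching strategy described in the steps above can be sketched in plain Python. Everything here is illustrative rather than the middleware's actual internals (the `add_cache_breakpoint` helper and message dicts are hypothetical), but the `cache_control` marker and the `5m`/`1h` TTL values follow Anthropic's prompt-caching request format:

```python
# Illustrative sketch: place a cache breakpoint on the most recent message
# so the entire conversation prefix (system prompt, tools, history) becomes
# cacheable. Helper name and structure are hypothetical, not the middleware's.

def add_cache_breakpoint(messages, ttl="5m"):
    """Attach an ephemeral cache_control marker to the last content block
    of the most recent message, in Anthropic Messages API format."""
    if not messages:
        return messages
    marked = [dict(m) for m in messages]  # shallow-copy each message dict
    content = marked[-1]["content"]
    if isinstance(content, str):          # normalize string content to block form
        content = [{"type": "text", "text": content}]
    else:
        content = [dict(b) for b in content]
    # The breakpoint goes on the final block, so everything before it is cached.
    content[-1] = {**content[-1],
                   "cache_control": {"type": "ephemeral", "ttl": ttl}}
    marked[-1]["content"] = content
    return marked

history = [
    {"role": "user", "content": "Hi, my name is Bob"},
    {"role": "assistant", "content": "Hello Bob!"},
    {"role": "user", "content": "What's my name?"},
]
marked = add_cache_breakpoint(history)
```

Because the breakpoint sits after the latest message, the next turn's request shares the entire marked prefix with this one and reads it from cache.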
70
72
71
73
```python
@@ -99,7 +101,9 @@ agent.invoke({"messages": [HumanMessage("What's my name?")]})
 ## Bash tool

-Execute Claude's native `bash_20250124` tool with local command execution. The bash tool middleware is useful for the following:
+Execute Claude's native `bash_20250124` tool with local command execution.
+
+The bash tool middleware is useful for the following:

 - Using Claude's built-in bash tool with local execution
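The "local command execution" half of this pairing can be sketched as follows; the handler name and result shape are hypothetical (the middleware's real interface may differ), but the idea is the same: when Claude emits a `bash_20250124` tool call, run the command locally and return its output as the tool result.

```python
# Illustrative sketch of locally executing one bash tool invocation.
# run_bash_tool and its result dict are hypothetical names, not the
# middleware's actual API.
import subprocess

def run_bash_tool(tool_input, timeout=30):
    """Execute a single bash tool call locally and capture its output."""
    if tool_input.get("restart"):
        # The bash tool can request a fresh session; nothing to run here.
        return {"output": "", "error": "", "exit_code": 0}
    proc = subprocess.run(
        ["bash", "-c", tool_input["command"]],
        capture_output=True, text=True, timeout=timeout,
    )
    return {
        "output": proc.stdout,
        "error": proc.stderr,
        "exit_code": proc.returncode,
    }

result = run_bash_tool({"command": "echo hello"})
```

A production version would add sandboxing and persistent shell state between calls; this sketch runs each command in a fresh `bash -c` process.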
-Provide Claude's memory tool (`memory_20250818`) for persistent agent memory across conversation turns. The memory middleware is useful for the following:
+Provide Claude's memory tool (`memory_20250818`) for persistent agent memory across conversation turns.
+
+The memory middleware is useful for the following:
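A backend for the memory tool can be sketched as a small command handler: the model issues commands such as `create`, `view`, and `str_replace` against a `/memories` file tree, and the agent persists the results between turns. The class below is hypothetical and uses an in-memory dict in place of real file storage; only the command names follow the memory tool's documented shape.

```python
# Illustrative, minimal memory-tool backend. InMemoryMemoryStore is a
# hypothetical stand-in for the middleware's persistent storage.

class InMemoryMemoryStore:
    def __init__(self):
        self.files = {}  # path -> text contents

    def handle(self, command):
        """Dispatch one memory tool command and return its result string."""
        cmd, path = command["command"], command["path"]
        if cmd == "create":
            self.files[path] = command["file_text"]
            return "ok"
        if cmd == "view":
            return self.files.get(path, "")
        if cmd == "str_replace":
            self.files[path] = self.files[path].replace(
                command["old_str"], command["new_str"], 1
            )
            return "ok"
        raise ValueError(f"unsupported command: {cmd}")

store = InMemoryMemoryStore()
store.handle({"command": "create", "path": "/memories/user.md",
              "file_text": "name: Bob"})
store.handle({"command": "str_replace", "path": "/memories/user.md",
              "old_str": "Bob", "new_str": "Alice"})
note = store.handle({"command": "view", "path": "/memories/user.md"})
```

Because the store outlives individual model calls, notes written in one turn (here, the user's name) are readable in later turns.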