Merged
41 changes: 35 additions & 6 deletions AGENTS.md
@@ -38,22 +38,23 @@ Override the LLM endpoint with `OPENAI_BASE_URL` (defaults to `https://api.opena

**Module map:**
- [src/lib.rs](src/lib.rs) — module re-exports.
- [src/builder.rs](src/builder.rs) — `AgentWorkerBuilder` fluent builder; wires LLM + tools into a Temporal `Worker`.
- [src/builder.rs](src/builder.rs) — `AgentWorkerBuilder` fluent builder; wires LLM + tools + memory provider into a Temporal `Worker`.
- [src/workflow.rs](src/workflow.rs) — `AgentWorkflow` with `#[run]`, `#[signal] add_user_message`, `#[query] get_state`, `#[query] turn_count`. Owns the ReAct loop.
- [src/activities.rs](src/activities.rs) — `AgentActivities::llm_chat` and `AgentActivities::execute_tool`. The *only* place LLM providers and tool implementations execute.
- [src/llm.rs](src/llm.rs) — translation between local `Message`/`ToolSchema` types and AutoAgents `ChatMessage`/`LlmTool`; native-tool-call parsing with fenced-JSON fallback. The only file that touches `autoagents_llm` types in the hot path (`src/llm.rs:6`).
- [src/state.rs](src/state.rs) — `AgentInput`, `AgentOutput`, `AgentState`, `Message`, `ToolCall`, `ToolResult`, `ToolSchema`, `LlmResponse`, `StopReason`, plus `compact()`.
- [src/state.rs](src/state.rs) — `AgentInput`, `AgentOutput`, `AgentState`, `Message`, `ToolCall`, `ToolResult`, `ToolSchema`, `LlmResponse`, `StopReason`.
- [src/memory.rs](src/memory.rs) — `MemoryProvider` trait, default `SlidingWindowMemory` impl, and the `compact_sliding_window` kernel. Pluggable compaction strategy consulted by the workflow before every turn.
- [src/tool.rs](src/tool.rs) — `ToolRegistry` (immutable name→impl map) and its builder.
- [src/error.rs](src/error.rs) — `AgentError` with `is_retryable()` to distinguish transient vs. permanent.
- [src/prelude.rs](src/prelude.rs) — convenience re-exports including AutoAgents traits (`ToolT`, `LLMProvider`, `ToolRuntime`, `ToolCallError`).
- [src/prelude.rs](src/prelude.rs) — convenience re-exports including AutoAgents traits (`ToolT`, `LLMProvider`, `ToolRuntime`, `ToolCallError`) and memory types (`MemoryProvider`, `SlidingWindowMemory`).

**Public API surface (what a user actually touches):** `AgentWorkerBuilder`, `AgentWorkflow`, `AgentInput`/`AgentOutput`, `ToolRegistry`. Users supply their own `Arc<dyn LLMProvider>` and `Arc<dyn ToolT>` from AutoAgents.
**Public API surface (what a user actually touches):** `AgentWorkerBuilder`, `AgentWorkflow`, `AgentInput`/`AgentOutput`, `ToolRegistry`, `MemoryProvider`/`SlidingWindowMemory`. Users supply their own `Arc<dyn LLMProvider>` and `Arc<dyn ToolT>` from AutoAgents.

**Non-obvious behaviors to preserve when editing:**

- **History compaction.** When `AgentState::history.len()` exceeds `CONTINUE_AS_NEW_THRESHOLD = 200` (`src/workflow.rs:36`), the workflow calls `continue_as_new` with a compacted state: summary prepended to the system prompt, last 20 messages kept (`src/state.rs::compact`). Any change to the message shape needs to round-trip through `compact()`.
- **History compaction is pluggable.** The workflow consults `MemoryProvider::should_compact` before every turn; on `true` it calls `MemoryProvider::compact` and `continue_as_new` with the returned `AgentInput`. Default provider is `SlidingWindowMemory` (`compact_threshold = 200`, `keep_recent = 20`), preserving the legacy hardcoded behavior. Override via `AgentWorkerBuilder::memory(Arc::new(SlidingWindowMemory::new().with_compact_threshold(N).with_keep_recent(K)))` or supply your own `Arc<dyn MemoryProvider>`. Trait impls MUST be pure and sync — they run inside the deterministic workflow body. The kernel summarizer lives at `src/memory.rs::compact_sliding_window`; any change to the `Message` shape needs to round-trip through it.
- **Tool error semantics.** Tool-side failures return `Ok(ToolResult { error: Some(...) })` so the LLM can see and recover from them (`src/activities.rs:59-88`). Only infrastructure errors (missing tool, serde failure) surface as activity `Err`, which Temporal retries.
- **`WORKER_TOOL_CATALOG`.** A process-global `OnceCell` set once at worker init in `build_worker` (`src/builder.rs:34`). The deterministic workflow body reads it on every replay, so it must be set before the worker starts and never mutated after.
- **Process-global worker config (`WORKER_TOOL_CATALOG`, `WORKER_MEMORY`).** Two `OnceCell`s in `src/builder.rs` published by `build_worker`. The deterministic workflow body reads them on every replay, so they must be set before the worker starts and never mutated after. Building a second worker in the same process with a *different* catalog (compared by `PartialEq`) or a different memory `Arc` (compared by `Arc::ptr_eq`) returns `AgentError::Other` — multi-worker setups in one process must share the same `Arc<dyn MemoryProvider>` and register identical tools in the same order.
- **Activity timeouts.** Set inside `AgentWorkflow::run` at `src/workflow.rs:66-74`: LLM activity 120s start-to-close / 30s heartbeat, tool activity **3600s** start-to-close (generous on purpose — supports human-in-the-loop tools that block on stdin/HTTP/async-completion).
- **Mid-conversation user input.** The `add_user_message` signal pushes into `pending_user_messages`, drained at the top of each loop iteration (`src/workflow.rs:145-152`). Don't mutate `history` directly from signal handlers — that races with the in-flight `llm_chat` activity.
- **Dual LLM response parsing.** `src/llm.rs` tries native tool calls first, then falls back to a fenced `\`\`\`tool_calls` JSON block so non-OpenAI providers still work.
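
The tool error semantics above can be sketched roughly as follows. This is a minimal illustration with stand-in types: `ToolResult`, `ActivityError`, `ToolFn`, and this `execute_tool` signature are hypothetical and do not mirror `src/activities.rs`. A tool-side failure is folded into `Ok(ToolResult { error: Some(..) })` so the model can see it, while a missing tool surfaces as `Err` for the caller (Temporal, in the real crate) to retry.

```rust
use std::collections::HashMap;

// Stand-in result type; the crate's real `ToolResult` may carry other fields.
#[derive(Debug, PartialEq)]
struct ToolResult {
    output: Option<String>,
    error: Option<String>,
}

// Hypothetical infrastructure error, kept separate from tool-side failures.
#[derive(Debug, PartialEq)]
enum ActivityError {
    MissingTool(String),
}

// Hypothetical tool signature: string arguments in, output or failure message out.
type ToolFn = fn(&str) -> Result<String, String>;

fn execute_tool(
    tools: &HashMap<&str, ToolFn>,
    name: &str,
    args: &str,
) -> Result<ToolResult, ActivityError> {
    // Infrastructure failure: the tool is not registered, so return `Err`.
    let tool = tools
        .get(name)
        .copied()
        .ok_or_else(|| ActivityError::MissingTool(name.to_string()))?;

    // Tool-side failure: still `Ok`, with the message placed in `error`.
    Ok(match tool(args) {
        Ok(output) => ToolResult { output: Some(output), error: None },
        Err(message) => ToolResult { output: None, error: Some(message) },
    })
}

// Example tool: succeeds only when the argument parses as an unsigned integer.
fn parse_u32(args: &str) -> Result<String, String> {
    match args.trim().parse::<u32>() {
        Ok(n) => Ok(n.to_string()),
        Err(_) => Err(format!("not an unsigned integer: {}", args)),
    }
}
```

Calling `execute_tool` with a registered `parse_u32` and the argument `"x"` yields `Ok` with `error` set, whereas an unregistered name yields `Err(ActivityError::MissingTool(..))`.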
@@ -62,6 +63,34 @@ Override the LLM endpoint with `OPENAI_BASE_URL` (defaults to `https://api.opena
- Tools must be side-effect-safe on retry.
- `LLMProvider` and `ToolT` impls must be `Send + Sync + 'static`.
- Never invoke `LLMProvider` or `ToolT` from workflow code — only from activities.
- `MemoryProvider` impls must be pure, sync, and stateless (config-only) — `should_compact` and `compact` are called inside the workflow body and must return identical results on replay for the same `AgentState`.
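
To make the purity requirement concrete, here is a minimal sketch of a sliding-window provider. Every type in it is a stand-in: the trait shape, the `Message` and `AgentState` fields, and the summary text are assumptions for illustration, and the real `MemoryProvider::compact` returns an `AgentInput` rather than a state. The provider holds only configuration, and both methods depend on nothing but their arguments.

```rust
// Stand-in types; the crate's real `Message` and `AgentState` may differ.
#[derive(Clone, Debug, PartialEq)]
struct Message {
    role: String,
    content: String,
}

#[derive(Clone, Debug, PartialEq)]
struct AgentState {
    system_prompt: String,
    history: Vec<Message>,
}

// Hypothetical trait shape: synchronous, no I/O, no clocks, no randomness.
trait MemoryProvider: Send + Sync {
    fn should_compact(&self, state: &AgentState) -> bool;
    fn compact(&self, state: &AgentState) -> AgentState;
}

// Config-only provider: no per-conversation fields.
struct SlidingWindow {
    compact_threshold: usize,
    keep_recent: usize,
}

impl MemoryProvider for SlidingWindow {
    fn should_compact(&self, state: &AgentState) -> bool {
        state.history.len() > self.compact_threshold
    }

    fn compact(&self, state: &AgentState) -> AgentState {
        // Keep the last `keep_recent` messages; everything before is summarized.
        let split = state.history.len().saturating_sub(self.keep_recent);
        let (old, recent) = state.history.split_at(split);

        // A deliberately trivial, deterministic summary (an assumption; the
        // real kernel's summary format is not shown in this document).
        let summary = format!(
            "Prior conversation summary: {} earlier messages omitted.",
            old.len()
        );

        AgentState {
            system_prompt: format!("{}\n\n{}", summary, state.system_prompt),
            history: recent.to_vec(),
        }
    }
}
```

Because the output is a function of `state` and the two config fields only, calling `compact` twice on the same state gives equal results, which is the property replay depends on.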

## Documentation maintenance

After any change large enough to alter the public API surface, observable
behavior, defaults, or feature set, update the user-facing docs in the same
PR — stale docs are worse than no docs because they actively mislead.
Specifically:

- **[AGENTS.md](AGENTS.md)** (this file) — update the module map, public API
surface line, non-obvious behaviors, and determinism contract whenever any
of them change. Add new module entries here as soon as you create them.
- **[README.md](README.md)** — update the features list, examples list,
user-facing determinism contract, and any code snippets affected by API
changes. If you add a feature with its own knobs (caching, fallback,
memory backends, etc.), give it a short dedicated section like
"Pluggable memory backends" so users can find it without reading the
source.
- **[examples/](examples/)** — when adding a new top-level feature, ship
one runnable example that exercises it (model the new example on
`simple_math_agent` — same worker/client/status mode template, same
three-terminal flow). Register it in `Cargo.toml` under a new
`[[example]]` entry and add a one-line description plus a runnable
invocation to README.md's "Running the examples" block.

Rule of thumb: if you touched `src/lib.rs` re-exports, `src/prelude.rs`, or
`AgentWorkerBuilder`'s public API, you owe at least one edit to each of the
three above.

## Version pins

4 changes: 4 additions & 0 deletions Cargo.toml
@@ -70,3 +70,7 @@ path = "examples/pipelined_math_agent/main.rs"
[[example]]
name = "structured_output_agent"
path = "examples/structured_output_agent/main.rs"

[[example]]
name = "tunable_memory_agent"
path = "examples/tunable_memory_agent/main.rs"
67 changes: 66 additions & 1 deletion README.md
@@ -59,6 +59,10 @@ LLM tokens.
- `AgentWorkerBuilder` for one-line worker setup.
- Provider-agnostic: bring your own `Arc<dyn LLMProvider>` (OpenAI,
Anthropic, Ollama, etc. — anything supported by `autoagents_llm`).
- **Pluggable memory backends** via the `MemoryProvider` trait — default
`SlidingWindowMemory` matches the legacy hardcoded behavior; swap in
custom strategies through `AgentWorkerBuilder::memory`. See
[Pluggable memory backends](#pluggable-memory-backends).
- **Human-in-the-loop as a regular tool** — the library does not
special-case any tool name. See
[Human-in-the-loop tools](#human-in-the-loop-tools).
@@ -159,16 +163,62 @@ by it.
See `examples/pipelined_math_agent` for a runnable demo (`add` tool +
`PipelineBuilder(CacheLayer → FallbackLayer)` around two OpenAI models).

## Pluggable memory backends

History compaction is governed by an `Arc<dyn MemoryProvider>` published
to the worker via `AgentWorkerBuilder::memory`. The default — used when
`.memory(...)` is not called — is `SlidingWindowMemory` with
`compact_threshold = 200` and `keep_recent = 20`, which matches the
legacy hardcoded behavior.

```rust,ignore
use std::sync::Arc;
use temporal_agent_rs::prelude::*;

let memory: Arc<dyn MemoryProvider> = Arc::new(
    SlidingWindowMemory::new()
        .with_compact_threshold(50)
        .with_keep_recent(10),
);

AgentWorkerBuilder::new(client)
    .llm(llm)
    .tool(my_tool)
    .memory(memory)
    .build_worker(&runtime)?;
```

**Trait contract.** Implementations MUST be pure and synchronous —
`should_compact` and `compact` run inside the deterministic workflow
body and must return identical results on every replay for the same
`AgentState`. Per-conversation state belongs in `AgentState` (which
Temporal persists in workflow history), never in fields on the provider.

**Multi-worker setups.** Running multiple workers in the same process on
the same queue requires sharing the *same* `Arc<dyn MemoryProvider>` —
the builder fails fast (via `Arc::ptr_eq`) on mismatching instances to
prevent the second worker from silently inheriting the first worker's
provider while replay diverges.
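
The fail-fast check can be pictured with the sketch below. It is an assumption-laden illustration, not the builder's code: it uses the standard library's `OnceLock` in place of whatever `OnceCell` the crate uses, a hypothetical `publish_memory` helper, a `String` error instead of `AgentError::Other`, and an empty stand-in trait.

```rust
use std::sync::{Arc, OnceLock};

// Stand-in trait; only the `Send + Sync` bound matters for this sketch.
trait MemoryProvider: Send + Sync {}

#[allow(dead_code)]
struct SlidingWindowMemory {
    keep_recent: usize,
}

impl MemoryProvider for SlidingWindowMemory {}

// Process-global slot, written at most once.
static WORKER_MEMORY: OnceLock<Arc<dyn MemoryProvider>> = OnceLock::new();

// Hypothetical helper: publish `memory`, or verify that the already
// published provider is the very same allocation.
fn publish_memory(memory: Arc<dyn MemoryProvider>) -> Result<(), String> {
    // The first caller stores its Arc; later callers get the stored one back.
    let published = WORKER_MEMORY.get_or_init(|| Arc::clone(&memory));

    // Identity comparison: two providers with equal config are still
    // rejected unless they are clones of the same Arc.
    if Arc::ptr_eq(published, &memory) {
        Ok(())
    } else {
        Err("a different MemoryProvider is already published in this process".to_string())
    }
}
```

Under this scheme a second worker has to be handed `Arc::clone(&memory)` from the first; constructing a fresh `Arc::new(SlidingWindowMemory { .. })` with identical settings would still be rejected.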

See `examples/tunable_memory_agent` for a runnable demo of a tuned
`SlidingWindowMemory` plus a minimal custom `MemoryProvider` impl
(`KeepEverythingMemory`) gated behind a `KEEP_EVERYTHING=1` env switch.
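
As a rough idea of what such a switch could look like (a sketch only; the example's actual code is not reproduced here, and the trait shape, `AgentState` field, and `choose_memory` helper are hypothetical), the environment variable is read once at worker startup and never from inside the provider, which keeps `should_compact` pure:

```rust
use std::sync::Arc;

// Stand-in state; the real `AgentState` has more fields.
struct AgentState {
    history: Vec<String>,
}

// Hypothetical, trimmed-down trait with only the decision method.
trait MemoryProvider: Send + Sync {
    fn should_compact(&self, state: &AgentState) -> bool;
}

// Never compacts: the full history is kept for the whole run.
struct KeepEverythingMemory;

impl MemoryProvider for KeepEverythingMemory {
    fn should_compact(&self, _state: &AgentState) -> bool {
        false
    }
}

struct SlidingWindowMemory {
    compact_threshold: usize,
}

impl MemoryProvider for SlidingWindowMemory {
    fn should_compact(&self, state: &AgentState) -> bool {
        state.history.len() > self.compact_threshold
    }
}

// Pick a provider from the value of the env switch. The caller reads the
// variable outside workflow code, e.g.
// `choose_memory(std::env::var("KEEP_EVERYTHING").ok().as_deref())`.
fn choose_memory(keep_everything: Option<&str>) -> Arc<dyn MemoryProvider> {
    if keep_everything == Some("1") {
        Arc::new(KeepEverythingMemory)
    } else {
        Arc::new(SlidingWindowMemory { compact_threshold: 50 })
    }
}
```

Reading `KEEP_EVERYTHING` inside `should_compact` instead would make the result depend on process environment at replay time, which is exactly what the trait contract rules out.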

## Running the examples

Three examples ship with the crate:
Five examples ship with the crate:

- `simple_math_agent` — minimal autonomous loop with a single `add` tool.
- `interactive_math_agent` — adds an `ask_user` tool so the agent can pause
for human input on the worker's stdin.
- `pipelined_math_agent` — same `add` tool, but the provider is wrapped with
`PipelineBuilder → CacheLayer → FallbackLayer` to demonstrate the
composition pattern described above.
- `structured_output_agent` — forces a JSON-schema-shaped final answer via
`AgentInput::output_schema`.
- `tunable_memory_agent` — demonstrates a tuned `SlidingWindowMemory` and a
custom `MemoryProvider` impl; aggressive thresholds make `continue_as_new`
compaction observable on a short conversation.

```bash
# Terminal 1: local Temporal dev server (install via `brew install temporal` or temporal.io)
@@ -190,6 +240,17 @@ cargo run --example interactive_math_agent -- client
# run the client twice with the same prompt to observe the cache layer.
OPENAI_API_KEY=sk-... cargo run --example pipelined_math_agent -- worker
cargo run --example pipelined_math_agent -- client

# Structured output — final answer constrained by a JSON schema.
OPENAI_API_KEY=sk-... cargo run --example structured_output_agent -- worker
cargo run --example structured_output_agent -- client

# Pluggable memory backends — aggressive SlidingWindowMemory so compaction
# fires mid-run. Use the `status` sub-command to watch history.len() and the
# "Prior conversation summary" marker appear in the system prompt.
OPENAI_API_KEY=sk-... cargo run --example tunable_memory_agent -- worker
cargo run --example tunable_memory_agent -- client
cargo run --example tunable_memory_agent -- status
```

The Temporal Web UI is at http://localhost:8233. Click into the workflow to
@@ -319,6 +380,10 @@ When you write tools and provider configs:
- Never call your `LLMProvider` or your `ToolT` from inside workflow code.
The workflow holds tools by name; the only path to invocation is the
`execute_tool` activity.
- `MemoryProvider` impls must be **pure, sync, and stateless** (config
only) — `should_compact` and `compact` run inside the workflow body and
must return identical results on replay for the same `AgentState`. Keep
conversation state in `AgentState`, never on the provider.

## Version compatibility
