diff --git a/.mintignore b/.mintignore new file mode 100644 index 0000000..21d0b89 --- /dev/null +++ b/.mintignore @@ -0,0 +1 @@ +.venv/ diff --git a/concepts/interfaces/Overview.mdx b/concepts/interfaces/Overview.mdx index 89aa3fc..3b60928 100644 --- a/concepts/interfaces/Overview.mdx +++ b/concepts/interfaces/Overview.mdx @@ -34,6 +34,12 @@ Interfaces enable exposing Upsonic agents through various communication protocol + + + Universal email interface that works with any mail provider (Gmail, Outlook, Yahoo, self-hosted, etc.) + + + diff --git a/concepts/interfaces/mail.mdx b/concepts/interfaces/mail.mdx new file mode 100644 index 0000000..5a43a1b --- /dev/null +++ b/concepts/interfaces/mail.mdx @@ -0,0 +1,339 @@ +--- +title: "Mail (SMTP/IMAP)" +sidebarTitle: "Mail" +description: "Host agents as email assistants using any mail provider" +--- + +Use the Mail interface to serve Agents via standard SMTP/IMAP email. It mounts API routes on a FastAPI app and enables automated email processing, replies, and full inbox management. Works with any mail provider (Gmail, Outlook, Yahoo, Zoho, self-hosted, etc.). 
+ +## Installation + +Install the Mail interface dependencies: + +```bash +uv pip install "upsonic[mail-interface]" +``` + +## Setup + +Configure the interface via constructor parameters or environment variables: + +| Env Variable | Description | +| ------------ | ----------- | +| `MAIL_SMTP_HOST` | SMTP server hostname | +| `MAIL_SMTP_PORT` | SMTP server port (default: 587) | +| `MAIL_IMAP_HOST` | IMAP server hostname | +| `MAIL_IMAP_PORT` | IMAP server port (default: 993) | +| `MAIL_USERNAME` | Email account username | +| `MAIL_PASSWORD` | Email account password (or app password) | +| `MAIL_USE_SSL` | Use SSL for SMTP instead of STARTTLS (default: false) | +| `MAIL_API_SECRET` | Secret token for API endpoint protection (optional) | + + + + The interface uses a pull-based model - you trigger email checks via the `/mail/check` endpoint (or use heartbeat auto-poll), and the agent processes unread emails. + + + +### Common Provider Settings + +| Provider | SMTP Host | SMTP Port | IMAP Host | IMAP Port | +| -------- | --------- | --------- | --------- | --------- | +| Gmail | smtp.gmail.com | 587 | imap.gmail.com | 993 | +| Outlook | smtp.office365.com | 587 | outlook.office365.com | 993 | +| Yahoo | smtp.mail.yahoo.com | 587 | imap.mail.yahoo.com | 993 | +| Zoho | smtp.zoho.com | 587 | imap.zoho.com | 993 | + + + + Most providers require an **App Password** (not your regular password) when using SMTP/IMAP. For Gmail, enable 2-Step Verification first, generate an App Password at [myaccount.google.com/apppasswords](https://myaccount.google.com/apppasswords), and enable IMAP in Settings > Forwarding and POP/IMAP. + + + +## Operating Modes + +* **TASK** (default) -- Each email is processed as an independent task; no conversation history. The agent decides whether to reply or ignore. Best for classification, auto-responders, one-off processing. +* **CHAT** -- Emails from the same sender share a conversation session. 
The agent remembers context from previous emails. Best for support threads and ongoing conversations. + +## Reset Command (CHAT mode only) + +In CHAT mode, senders can clear their conversation by sending an email with the reset command in the body (e.g. `/reset`). Configure it with `reset_command`; set to `None` to disable. + +If the agent has a `workspace` configured, the reset command will also trigger a dynamic greeting message based on the workspace configuration. See [Workspace](/concepts/agents/advanced/workspace) for details. + +## Heartbeat (Auto-Poll) + +When used with an `AutonomousAgent` that has `heartbeat=True`, the interface automatically polls the IMAP mailbox for new emails on a configurable interval. No need to manually trigger `/mail/check`. + +```python +from upsonic import AutonomousAgent +from upsonic.interfaces import InterfaceManager, MailInterface + +agent = AutonomousAgent( + model="openai/gpt-4o", + name="EmailBot", + heartbeat=True, + heartbeat_period=5, # Check every 5 minutes + heartbeat_message="Check for new emails and process them.", +) + +mail = MailInterface( + agent=agent, + smtp_host="smtp.gmail.com", + smtp_port=587, + imap_host="imap.gmail.com", + imap_port=993, + username="bot@gmail.com", + password="your-app-password", + mode="task", +) + +manager = InterfaceManager(interfaces=[mail]) + +if __name__ == "__main__": + manager.serve(host="0.0.0.0", port=8000) +``` + +## Access Control (Whitelist) + +Pass `allowed_emails` (list of email addresses). Only emails from those senders are processed; others are silently skipped and marked as read. Omit `allowed_emails` (or set `None`) to allow all senders. + +## Attachments + +The Mail interface fully supports attachments in both directions: + +* **Incoming**: Attachments on received emails are downloaded to temporary files and passed to the agent via `Task(context=...)`. Temp files are cleaned up automatically after processing. 
+* **Outgoing**: The agent can send emails with file attachments using the `send_email_with_attachments` and `send_reply_with_attachments` tools. + +## Multiple Recipients + +The `send_email` tool supports sending to multiple recipients with CC and BCC: + +```python +mail_tools.send_email( + to=["alice@example.com", "bob@example.com"], + subject="Team Update", + body="Hello team!", + cc="manager@example.com", + bcc=["audit@example.com"], +) +``` + +## Event Deduplication + +The interface automatically prevents processing the same email twice within a 5-minute window. This protects against rapid consecutive calls to `/mail/check`. + +## Example Usage + +Create an agent, expose it with the `MailInterface`, and serve via `InterfaceManager`. Example with **TASK** mode, API secret, and whitelist: + +```python +import os +from upsonic import Agent +from upsonic.interfaces import InterfaceManager, MailInterface, InterfaceMode + +agent = Agent( + model="openai/gpt-4o", + name="EmailAssistant", +) + +mail = MailInterface( + agent=agent, + smtp_host="smtp.gmail.com", + smtp_port=587, + imap_host="imap.gmail.com", + imap_port=993, + username="mybot@gmail.com", + password=os.getenv("MAIL_PASSWORD"), + api_secret=os.getenv("MAIL_API_SECRET"), + mode=InterfaceMode.TASK, + reset_command="/reset", + allowed_emails=["trusted@example.com", "support@mycompany.com"], +) + +manager = InterfaceManager(interfaces=[mail]) + +if __name__ == "__main__": + manager.serve(host="0.0.0.0", port=8000, reload=False) +``` + +### CHAT Mode Example + +```python +mail = MailInterface( + agent=agent, + smtp_host="smtp.gmail.com", + smtp_port=587, + imap_host="imap.gmail.com", + imap_port=993, + username="mybot@gmail.com", + password=os.getenv("MAIL_PASSWORD"), + mode=InterfaceMode.CHAT, + reset_command="/reset", +) +``` + +In CHAT mode, each sender gets their own conversation session. The agent remembers previous exchanges. Send `/reset` in an email body to start fresh. 
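The per-sender session and reset behavior described above can be pictured with a small standalone sketch. This is not Upsonic's implementation: `SessionRouter`, `sender_key`, `route`, and the in-memory dict are hypothetical names, and the sketch assumes a reset is triggered only when the body consists solely of the command (the library's actual matching rule may differ).

```python
from email.utils import parseaddr
from typing import Dict, List, Optional


class SessionRouter:
    """Hypothetical illustration of per-sender chat sessions (not Upsonic code)."""

    def __init__(self, reset_command: Optional[str] = "/reset") -> None:
        # reset_command=None disables resets, mirroring the documented option.
        self.reset_command = reset_command
        self.sessions: Dict[str, List[str]] = {}

    @staticmethod
    def sender_key(from_header: str) -> str:
        # Reduce 'Alice <Alice@Example.com>' to 'alice@example.com'.
        _, address = parseaddr(from_header)
        return address.lower()

    def route(self, from_header: str, body: str) -> List[str]:
        """Return the sender's history after handling this message."""
        key = self.sender_key(from_header)
        # Assumption: a body that is exactly the command clears the session.
        if self.reset_command is not None and body.strip() == self.reset_command:
            self.sessions.pop(key, None)
            return []
        history = self.sessions.setdefault(key, [])
        history.append(body)
        return list(history)


router = SessionRouter()
router.route("Alice <alice@example.com>", "My order is late")
history = router.route("alice@example.com", "It was order 1234")
```

Both messages land in the same history because the key is the normalized address rather than the full `From` header; a different sender starts with an empty history.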
+ +## Core Components + +* `MailInterface` (interface): Wraps an Upsonic `Agent` for SMTP/IMAP email via FastAPI. + +* `MailTools` (toolkit): Provides 17 agent-facing tools for email operations (send, receive, search, flag, delete, move, attachments). + +* `InterfaceManager.serve`: Serves the FastAPI app using Uvicorn. + +## `MailInterface` Interface + +Main entry point for Upsonic Mail applications. + +### Initialization Parameters + +| Parameter | Type | Default | Description | +| --------- | ---- | ------- | ----------- | +| `agent` | `Agent` | Required | Upsonic `Agent` instance. | +| `name` | `str` | `"Mail"` | Interface name. | +| `smtp_host` | `Optional[str]` | `None` | SMTP server hostname (or `MAIL_SMTP_HOST`). | +| `smtp_port` | `Optional[int]` | `None` | SMTP server port (or `MAIL_SMTP_PORT`, default: 587). | +| `imap_host` | `Optional[str]` | `None` | IMAP server hostname (or `MAIL_IMAP_HOST`). | +| `imap_port` | `Optional[int]` | `None` | IMAP server port (or `MAIL_IMAP_PORT`, default: 993). | +| `username` | `Optional[str]` | `None` | Email account username (or `MAIL_USERNAME`). | +| `password` | `Optional[str]` | `None` | Email account password (or `MAIL_PASSWORD`). | +| `use_ssl` | `bool` | `False` | Use SSL for SMTP instead of STARTTLS. | +| `from_address` | `Optional[str]` | `None` | Sender address (defaults to `username`). | +| `api_secret` | `Optional[str]` | `None` | Secret token for API authentication (or `MAIL_API_SECRET`). | +| `mode` | `Union[InterfaceMode, str]` | `InterfaceMode.TASK` | `TASK` or `CHAT`. | +| `reset_command` | `Optional[str]` | `"/reset"` | Text in email body that resets chat session (CHAT mode). Set `None` to disable. | +| `storage` | `Optional[Storage]` | `None` | Storage backend for chat sessions (CHAT mode). | +| `allowed_emails` | `Optional[List[str]]` | `None` | Whitelist of sender emails; only these are processed. `None` = allow all. | +| `mailbox` | `str` | `"INBOX"` | IMAP mailbox/folder to poll. 
| + +### Key Methods + +| Method | Parameters | Return Type | Description | +| ------ | ---------- | ----------- | ----------- | +| `attach_routes` | None | `APIRouter` | Returns the FastAPI router and attaches all endpoints. | +| `check_and_process_emails` | `count: int = 10` | `CheckEmailsResponse` | Trigger email check and processing. | +| `is_email_allowed` | `sender: str` | `bool` | Whether the sender is in the whitelist (or whitelist disabled). | +| `health_check` | None | `Dict` | Returns interface health status with IMAP connectivity check. | + +## Endpoints + +Mounted under the `/mail` prefix. All endpoints require the `X-Upsonic-Mail-Secret` header if `api_secret` is configured. + +### `POST /mail/check` + +* Triggers a check for unread emails and processes them through the agent. + +* Query parameter: `count` (default: 10) - maximum number of emails to process. + +* In TASK mode: agent decides to reply or ignore each email. + +* In CHAT mode: emails are routed to per-sender conversation sessions. + +* Returns: `200 CheckEmailsResponse` with `status`, `processed_count`, and `email_uids`. + +```bash +curl -X POST http://localhost:8000/mail/check \ + -H "X-Upsonic-Mail-Secret: your-secret" +``` + +### `GET /mail/inbox` + +* Lists the most recent emails (read and unread). + +* Query parameters: `count` (default: 20, max: 100), `mailbox` (default: INBOX). + +* Returns: `200 EmailListResponse` with `count` and `emails` array. + +### `GET /mail/unread` + +* Lists unread emails only. + +* Query parameters: `count` (default: 20, max: 100), `mailbox` (default: INBOX). + +* Returns: `200 EmailListResponse` with `count` and `emails` array. + +### `POST /mail/send` + +* Sends a new email. + +* Request body: `to` (string or array), `subject`, `body`, `cc` (optional), `bcc` (optional), `html` (optional boolean). + +* Supports single and multiple recipients. + +* Returns: `200 {"status": "success", "message": "..."}`. 
+ +```bash +curl -X POST http://localhost:8000/mail/send \ + -H "X-Upsonic-Mail-Secret: your-secret" \ + -H "Content-Type: application/json" \ + -d '{ + "to": ["alice@example.com", "bob@example.com"], + "subject": "Hello", + "body": "Hello from Upsonic!", + "cc": "manager@example.com" + }' +``` + +### `POST /mail/search` + +* Searches emails using IMAP search criteria. + +* Request body: `query` (IMAP search string), `count` (default: 10), `mailbox` (default: INBOX). + +* Returns: `200 EmailListResponse` with matching emails. + +```bash +curl -X POST http://localhost:8000/mail/search \ + -H "X-Upsonic-Mail-Secret: your-secret" \ + -H "Content-Type: application/json" \ + -d '{"query": "FROM \"user@example.com\"", "count": 5}' +``` + +### `GET /mail/folders` + +* Lists all available mailboxes/folders on the IMAP server. + +* Returns: `200 {"status": "success", "folders": [...]}`. + +### `GET /mail/status` + +* Gets the status of a mailbox (total, unseen, recent message counts). + +* Query parameter: `mailbox` (default: INBOX). + +* Returns: `200 MailboxStatusResponse` with `mailbox`, `total`, `unseen`, `recent`. + +### `POST /mail/{uid}/read` + +* Marks an email as read by its UID. + +* Query parameter: `mailbox` (default: INBOX). + +* Returns: `200 {"status": "success", "uid": "...", "action": "marked_read"}`. + +### `POST /mail/{uid}/unread` + +* Marks an email as unread by its UID. + +* Returns: `200 {"status": "success", "uid": "...", "action": "marked_unread"}`. + +### `POST /mail/{uid}/delete` + +* Deletes an email by its UID. + +* Returns: `200 {"status": "success", "uid": "...", "action": "deleted"}`. + +### `POST /mail/{uid}/move` + +* Moves an email to a different mailbox/folder. + +* Query parameters: `destination` (required), `source` (default: INBOX). + +* Returns: `200 {"status": "success", "uid": "...", "action": "moved", "destination": "..."}`. + +### `GET /mail/health` + +* Returns the health status of the interface, including an IMAP connectivity check. 
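The endpoints above can also be called from Python. The following is a minimal sketch using only the standard library; `build_check_request` is a hypothetical helper name, and the base URL and secret are the same placeholders used in the curl examples. It builds the `POST /mail/check` request with the `count` query parameter and the `X-Upsonic-Mail-Secret` header, and leaves sending it as a commented-out step.

```python
import urllib.parse
import urllib.request


def build_check_request(base_url: str, secret: str, count: int = 10) -> urllib.request.Request:
    """Build (but do not send) a POST request for the /mail/check endpoint."""
    query = urllib.parse.urlencode({"count": count})
    url = f"{base_url.rstrip('/')}/mail/check?{query}"
    request = urllib.request.Request(url, method="POST")
    # Header name taken from the endpoint description above.
    request.add_header("X-Upsonic-Mail-Secret", secret)
    return request


request = build_check_request("http://localhost:8000", "your-secret", count=5)

# To send it against a running server (placeholder host and secret):
# with urllib.request.urlopen(request) as response:
#     print(response.read().decode())
```

Keeping request construction separate from sending makes it easy to check the URL and headers before pointing the script at a real server.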
+ diff --git a/concepts/knowledgebase/query-control.mdx b/concepts/knowledgebase/query-control.mdx new file mode 100644 index 0000000..2ae4ee0 --- /dev/null +++ b/concepts/knowledgebase/query-control.mdx @@ -0,0 +1,133 @@ +--- +title: "Query Control" +description: "Control whether KnowledgeBase context is injected into the agent via query_knowledge_base" +--- + +## Overview + +By default, when you pass a `KnowledgeBase` as `context` to a `Task`, the knowledge base is **set up** (documents are indexed) **and queried** automatically (`query_knowledge_base=True`). To disable automatic RAG retrieval, set `query_knowledge_base=False` on the Task. + +This gives you fine-grained control over when the agent receives knowledge base context, letting you: +- Disable RAG retrieval for specific tasks that don't need it +- Index documents once and query selectively across different tasks +- Mix knowledge-base-aware and knowledge-base-free tasks using the same agent + +## Usage + +### Default Behavior (Query Enabled) + +By default, `query_knowledge_base=True` retrieves relevant chunks and injects them into the agent's context: + +```python +from upsonic import Agent, Task, KnowledgeBase +from upsonic.embeddings import OpenAIEmbedding, OpenAIEmbeddingConfig +from upsonic.vectordb import ChromaProvider, ChromaConfig, ConnectionConfig, Mode +from upsonic.loaders.pdf import PdfLoader +from upsonic.loaders.config import PdfLoaderConfig + +embedding = OpenAIEmbedding(OpenAIEmbeddingConfig()) +vectordb = ChromaProvider(ChromaConfig( + collection_name="my_kb", + vector_size=1536, + connection=ConnectionConfig(mode=Mode.IN_MEMORY) +)) +loader = PdfLoader(PdfLoaderConfig()) + +kb = KnowledgeBase( + sources=["company_handbook.pdf"], + embedding_provider=embedding, + vectordb=vectordb, + loaders=[loader] +) + +agent = Agent("anthropic/claude-sonnet-4-5") + +# RAG context IS injected — agent sees relevant document chunks +task = Task( + description="What is the vacation policy?", + 
context=[kb], + query_knowledge_base=True +) + +result = agent.do(task) +print(result) +``` + +### Disabling Knowledge Base Query + +Set `query_knowledge_base=False` to skip querying. The knowledge base is still set up but the agent will not receive any RAG context: + +```python +# RAG context is NOT injected — KB is indexed but not queried +task = Task( + description="Write a creative story about space travel", + context=[kb], + query_knowledge_base=False +) + +result = agent.do(task) +print(result) +``` + +## Selective Querying Across Tasks + +A common pattern is to set up one knowledge base and use it across multiple tasks, querying it only when needed: + +```python +from upsonic import Agent, Task, KnowledgeBase +from upsonic.embeddings import OpenAIEmbedding, OpenAIEmbeddingConfig +from upsonic.vectordb import ChromaProvider, ChromaConfig, ConnectionConfig, Mode +from upsonic.loaders.pdf import PdfLoader +from upsonic.loaders.config import PdfLoaderConfig + +embedding = OpenAIEmbedding(OpenAIEmbeddingConfig()) +vectordb = ChromaProvider(ChromaConfig( + collection_name="docs_kb", + vector_size=1536, + connection=ConnectionConfig(mode=Mode.EMBEDDED, db_path="./kb_db") +)) +loader = PdfLoader(PdfLoaderConfig()) + +kb = KnowledgeBase( + sources=["product_docs/"], + embedding_provider=embedding, + vectordb=vectordb, + loaders=[loader] +) + +agent = Agent("anthropic/claude-sonnet-4-5") + +# Task 1: Needs KB context +task_with_rag = Task( + description="Summarize the product specifications", + context=[kb], + query_knowledge_base=True +) + +# Task 2: Does NOT need KB context +task_without_rag = Task( + description="Draft a marketing tagline for our product", + context=[kb], + query_knowledge_base=False +) + +result1 = agent.do(task_with_rag) # Uses RAG context +result2 = agent.do(task_without_rag) # No RAG context +``` + +## Combining with Vector Search Parameters + +When `query_knowledge_base=True`, you can fine-tune the retrieval using vector search parameters on 
the same Task: + +```python +task = Task( + description="What are the compliance requirements?", + context=[kb], + query_knowledge_base=True, + vector_search_top_k=10, + vector_search_similarity_threshold=0.75 +) +``` + +See the [Advanced](/concepts/knowledgebase/advanced) page for more details on vector search parameters. + diff --git a/concepts/memory/attributes.mdx b/concepts/memory/attributes.mdx index 95b2828..d158cf2 100644 --- a/concepts/memory/attributes.mdx +++ b/concepts/memory/attributes.mdx @@ -5,14 +5,33 @@ description: "Configuration options for the Memory class" ## Memory Class Parameters +### Save Flags + +Control what is persisted to storage after each run. + +| Parameter | Type | Default | Description | +|-----------|------|---------|-------------| +| `full_session_memory` | `bool` | `False` | Persist complete chat history to storage | +| `summary_memory` | `bool` | `False` | Generate and persist session summaries | +| `user_analysis_memory` | `bool` | `False` | Analyze and persist user trait profiles | + +### Load Flags + +Control what is injected into subsequent runs as context. Each defaults to its corresponding save flag. 
+ +| Parameter | Type | Default | Description | +|-----------|------|---------|-------------| +| `load_full_session_memory` | `bool \| None` | `None` | Inject chat history into runs (defaults to `full_session_memory`) | +| `load_summary_memory` | `bool \| None` | `None` | Inject session summary into runs (defaults to `summary_memory`) | +| `load_user_analysis_memory` | `bool \| None` | `None` | Inject user profile into runs (defaults to `user_analysis_memory`) | + +### General Parameters + | Parameter | Type | Default | Description | |-----------|------|---------|-------------| | `storage` | `Storage` | (required) | Storage backend for persistence | | `session_id` | `str \| None` | `None` | Session identifier (auto-generated if not provided) | | `user_id` | `str \| None` | `None` | User identifier (auto-generated if not provided) | -| `full_session_memory` | `bool` | `False` | Enable complete chat history persistence | -| `summary_memory` | `bool` | `False` | Enable automatic session summaries | -| `user_analysis_memory` | `bool` | `False` | Enable user trait extraction | | `num_last_messages` | `int \| None` | `None` | Limit history to last N message turns | | `model` | `str \| Model \| None` | `None` | Model for summaries/user analysis | | `user_profile_schema` | `BaseModel \| None` | `None` | Custom Pydantic model for user profiles | @@ -41,12 +60,70 @@ memory = Memory( model="anthropic/claude-sonnet-4-5" ) -agent = Agent("anthropic/claude-sonnet-4-6", memory=memory) +agent = Agent("anthropic/claude-sonnet-4-5", memory=memory) result = agent.do(Task("Hello! I'm learning Python")) print(result) ``` +## Save/Load Separation + +Save everything to storage but only inject summaries and user profiles into subsequent runs. +This reduces token usage while preserving full history for auditing or debugging. 
+ +```python +from upsonic import Agent, Task +from upsonic.storage.memory import Memory +from upsonic.storage.sqlite import SqliteStorage + +storage = SqliteStorage(db_file="efficient.db") + +memory = Memory( + storage=storage, + session_id="session_001", + user_id="user_123", + full_session_memory=True, # Save raw chat history + summary_memory=True, # Save summaries + user_analysis_memory=True, # Save user profiles + load_full_session_memory=False, # Don't inject raw history into context + load_summary_memory=True, # Inject summary instead + load_user_analysis_memory=True, # Inject user profile + model="anthropic/claude-sonnet-4-5" +) + +agent = Agent("anthropic/claude-sonnet-4-5", memory=memory) + +result1 = agent.do(Task("My name is Alice, I work on ML pipelines")) +result2 = agent.do(Task("What do you know about me?")) +print(result2) # Recalls via summary + user profile, not raw history +``` + +## Summary-Only Mode + +Use summaries without persisting full chat history: + +```python +from upsonic import Agent, Task +from upsonic.storage.memory import Memory +from upsonic.storage.sqlite import SqliteStorage + +storage = SqliteStorage(db_file="summary_only.db") + +memory = Memory( + storage=storage, + session_id="session_001", + full_session_memory=False, # No raw history saved + summary_memory=True, # Only summaries are saved and injected + model="anthropic/claude-sonnet-4-5" +) + +agent = Agent("anthropic/claude-sonnet-4-5", memory=memory) + +result1 = agent.do(Task("The project deadline is next Friday")) +result2 = agent.do(Task("When is the deadline?")) +print(result2) # Recalls via summary +``` + ## Message Limiting Control memory size by limiting message history: @@ -99,7 +176,7 @@ memory = Memory( model="anthropic/claude-sonnet-4-5" ) -agent = Agent("anthropic/claude-sonnet-4-6", memory=memory) +agent = Agent("anthropic/claude-sonnet-4-5", memory=memory) result = agent.do(Task("I'm John from Acme Corp, working as a data engineer")) print(result) @@ -121,7 
+198,6 @@ from upsonic.storage.sqlite import SqliteStorage storage = SqliteStorage(db_file="profiles.db") -# Update mode - accumulates traits over time memory = Memory( storage=storage, session_id="session_001", @@ -131,7 +207,7 @@ memory = Memory( model="anthropic/claude-sonnet-4-5" ) -agent = Agent("anthropic/claude-sonnet-4-6", memory=memory) +agent = Agent("anthropic/claude-sonnet-4-5", memory=memory) result = agent.do(Task("I prefer dark mode and use vim")) print(result) diff --git a/concepts/memory/choosing-right-memory-types.mdx b/concepts/memory/choosing-right-memory-types.mdx index 34f3250..545f8de 100644 --- a/concepts/memory/choosing-right-memory-types.mdx +++ b/concepts/memory/choosing-right-memory-types.mdx @@ -8,7 +8,7 @@ description: "Select the appropriate memory types for your use case" | Memory Type | Purpose | When to Use | |-------------|---------|-------------| | **Conversation Memory** | Full chat history | Multi-turn conversations, detailed context | -| **Summary Memory** | Condensed summaries | Long sessions | +| **Summary Memory** | Condensed summaries | Long sessions, cost-efficient recall | | **User Analysis Memory** | User profiles | Personalization, cross-session learning | ## Decision Guide @@ -37,9 +37,38 @@ result2 = agent.do(Task("I ordered it last week")) print(result2) # Agent remembers order context ``` -### Summary Memory +### Summary Memory Only -Best for: Long sessions, cost-conscious, overview needed +Best for: Long sessions where raw history is unnecessary, cost-conscious + +Summary memory works independently — no need to enable full session memory. +The agent recalls key facts through a generated summary instead of raw messages. 
+ +```python +from upsonic import Agent, Task +from upsonic.storage.memory import Memory +from upsonic.storage.sqlite import SqliteStorage + +storage = SqliteStorage(db_file="notes.db") + +memory = Memory( + storage=storage, + session_id="notes_001", + full_session_memory=False, # No raw history + summary_memory=True, # Summary generated and injected + model="anthropic/claude-sonnet-4-5" +) + +agent = Agent("anthropic/claude-sonnet-4-5", memory=memory) + +result1 = agent.do(Task("The project deadline is next Friday")) +result2 = agent.do(Task("When is the deadline?")) +print(result2) # Recalls via summary +``` + +### Conversation + Summary + +Best for: Long conversations with detailed context needed ```python from upsonic import Agent, Task @@ -51,12 +80,12 @@ storage = SqliteStorage(db_file="meetings.db") memory = Memory( storage=storage, session_id="meeting_001", - full_session_memory=True + full_session_memory=True, summary_memory=True, - model="anthropic/claude-sonnet-4-5" # Required for summaries + model="anthropic/claude-sonnet-4-5" ) -agent = Agent("anthropic/claude-sonnet-4-6", memory=memory) +agent = Agent("anthropic/claude-sonnet-4-5", memory=memory) result1 = agent.do(Task("Let's discuss Q1 revenue targets")) result2 = agent.do(Task("Now let's cover hiring plans")) @@ -83,15 +112,15 @@ memory = Memory( model="anthropic/claude-sonnet-4-5" ) -agent = Agent("anthropic/claude-sonnet-4-6", memory=memory) +agent = Agent("anthropic/claude-sonnet-4-5", memory=memory) result = agent.do(Task("I'm a backend developer who prefers Python")) print(result) # Agent learns about user ``` -### Conversation + Summary +### Conversation + Summary (Save-Only History) -Best for: Long conversations with detailed context needed +Best for: Long conversations where you want full history in storage for auditing but only inject summaries to save tokens ```python from upsonic import Agent, Task @@ -103,13 +132,15 @@ storage = SqliteStorage(db_file="tutoring.db") memory = Memory( 
storage=storage, session_id="lesson_001", - full_session_memory=True, - summary_memory=True, - num_last_messages=20, # Keep last 20 turns + summary + full_session_memory=True, # Save full history + summary_memory=True, # Save summaries + load_full_session_memory=False, # Don't inject raw history + load_summary_memory=True, # Inject summary only + num_last_messages=20, model="anthropic/claude-sonnet-4-5" ) -agent = Agent("anthropic/claude-sonnet-4-6", memory=memory) +agent = Agent("anthropic/claude-sonnet-4-5", memory=memory) result = agent.do(Task("Let's continue learning Python functions")) print(result) @@ -137,13 +168,43 @@ memory = Memory( model="anthropic/claude-sonnet-4-5" ) -agent = Agent("anthropic/claude-sonnet-4-6", memory=memory) +agent = Agent("anthropic/claude-sonnet-4-5", memory=memory) result1 = agent.do(Task("Hi! I'm a data scientist learning about LLMs")) result2 = agent.do(Task("What recommendations do you have for me?")) print(result2) # Personalized based on user profile ``` +### Token-Efficient Full Setup + +Best for: Production systems that need full data persistence but minimal token usage + +```python +from upsonic import Agent, Task +from upsonic.storage.memory import Memory +from upsonic.storage.sqlite import SqliteStorage + +storage = SqliteStorage(db_file="production.db") + +memory = Memory( + storage=storage, + session_id="session_001", + user_id="user_123", + full_session_memory=True, # Save everything + summary_memory=True, + user_analysis_memory=True, + load_full_session_memory=False, # Don't inject raw history + load_summary_memory=True, # Inject summary instead + load_user_analysis_memory=True, # Inject user profile + model="anthropic/claude-sonnet-4-5" +) + +agent = Agent("anthropic/claude-sonnet-4-5", memory=memory) + +result = agent.do(Task("Continue where we left off")) +print(result) +``` + ## Use Case Examples | Use Case | Recommended Configuration | @@ -152,5 +213,6 @@ print(result2) # Personalized based on user profile | 
Meeting Notes | `summary_memory=True` | | Personal Assistant | All three memory types | | Quick Q&A | `full_session_memory=True` only | -| Learning Platform | `full_session_memory=True`, `summary_memory=True`, `user_analysis_memory=True` | +| Learning Platform | All three memory types | | Code Assistant | `full_session_memory=True`, `feed_tool_call_results=True` | +| High-Volume Production | All three save flags `True`, `load_full_session_memory=False`, `load_summary_memory=True` | diff --git a/concepts/memory/examples/basic-memory-example.mdx b/concepts/memory/examples/basic-memory-example.mdx index f1ea4da..083ca12 100644 --- a/concepts/memory/examples/basic-memory-example.mdx +++ b/concepts/memory/examples/basic-memory-example.mdx @@ -61,7 +61,6 @@ from upsonic import Agent, Task from upsonic.storage.memory import Memory from upsonic.storage.sqlite import SqliteStorage -# Same storage instance or connection to same db file storage = SqliteStorage(db_file="./support.db") # New session, same customer @@ -76,11 +75,43 @@ memory = Memory( agent = Agent("anthropic/claude-sonnet-4-5", memory=memory) -# Agent knows customer from previous sessions +# Agent knows customer from previous sessions via user profile result = agent.do(Task("Hello, I'm back. 
What did we discuss before?")) print(result) ``` +## Token-Efficient Mode + +Save everything but only inject summaries and user profiles to reduce token usage: + +```python +from upsonic import Agent, Task +from upsonic.storage.memory import Memory +from upsonic.storage.sqlite import SqliteStorage + +storage = SqliteStorage(db_file="./efficient_support.db") + +memory = Memory( + storage=storage, + session_id="support_003", + user_id="customer_123", + full_session_memory=True, # Save raw history + summary_memory=True, # Save summaries + user_analysis_memory=True, # Save user profiles + load_full_session_memory=False, # Don't inject raw history + load_summary_memory=True, # Inject summary only + load_user_analysis_memory=True, # Inject user profile + model="anthropic/claude-sonnet-4-5" +) + +agent = Agent("anthropic/claude-sonnet-4-5", memory=memory) + +result1 = agent.do(Task("I need help with billing")) +result2 = agent.do(Task("The charge was for $49.99 on March 15")) +result3 = agent.do(Task("What was the amount I mentioned?")) +print(result3) # Recalls via summary, full history saved in storage +``` + ## Async Usage ```python @@ -91,7 +122,7 @@ from upsonic.storage.sqlite import SqliteStorage async def main(): storage = SqliteStorage(db_file="./async_support.db") - + memory = Memory( storage=storage, session_id="async_001", @@ -100,13 +131,12 @@ async def main(): summary_memory=True, model="anthropic/claude-sonnet-4-5" ) - + agent = Agent("anthropic/claude-sonnet-4-5", memory=memory) - - # Async execution + result1 = await agent.do_async(Task("My order hasn't arrived")) result2 = await agent.do_async(Task("It's been 5 days")) - + print("Final response:", result2) asyncio.run(main()) @@ -133,7 +163,8 @@ asyncio.run(main()) ## Key Takeaways -1. **Same storage, different sessions** - User profile persists -2. **Memory is automatic** - Just attach to agent, no manual saving +1. **Same storage, different sessions** - User profile persists across sessions +2. 
**Memory is automatic** - Just attach to agent, no manual saving needed 3. **Summary + History** - Use both for best context/cost balance -4. **Sync and async** - Both `do()` and `do_async()` work with memory +4. **Save/Load separation** - Save everything, inject only what's needed +5. **Sync and async** - Both `do()` and `do_async()` work with memory diff --git a/concepts/memory/memory-types/conversation-memory.mdx b/concepts/memory/memory-types/conversation-memory.mdx index 6238d73..1c83bdf 100644 --- a/concepts/memory/memory-types/conversation-memory.mdx +++ b/concepts/memory/memory-types/conversation-memory.mdx @@ -7,6 +7,15 @@ description: "Store complete chat history for maintaining context" Conversation Memory persists the complete chat history for a session, enabling agents to reference previous messages and maintain context across interactions. +## Save vs Load + +| Flag | Purpose | +|------|---------| +| `full_session_memory` | **Save**: Persist raw messages to storage | +| `load_full_session_memory` | **Load**: Inject message history into subsequent runs (defaults to `full_session_memory`) | + +You can save history without injecting it — useful when pairing with summary memory to reduce token usage while keeping a full audit trail. 
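The defaulting rule in the table can be written out explicitly. The helper below is a hypothetical illustration, not part of Upsonic's API: `resolve_load_flag` only encodes the documented rule that a load flag left as `None` follows its save flag, while an explicit value takes precedence.

```python
from typing import Optional


def resolve_load_flag(save_flag: bool, load_flag: Optional[bool] = None) -> bool:
    """Hypothetical helper: an unset load flag inherits the save flag."""
    return save_flag if load_flag is None else load_flag


# Save history and inject it (load flag left unset).
default_case = resolve_load_flag(save_flag=True)

# Save history for auditing but keep it out of the agent's context.
save_only_case = resolve_load_flag(save_flag=True, load_flag=False)
```

The sketch makes no claim about how the library treats a load flag set to `True` while the matching save flag is `False`; that combination is not described in the table.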
+ ## Basic Usage ```python @@ -29,6 +38,33 @@ result2 = agent.do(Task("What's my name?")) print(result2) # "Your name is Alice" ``` +## Save-Only Mode + +Save full history for auditing but don't inject it into context: + +```python +from upsonic import Agent, Task +from upsonic.storage.memory import Memory +from upsonic.storage.sqlite import SqliteStorage + +storage = SqliteStorage(db_file="audit.db") + +memory = Memory( + storage=storage, + session_id="session_001", + full_session_memory=True, # Save raw history + load_full_session_memory=False, # Don't inject into context + summary_memory=True, # Use summary for context instead + load_summary_memory=True, + model="anthropic/claude-sonnet-4-5" +) + +agent = Agent("anthropic/claude-sonnet-4-5", memory=memory) + +result = agent.do(Task("Continue our discussion")) +print(result) +``` + ## With Message Limiting Control memory size by limiting to the last N conversation turns: @@ -88,7 +124,8 @@ print(result2) # Can reference previous tool results | Parameter | Type | Default | Description | |-----------|------|---------|-------------| -| `full_session_memory` | `bool` | `False` | Enable conversation history | +| `full_session_memory` | `bool` | `False` | Save conversation history | +| `load_full_session_memory` | `bool \| None` | `None` | Inject history into runs (defaults to `full_session_memory`) | | `session_id` | `str` | auto-generated | Session identifier | | `num_last_messages` | `int \| None` | `None` | Limit to last N turns | | `feed_tool_call_results` | `bool` | `False` | Include tool outputs | diff --git a/concepts/memory/memory-types/focus-memory.mdx b/concepts/memory/memory-types/focus-memory.mdx index 0227300..3e10eaa 100644 --- a/concepts/memory/memory-types/focus-memory.mdx +++ b/concepts/memory/memory-types/focus-memory.mdx @@ -7,6 +7,13 @@ description: "Learn about users and build comprehensive profiles" User Analysis Memory extracts user traits from conversations and builds persistent profiles. 
This enables personalization across sessions and adaptive agent behavior. +## Save vs Load + +| Flag | Purpose | +|------|---------| +| `user_analysis_memory` | **Save**: Analyze conversations and persist user profiles | +| `load_user_analysis_memory` | **Load**: Inject user profile into subsequent runs (defaults to `user_analysis_memory`) | + ## Basic Usage ```python @@ -21,7 +28,7 @@ memory = Memory( session_id="session_001", user_id="user_abc", user_analysis_memory=True, - model="anthropic/claude-sonnet-4-5" # Required for trait extraction + model="anthropic/claude-sonnet-4-5" ) agent = Agent("anthropic/claude-sonnet-4-5", memory=memory) @@ -31,6 +38,34 @@ result2 = agent.do(Task("What do you know about me?")) print(result2) # "You're a data scientist with 5 years of ML experience" ``` +## Standalone Usage + +User analysis memory works without full session memory. The agent saves and loads +user profiles independently: + +```python +from upsonic import Agent, Task +from upsonic.storage.memory import Memory +from upsonic.storage.sqlite import SqliteStorage + +storage = SqliteStorage(db_file="profiles.db") + +memory = Memory( + storage=storage, + session_id="session_001", + user_id="user_abc", + full_session_memory=False, # No raw history + user_analysis_memory=True, # Save user profile + model="anthropic/claude-sonnet-4-5" +) + +agent = Agent("anthropic/claude-sonnet-4-5", memory=memory) + +result1 = agent.do(Task("I'm Alice, a Python developer who prefers dark mode")) +result2 = agent.do(Task("What do you know about me?")) +print(result2) # Recalls user profile without raw history +``` + ## Custom Profile Schema Define specific fields to track: @@ -126,6 +161,31 @@ result = agent2.do(Task("Remind me what you know about me")) print(result) # "You're John, you run a bakery in Seattle" ``` +## Save-Only Mode + +Save profiles for analytics without injecting them into the agent's context: + +```python +from upsonic import Agent, Task +from upsonic.storage.memory 
import Memory +from upsonic.storage.sqlite import SqliteStorage + +storage = SqliteStorage(db_file="analytics.db") + +memory = Memory( + storage=storage, + session_id="session_001", + user_id="user_789", + user_analysis_memory=True, # Save profiles + load_user_analysis_memory=False, # Don't inject into context + model="anthropic/claude-sonnet-4-5" +) + +agent = Agent("anthropic/claude-sonnet-4-5", memory=memory) +result = agent.do(Task("I also speak Spanish and French")) +print(result) +``` + ## Update Modes | Mode | Behavior | @@ -140,7 +200,6 @@ from upsonic.storage.sqlite import SqliteStorage storage = SqliteStorage(db_file="modes.db") -# Update mode: accumulates traits memory = Memory( storage=storage, session_id="session_001", @@ -159,7 +218,8 @@ print(result) | Parameter | Type | Default | Description | |-----------|------|---------|-------------| -| `user_analysis_memory` | `bool` | `False` | Enable user profiling | +| `user_analysis_memory` | `bool` | `False` | Save user profiles | +| `load_user_analysis_memory` | `bool \| None` | `None` | Inject profiles into runs (defaults to `user_analysis_memory`) | | `user_id` | `str` | auto-generated | User identifier | | `model` | `str \| Model` | (required) | Model for trait extraction | | `user_profile_schema` | `BaseModel \| None` | `None` | Custom profile schema | diff --git a/concepts/memory/memory-types/summary-memory.mdx b/concepts/memory/memory-types/summary-memory.mdx index a3a0385..7afc113 100644 --- a/concepts/memory/memory-types/summary-memory.mdx +++ b/concepts/memory/memory-types/summary-memory.mdx @@ -5,9 +5,19 @@ description: "Maintain evolving conversation summaries for cost efficiency" ## Overview -Summary Memory generates and maintains an evolving summary of key conversation points. +Summary Memory generates and maintains an evolving summary of key conversation points. It works both alongside conversation memory and independently. 
-## Basic Usage +## Save vs Load + +| Flag | Purpose | +|------|---------| +| `summary_memory` | **Save**: Generate and persist session summaries | +| `load_summary_memory` | **Load**: Inject summary into subsequent runs (defaults to `summary_memory`) | + +## Standalone Usage + +Summary memory works without full session memory. The agent recalls context through +generated summaries instead of raw message history: ```python from upsonic import Agent, Task @@ -19,17 +29,16 @@ storage = SqliteStorage(db_file="summaries.db") memory = Memory( storage=storage, session_id="session_001", - full_session_memory=True - summary_memory=True, - model="anthropic/claude-sonnet-4-5" # Required for summary generation + full_session_memory=False, # No raw history + summary_memory=True, # Summary generated after each run + model="anthropic/claude-sonnet-4-5" ) -agent = Agent("anthropic/claude-sonnet-4-6", memory=memory) +agent = Agent("anthropic/claude-sonnet-4-5", memory=memory) -result1 = agent.do(Task("Let's discuss Python web frameworks")) -result2 = agent.do(Task("How does Django compare to Flask?")) -result3 = agent.do(Task("What have we discussed so far?")) -print(result3) # Uses summary for context +result1 = agent.do(Task("My name is Alice and I work on project Falcon")) +result2 = agent.do(Task("What project am I working on?")) +print(result2) # Recalls via summary ``` ## Combined with Conversation Memory @@ -52,23 +61,53 @@ memory = Memory( model="anthropic/claude-sonnet-4-5" ) -agent = Agent("anthropic/claude-sonnet-4-6", memory=memory) +agent = Agent("anthropic/claude-sonnet-4-5", memory=memory) result = agent.do(Task("Continue our database optimization discussion")) print(result) ``` +## Save History, Load Summary Only + +Save full history for auditing but only inject the summary to reduce token usage: + +```python +from upsonic import Agent, Task +from upsonic.storage.memory import Memory +from upsonic.storage.sqlite import SqliteStorage + +storage = 
SqliteStorage(db_file="efficient.db") + +memory = Memory( + storage=storage, + session_id="session_001", + full_session_memory=True, # Save raw history + summary_memory=True, # Save summaries + load_full_session_memory=False, # Don't inject raw history + load_summary_memory=True, # Inject summary only + model="anthropic/claude-sonnet-4-5" +) + +agent = Agent("anthropic/claude-sonnet-4-5", memory=memory) + +result1 = agent.do(Task("Let's discuss Python web frameworks")) +result2 = agent.do(Task("How does Django compare to Flask?")) +result3 = agent.do(Task("What have we discussed so far?")) +print(result3) # Uses summary for context, full history saved in storage +``` + ## How It Works -1. After each completed run, the session summary is updated +1. After each completed run, the session summary is updated by a sub-agent 2. Summary includes key points, user preferences, and topics discussed -3. Summary is injected as context for subsequent interactions -4. Older message history can be replaced by summary to reduce costs +3. Summary is injected as context for subsequent interactions (when `load_summary_memory` is enabled) +4. 
Works independently of `full_session_memory` — can be the sole source of recall ## Parameters | Parameter | Type | Default | Description | |-----------|------|---------|-------------| -| `summary_memory` | `bool` | `False` | Enable summary generation | +| `summary_memory` | `bool` | `False` | Save and generate summaries | +| `load_summary_memory` | `bool \| None` | `None` | Inject summary into runs (defaults to `summary_memory`) | | `session_id` | `str` | auto-generated | Session identifier | | `model` | `str \| Model` | (required) | Model for generating summaries | diff --git a/concepts/memory/overview.mdx b/concepts/memory/overview.mdx index 0ea0302..4ad9287 100644 --- a/concepts/memory/overview.mdx +++ b/concepts/memory/overview.mdx @@ -9,6 +9,8 @@ hideToc: true Memory enables agents to remember conversations and learn about users across sessions. It maintains chat history, generates summaries, and builds user profiles for personalized interactions. +Memory separates **saving** from **loading**: save flags control what is persisted to storage, while load flags control what is injected into subsequent runs. This allows fine-grained control — for example, saving full chat history while only injecting summaries to reduce token usage. + ## Installation Memory requires a storage backend to persist data. Choose the storage option that fits your deployment needs. @@ -36,6 +38,7 @@ Memory requires a storage backend to persist data. Choose the storage option tha - **Conversation History**: Persist complete chat history across sessions - **Session Summaries**: Auto-generate condensed conversation summaries - **User Profiles**: Extract and learn user traits from interactions +- **Save/Load Separation**: Independently control what is saved vs. 
what is injected into context - **Multiple Storage Backends**: SQLite, Redis, PostgreSQL, MongoDB, or in-memory - **Sync & Async Support**: Both synchronous and asynchronous storage operations - **HITL Checkpointing**: Automatic checkpoint saving for Human-in-the-Loop resumption @@ -47,31 +50,51 @@ from upsonic import Agent, Task from upsonic.storage.memory import Memory from upsonic.storage.sqlite import SqliteStorage -# Create storage backend storage = SqliteStorage(db_file="memory.db") -# Create memory with desired features memory = Memory( storage=storage, session_id="session_001", user_id="user_123", - full_session_memory=True, # Enable chat history - summary_memory=True, # Enable summaries - user_analysis_memory=True, # Enable user profiles - model="anthropic/claude-sonnet-4-5" # Required for summaries & user analysis + full_session_memory=True, # Save chat history + summary_memory=True, # Save summaries + user_analysis_memory=True, # Save user profiles + model="anthropic/claude-sonnet-4-5" ) -# Attach memory to agent agent = Agent("anthropic/claude-sonnet-4-5", memory=memory) -# First interaction result1 = agent.do(Task("My name is Alice and I'm a Python developer")) -# Second interaction - agent remembers context result2 = agent.do(Task("What's my name and expertise?")) print(result2) # Alice, Python developer ``` +## Save vs Load Flags + +Each memory type has a **save** flag and a **load** flag. By default, the load flag mirrors the save flag. 
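The defaulting rule stated here can be sketched in plain Python. This is a minimal illustration, not Upsonic's actual implementation: `resolve_load_flag` is a hypothetical helper, and it assumes an unset load flag is represented as `None`, as the `bool | None` parameter types in this change suggest.

```python
from typing import Optional


def resolve_load_flag(save_flag: bool, load_flag: Optional[bool] = None) -> bool:
    """Hypothetical helper: return the effective load setting for one memory type."""
    # An unset load flag (None) falls back to the matching save flag.
    if load_flag is None:
        return save_flag
    # An explicit True/False always takes precedence over the save flag.
    return load_flag


# Save raw history but do not inject it (save-only mode).
print(resolve_load_flag(save_flag=True, load_flag=False))  # False
# Load flag left unset: it follows the save flag.
print(resolve_load_flag(save_flag=True))  # True
```

Under this reading, leaving the load flags unset keeps the previous behavior, so a load flag only needs to be passed when saving and injecting should differ.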
+ +| Save Flag | Load Flag | Purpose | +|-----------|-----------|---------| +| `full_session_memory` | `load_full_session_memory` | Chat history | +| `summary_memory` | `load_summary_memory` | Session summaries | +| `user_analysis_memory` | `load_user_analysis_memory` | User profiles | + +```python +memory = Memory( + storage=storage, + session_id="session_001", + user_id="user_123", + full_session_memory=True, # Save full chat history + summary_memory=True, # Save summaries + user_analysis_memory=True, # Save user profiles + load_full_session_memory=False, # Don't inject raw history + load_summary_memory=True, # Inject summary only + load_user_analysis_memory=True, # Inject user profile + model="anthropic/claude-sonnet-4-5" +) +``` + ## Memory Types | Type | Purpose | Requires | diff --git a/concepts/memory/storage/async-mem0.mdx b/concepts/memory/storage/async-mem0.mdx index 24e3048..51a274e 100644 --- a/concepts/memory/storage/async-mem0.mdx +++ b/concepts/memory/storage/async-mem0.mdx @@ -45,8 +45,8 @@ async def main(): agent = Agent("anthropic/claude-sonnet-4-5", memory=memory) - result1 = await agent.ado(Task("My name is Alice")) - result2 = await agent.ado(Task("What's my name?")) + result1 = await agent.do_async(Task("My name is Alice")) + result2 = await agent.do_async(Task("What's my name?")) print(result2) # "Your name is Alice" asyncio.run(main()) diff --git a/concepts/memory/storage/async-mongo.mdx b/concepts/memory/storage/async-mongo.mdx index 606bfe7..476aaab 100644 --- a/concepts/memory/storage/async-mongo.mdx +++ b/concepts/memory/storage/async-mongo.mdx @@ -44,8 +44,8 @@ async def main(): agent = Agent("anthropic/claude-sonnet-4-5", memory=memory) - result1 = await agent.ado(Task("My name is Alice")) - result2 = await agent.ado(Task("What's my name?")) + result1 = await agent.do_async(Task("My name is Alice")) + result2 = await agent.do_async(Task("What's my name?")) print(result2) # "Your name is Alice" asyncio.run(main()) diff --git 
a/concepts/memory/storage/async-postgres.mdx b/concepts/memory/storage/async-postgres.mdx index bc45a62..7535201 100644 --- a/concepts/memory/storage/async-postgres.mdx +++ b/concepts/memory/storage/async-postgres.mdx @@ -43,8 +43,8 @@ async def main(): agent = Agent("anthropic/claude-sonnet-4-5", memory=memory) - result1 = await agent.ado(Task("My name is Alice")) - result2 = await agent.ado(Task("What's my name?")) + result1 = await agent.do_async(Task("My name is Alice")) + result2 = await agent.do_async(Task("What's my name?")) print(result2) # "Your name is Alice" asyncio.run(main()) diff --git a/concepts/memory/storage/async-sqlite.mdx b/concepts/memory/storage/async-sqlite.mdx index e2c18d0..40a5541 100644 --- a/concepts/memory/storage/async-sqlite.mdx +++ b/concepts/memory/storage/async-sqlite.mdx @@ -41,8 +41,8 @@ async def main(): agent = Agent("anthropic/claude-sonnet-4-5", memory=memory) - result1 = await agent.ado(Task("My name is Alice")) - result2 = await agent.ado(Task("What's my name?")) + result1 = await agent.do_async(Task("My name is Alice")) + result2 = await agent.do_async(Task("What's my name?")) print(result2) # "Your name is Alice" asyncio.run(main()) diff --git a/concepts/tools/overview.mdx b/concepts/tools/overview.mdx index 58e1c2d..a7f6952 100644 --- a/concepts/tools/overview.mdx +++ b/concepts/tools/overview.mdx @@ -64,7 +64,7 @@ print("Result:", result) Apify, Firecrawl, Exa — web scraping and data extraction - + E2B, Daytona — secure cloud sandboxes for code execution, shell commands, and git operations diff --git a/docs.json b/docs.json index 711fb4c..77aa75c 100644 --- a/docs.json +++ b/docs.json @@ -465,6 +465,7 @@ "group": "Usage", "pages": [ "concepts/knowledgebase/putting-files", + "concepts/knowledgebase/query-control", "concepts/knowledgebase/using-as-tool", "concepts/knowledgebase/examples" ], @@ -749,7 +750,8 @@ "concepts/interfaces/slack", "concepts/interfaces/whatsapp", "concepts/interfaces/telegram", - 
"concepts/interfaces/gmail" + "concepts/interfaces/gmail", + "concepts/interfaces/mail" ], "collapsed": false }, diff --git a/integrations/overview.mdx b/integrations/overview.mdx index 073cd26..bc80ede 100644 --- a/integrations/overview.mdx +++ b/integrations/overview.mdx @@ -358,6 +358,9 @@ Web search tools your agents can use to find information. } href="/concepts/tools/search-tools/tavily" title="Tavily"> Search API built for AI agents. + } href="/concepts/tools/scraping-tools/exa" title="Exa"> + Neural & keyword search, URL contents, answers with citations. + ## Tools — Data @@ -381,6 +384,22 @@ Web scraping and data extraction toolkits. } href="/concepts/tools/scraping-tools/firecrawl" title="Firecrawl"> AI-ready web crawling. + } href="/concepts/tools/scraping-tools/exa" title="Exa"> + Fetch page contents, similar pages, and search-backed extraction. + + + +## Tools — Sandbox + +Secure cloud sandboxes for code execution, shell commands, and git operations. + + + + Cloud sandboxes for code and shells. + + + Isolated dev environments as tools. 
+ ## Tools — Model Provider diff --git a/reference/schemas/data-models.mdx b/reference/schemas/data-models.mdx index 2f98628..bdf81b2 100644 --- a/reference/schemas/data-models.mdx +++ b/reference/schemas/data-models.mdx @@ -30,7 +30,7 @@ This is the atomic unit that will be converted into a vector and stored in the v | Parameter | Type | Default | Description | | ---------------------------------- | ----------------------------------------------------- | ---------- | -------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------- | | `text_content` | `str` | - | The actual text content of this specific chunk | -| `metadata` | `Dict[str, Any]` | - | Metadata inherited from the parent Document and potentially augmented with chunk-specific info, e.g., {'source': 'resume1.pdf', 'page_number': 3} | +| `metadata` | `Dict[str, Any]` | - | Metadata inherited from the parent Document and potentially augmented with chunk-specific info, e.g., `{'source': 'resume1.pdf', 'page_number': 3}` | | `document_id` | `str` | - | Document ID | | `chunk_id` | `str` | `str(uuid.uuid4())` | A unique identifier for this specific chunk | | `start_index` | `Optional[int]` | `None` | The start index of the chunk in the original document |