Docs: Add image and file upload handling guide for n8n workflows
- Document how clients send images (base64-encoded in messages)
- Explain token management with detail parameter (low/high/auto)
- Provide 3 common approaches for processing images in n8n
- Add practical code example for setting detail parameter
- Clarify that bridge passes through, n8n handles processing
Related to: #46
Webhook → Extract Chat Body → AI Agent → (Response)
The Webhook node receives the entire HTTP request in `$json`, with the actual chat data nested in `$json.body`. The "Extract Chat Body" node (a Set node) extracts this nested data and flattens it, making fields like `chatInput` and `sessionId` directly accessible to the AI Agent.
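For illustration, the incoming webhook payload has roughly this shape (field values are hypothetical, and headers are abbreviated), with the chat data nested under `body`:

```json
{
  "headers": { "content-type": "application/json" },
  "body": {
    "chatInput": "Hello!",
    "sessionId": "abc-123"
  }
}
```

The Set node lifts `$json.body.chatInput` and `$json.body.sessionId` to the top level so downstream nodes can reference them directly.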
## Handling Images and File Uploads
The bridge passes file uploads and images through to n8n as-is. Your workflow must handle image extraction and processing.
### How Clients Send Images
Clients like LibreChat and Open WebUI send images embedded in the message content as base64-encoded data:
```json
{
  "messages": [
    {
      "role": "user",
      "content": [
        {"type": "text", "text": "What's in this image?"},
        {
          "type": "image_url",
          "image_url": {
            "url": "data:image/jpeg;base64,...",
            "detail": "low"
          }
        }
      ]
    }
  ]
}
```
The `detail` parameter controls how many tokens the image uses:
- `"low"` - 65 tokens per image (recommended)
- `"high"` - 65-765 tokens depending on image size
- `"auto"` - LLM provider decides (default, usually "high")
**Important:** If clients don't set `detail`, images may consume 500+ tokens and cause "too many tokens" errors. You can add the `detail` parameter in your n8n workflow before passing to the LLM.
### Processing Images in n8n
Your workflow needs to extract and handle images from the `messages` array. Common approaches:
**Option 1: Pass through to LLM (if supported)**
- Extract entire `messages` array
- Forward to LLM that supports vision (GPT-4V, Claude 3, etc.)
- Ensure `detail: "low"` is set to avoid token issues

**Option 2: Upload to storage**
- Extract base64 image data
- Upload to S3, Cloudinary, or similar
- Replace base64 with public URL
- Forward modified messages to LLM

**Option 3: Process locally**
- Extract image for analysis
- Use separate vision API/service
- Combine results with chat response
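For Option 2, the first step is splitting the base64 data URL into a MIME type and raw bytes before uploading. A minimal sketch (the upload call itself is storage-specific and not shown; `parseDataUrl` is an illustrative helper, not part of the bridge):

```javascript
// Sketch: split a base64 data URL into its MIME type and a binary Buffer
// so the bytes can be uploaded to object storage (upload call not shown).
function parseDataUrl(dataUrl) {
  const match = /^data:([^;]+);base64,(.*)$/.exec(dataUrl);
  if (!match) throw new Error('not a base64 data URL');
  return { mimeType: match[1], buffer: Buffer.from(match[2], 'base64') };
}
```

After uploading the buffer, replace the `image_url.url` value in the message with the resulting public URL before forwarding to the LLM.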
### Example: Setting detail Parameter in n8n
Use a Code node to ensure all images use `detail: "low"`:
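A minimal sketch of that Code node logic, assuming the OpenAI-style message shape shown above (adjust field names to your workflow's data):

```javascript
// Force detail: "low" on every image part in an OpenAI-style messages array.
// Field names (content, type, image_url) follow the example payload above;
// adjust them if your workflow reshapes the data.
function setImageDetail(messages, detail = 'low') {
  for (const message of messages) {
    // content may be a plain string or an array of content parts
    if (!Array.isArray(message.content)) continue;
    for (const part of message.content) {
      if (part.type === 'image_url' && part.image_url) {
        part.image_url.detail = detail;
      }
    }
  }
  return messages;
}

// Inside an n8n Code node ("Run Once for All Items"), apply it per item:
// for (const item of $input.all()) setImageDetail(item.json.messages ?? []);
// return $input.all();
```

Place this Code node between "Extract Chat Body" and the AI Agent so every image is capped at 65 tokens before it reaches the LLM.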