Full documentation is now at docs.cascadeflow.dev — the Mintlify-powered docs site is the primary reference for cascadeflow's agent runtime intelligence layer. The guides below remain for quick reference and deep links.
Agent runtime intelligence layer — optimize cost, latency, quality, budget, compliance, and energy across AI agent workflows. In-process harness, not a proxy.
- Quickstart - Get started with cascadeflow in 5 minutes
- Python Harness Quickstart - `init`, `run`, and `@agent` for in-process policy control
- Providers - Configure and use different AI providers (OpenAI, Anthropic, Groq, Ollama, etc.)
- Presets - Use built-in presets for common use cases
- Gateway Server - Drop-in OpenAI/Anthropic-compatible endpoint for existing apps
- Streaming - Stream responses from cascade agents
- Tools - Function calling and tool usage with cascades
- Agentic Patterns (Python) - Tool loops and multi-agent orchestration in Python
- Agentic Patterns (TypeScript) - Tool loops, multi-agent orchestration, and message best practices
- Harness Telemetry & Privacy - Decision traces, callbacks, and privacy-safe observability
- Cost Tracking - Track and analyze API costs across queries
- Proxy Routing - Route requests through provider-aware proxy plans
- Production Guide - Best practices for production deployments
- Performance Guide - Optimize cascade performance and latency
- FastAPI Integration - Integrate cascadeflow with FastAPI applications
- Custom Cascades - Build custom cascade strategies
- Custom Validation - Implement custom quality validators
- Edge Device Deployment - Deploy cascades on edge devices (Jetson, etc.)
- Browser/Edge Runtime - Run cascades in browser or edge environments
- LangChain Integration - Callback handler for LangChain/LangGraph with harness-aware cascading
- OpenAI Agents SDK Integration - Harness-aware model provider for existing OpenAI Agents apps
- CrewAI Integration - Hook-based harness metrics + budget gating (opt-in)
- Google ADK Integration - Plugin-based harness integration for ADK runners (opt-in)
- n8n Integration - Use cascadeflow in n8n workflows
- Paygentic Integration - Usage metering and billing lifecycle helpers (opt-in)
Comprehensive working code samples:
Python Examples: examples/
- Basic usage, preset usage, multi-provider
- Tool execution, streaming, cost tracking
- Production patterns, FastAPI integration
- Edge device deployment, vLLM integration
- Custom cascades and validation
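
For readers new to the idea behind these examples, here is a minimal, library-free sketch of a cascade: try the cheapest model first, run a quality check, and escalate only when the check fails. It does not use cascadeflow's API. `Tier`, `run_cascade`, the stand-in model functions, and the per-call costs are all hypothetical names and values chosen for illustration; see the examples directory for real usage.

```python
from dataclasses import dataclass
from typing import Callable, List, Tuple


@dataclass
class Tier:
    """One model in the cascade (hypothetical structure, not cascadeflow's)."""
    name: str
    cost_per_call: float  # illustrative flat cost, not real pricing
    generate: Callable[[str], str]


def run_cascade(
    prompt: str,
    tiers: List[Tier],
    is_good_enough: Callable[[str, str], bool],
) -> Tuple[str, str, float]:
    """Try tiers in order; return (answer, tier name used, total cost)."""
    if not tiers:
        raise ValueError("at least one tier is required")
    total_cost = 0.0
    answer, used = "", ""
    for tier in tiers:
        answer = tier.generate(prompt)
        total_cost += tier.cost_per_call  # every attempt is paid for
        used = tier.name
        if is_good_enough(prompt, answer):
            break  # accept the cheaper answer and stop escalating
    return answer, used, total_cost


# Stand-in "models" so the sketch runs without any provider or API key.
def small_model(prompt: str) -> str:
    return "4" if prompt == "2 + 2?" else ""


def large_model(prompt: str) -> str:
    return f"detailed answer to: {prompt}"


def non_empty(prompt: str, answer: str) -> bool:
    """Toy quality validator: accept any non-blank answer."""
    return bool(answer.strip())


tiers = [Tier("small", 0.001, small_model), Tier("large", 0.03, large_model)]

easy = run_cascade("2 + 2?", tiers, non_empty)
hard = run_cascade("Explain CRDTs", tiers, non_empty)
print(easy[1], hard[1])
```

The easy prompt is accepted at the small tier, while the hard prompt pays for both attempts before the large tier's answer is accepted. In this sketch, a stricter validator simply shifts more traffic to the expensive tier.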
TypeScript Examples: packages/core/examples/
- Basic usage, tool calling, multi-provider
- Streaming responses
- Production patterns
- Browser/Vercel Edge deployment
- 📖 GitHub Discussions - Q&A and community support
- 🐛 GitHub Issues - Bug reports and feature requests
- 📧 Email Support - Direct support
Comprehensive API documentation for all classes and methods:
- API Overview - Complete API reference for Python and TypeScript
- Python API
  - CascadeAgent - Main agent class
  - ModelConfig - Model and cascade configuration
  - CascadeResult - Result object with 30+ diagnostic fields
- TypeScript API
  - See TypeScript Package for API documentation
See also: Comprehensive examples in the /examples directory
For contributors and advanced users:
- Architecture Guide - Detailed architecture, data flow, and code organization
- Contributing Guide - How to contribute to cascadeflow
The architecture guide covers:
- Directory structure (monorepo layout)
- Core components and design patterns
- Data flow and execution paths
- Adding new providers, quality checks, and routing strategies
- Testing strategy and development workflow