A tiny, self-hosted reliability layer for any HTTP API or LLM.
Retries, circuit breaker, cache, idempotency, cost caps, streaming — in one Docker container.
ReliAPI sits between your application and external APIs, adding reliability features that prevent failures and control costs. Works with any HTTP API (REST, payment gateways, SaaS) and any LLM provider (OpenAI, Anthropic, Mistral).
One Docker container. One config file. One unified API.
- 🔑 First-class idempotency — request coalescing prevents duplicate charges
- 💸 Predictable costs — soft/hard budget caps prevent surprise LLM bills
- 🚀 Universal proxy — same reliability features for HTTP and LLM APIs
- 📦 Minimal & self-hosted — no SaaS lock-in, full control over your data
Unlike LLM-only gateways (LiteLLM, Portkey), ReliAPI handles both HTTP and LLM requests. Unlike feature-heavy platforms, ReliAPI stays minimal and focused on reliability.
- 🔄 Retries — exponential backoff with jitter (sketched below)
- ⚡ Circuit breaker — automatic failure detection per target
- 💾 Cache — TTL-based caching for GET/HEAD and LLM responses
- 🔑 Idempotency — request coalescing prevents duplicate execution
- 💰 Budget caps — soft (throttle) and hard (reject) cost limits
- 📡 Streaming — Server-Sent Events (SSE) for LLM responses
- 📊 Observability — Prometheus metrics and structured JSON logging
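
The retry behaviour listed above is the classic exponential-backoff-with-jitter pattern. For readers unfamiliar with it, here is a minimal, illustrative Python sketch of the general technique; it is not ReliAPI's implementation, and the function and parameter names are made up:

```python
import random
import time

import httpx


def get_with_backoff(url: str, max_retries: int = 3) -> httpx.Response:
    """Illustrative only: retry transient failures with exponential backoff plus jitter."""
    for attempt in range(max_retries + 1):
        try:
            resp = httpx.get(url, timeout=10.0)
            if resp.status_code < 500:
                return resp  # success, or a client error that retrying will not fix
        except httpx.TransportError:
            pass  # network-level failure: fall through and retry
        if attempt == max_retries:
            break
        delay = 0.5 * (2 ** attempt)                      # 0.5s, 1s, 2s, ...
        time.sleep(delay + random.uniform(0, delay / 2))  # jitter avoids thundering herds
    raise RuntimeError(f"{url} still failing after {max_retries} retries")
```

ReliAPI applies this kind of policy for you at the proxy layer, so client code stays as simple as the examples below.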
Run ReliAPI with Docker (a reachable Redis instance is required):

```bash
docker run -d \
  -p 8000:8000 \
  -e REDIS_URL=redis://localhost:6379/0 \
  -e OPENAI_API_KEY=sk-... \
  -v $(pwd)/config.yaml:/app/config.yaml \
  ghcr.io/kikuai-lab/reliapi:latest
```
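
If you prefer Docker Compose, here is a minimal sketch that also provides the Redis instance ReliAPI depends on. The image, port, env vars, and config mount mirror the `docker run` command above; everything else is an assumption, so adjust to taste:

```yaml
# docker-compose.yml (sketch)
services:
  reliapi:
    image: ghcr.io/kikuai-lab/reliapi:latest
    ports:
      - "8000:8000"
    environment:
      REDIS_URL: redis://redis:6379/0   # points at the bundled redis service
      OPENAI_API_KEY: ${OPENAI_API_KEY}
    volumes:
      - ./config.yaml:/app/config.yaml
    depends_on:
      - redis

  redis:
    image: redis:7-alpine
```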
Create `config.yaml`:

```yaml
targets:
  openai:
    base_url: https://api.openai.com/v1
    llm:
      provider: openai
      default_model: gpt-4o-mini
      soft_cost_cap_usd: 0.10
      hard_cost_cap_usd: 0.50
    cache:
      enabled: true
      ttl_s: 3600
    auth:
      type: bearer_env
      env_var: OPENAI_API_KEY
```
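
The HTTP example below proxies to a target named `my-api`, so that target also needs an entry in `config.yaml`. A hypothetical sketch, reusing only the keys shown in the `openai` example above (the URL, TTL value, and env var name are placeholders):

```yaml
targets:
  my-api:
    base_url: https://api.example.com   # placeholder upstream
    cache:
      enabled: true
      ttl_s: 300                        # cache GET/HEAD responses for 5 minutes
    auth:
      type: bearer_env
      env_var: MY_API_TOKEN             # placeholder env var
```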
Proxy an HTTP request through ReliAPI:

```python
import httpx

response = httpx.post(
    "http://localhost:8000/proxy/http",
    headers={"Idempotency-Key": "req-123"},
    json={
        "target": "my-api",
        "method": "GET",
        "path": "/users/123"
    }
)
result = response.json()
print(result["data"])               # Response from upstream
print(result["meta"]["cache_hit"])  # True if cached
```
Proxy an LLM request the same way:

```python
import httpx

response = httpx.post(
    "http://localhost:8000/proxy/llm",
    headers={"Idempotency-Key": "chat-123"},
    json={
        "target": "openai",
        "messages": [{"role": "user", "content": "Hello!"}],
        "model": "gpt-4o-mini"
    }
)
result = response.json()
print(result["data"]["content"])   # LLM response
print(result["meta"]["cost_usd"])  # Estimated cost
```
| Feature | HTTP APIs | LLM APIs |
|---|---|---|
| Retries | ✅ | ✅ |
| Circuit breaker | ✅ | ✅ |
| Cache | ✅ | ✅ |
| Idempotency | ✅ | ✅ |
| Budget caps | ❌ | ✅ |
| Streaming | ❌ | ✅ (OpenAI) |
| Fallback chains | ❌ | ✅ |
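
Fallback chains are likewise LLM-only. Purely as an illustration, a chain might be declared per target roughly like this; the `fallback` key and its shape are guesses, so rely on the documentation links below for the actual syntax:

```yaml
targets:
  openai:
    llm:
      provider: openai
      default_model: gpt-4o-mini
      # Hypothetical: targets to try, in order, when the primary call fails.
      fallback:
        - target: anthropic
          model: claude-3-haiku
```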
- 📖 Full Documentation — complete guides and examples
- 🔑 Idempotency — request coalescing deep-dive
- 💰 Budget Caps — cost control examples
- 📡 Streaming — SSE events and examples
- ⚡ Circuit Breaker — failure detection
- 📊 Observability — metrics, logs, Grafana
- 🏗️ Architecture — system design
- 🚀 Deployment — production setup
- 🏢 Multi-Tenant — enterprise features
- 📚 API Reference — endpoint details
| Feature | ReliAPI | LiteLLM | Portkey | Helicone |
|---|---|---|---|---|
| Self-hosted | ✅ | ✅ | ✅ | ❌ |
| HTTP + LLM | ✅ | ❌ | ❌ | ❌ |
| Idempotency | ✅ First-class | ❌ | ❌ | |
| Budget caps | ✅ | ✅ | ✅ | |
| Minimal | ✅ | ❌ | ❌ | ❌ |
MIT License - see LICENSE file for details.
👉 Live Demo
👉 Documentation
👉 Issue Tracker
ReliAPI — Reliability layer for HTTP and LLM calls. Simple, predictable, stable.
Made with ❤️ by KikuAI Lab