small sam is an intelligent Fantasy Premier League assistant powered by multi-agent AI systems, providing data-driven insights, transfer recommendations, and conversational team management.
Small Sam leverages advanced data orchestration, vector search, and multi-agent AI frameworks to deliver comprehensive Fantasy Premier League analysis and recommendations. The system aggregates player statistics, news sentiment, and tactical insights to provide actionable intelligence for FPL managers.
- π€ Conversational AI Interface: Natural language querying for player analysis and transfer strategies
- π Dynamic Dashboards: Real-time player statistics, form analysis, and league comparisons
- π― Transfer Intelligence: AI-powered recommendations based on form, fixtures, and value metrics
- π° News Integration: Semantic search across injury reports, transfer rumors, and match analysis
- π₯ Team Management: Comprehensive FPL team tracking with historical performance analysis
The FPL Assistant implements a microservices architecture with clear separation of concerns:
graph TB
A[RSS Feeds] --> B[Data Orchestration Service]
C[FBref Stats] --> B
D[FPL API] --> B
B --> E[PostgreSQL]
B --> F[Qdrant Vector DB]
G[FastAPI Backend] --> E
G --> F
G --> H[Small Sam Agent Framework]
H --> I[Stats Agent]
H --> J[News Retrieval Agent]
H --> K[Reporter Agent]
H --> L[Reviewer Agent]
M[Next.js Frontend] --> G
| Component | Technology | Purpose |
|---|---|---|
| Data Orchestration | Python + Celery | RSS scraping, FBref extraction, data enrichment |
| Agent Framework | LangGraph + OpenAI | Multi-agent reasoning and query processing |
| Backend API | FastAPI + SQLAlchemy | REST API and business logic coordination |
| Frontend | Next.js + TypeScript | User interface and state management |
| Primary Database | PostgreSQL 15+ | Structured data storage with JSONB support |
| Vector Store | Qdrant | Semantic search for news and match reports |
| Cache Layer | Redis 7+ | Multi-tier caching with intelligent TTL |
- Docker & Docker Compose: 20.10+
- Python: 3.11+ (for local development)
- Node.js: 18+ (for frontend development)
- Git: 2.30+
-
Clone the repository
git clone https://github.com/ribhu97/smallsam.git cd smallsam -
Environment Configuration
cp .env.example .env # Edit .env with your configuration -
Infrastructure Startup
docker-compose up -d postgres redis qdrant
-
Backend Service
cd backend python -m venv venv source venv/bin/activate # On Windows: venv\Scripts\activate pip install -r requirements.txt alembic upgrade head uvicorn app.main:app --reload --port 8000
-
Data Orchestration Service
cd data-orchestration python -m venv venv source venv/bin/activate pip install -r requirements.txt celery -A app.celery worker --loglevel=info
-
Frontend Service
cd frontend npm install npm run dev -
Agent Framework
cd agent-framework python -m venv venv source venv/bin/activate pip install -r requirements.txt python -m app.main
# Build and deploy all services
docker-compose -f docker-compose.prod.yml up -d
# Initialize database schema
docker-compose exec backend alembic upgrade head
# Verify deployment
curl http://localhost/api/v1/healthThe system integrates multiple data sources with sophisticated error handling and rate limiting:
FPL Official API
- Player statistics and pricing
- Gameweek fixtures and results
- League standings and transfers
FBref Statistical Data
- Advanced player metrics (xG, xA, progressive passes)
- Team tactical analysis
- Historical performance trends
RSS News Feeds
- Injury and transfer news
- Match reports and tactical analysis
- Manager press conferences
# Simplified data pipeline workflow
RSS_FEEDS -> Content_Enrichment -> Vector_Embedding -> Qdrant
FPL_API -> Data_Validation -> PostgreSQL
FBREF -> Statistical_Processing -> PostgreSQLThe system implements a three-tier caching architecture:
| Cache Level | Technology | TTL Strategy | Use Case |
|---|---|---|---|
| L1 - Application | In-Memory | 5 minutes | Frequent API responses |
| L2 - Distributed | Redis | 15-60 minutes | Cross-request data sharing |
| L3 - Database | PostgreSQL | 24 hours | Expensive aggregations |
The Small Sam framework implements a cooperative multi-agent system with specialized responsibilities:
Orchestration Agent
- Query interpretation and task decomposition
- Agent coordination and context management
- Response synthesis and user interaction
Stats Agent
- Natural language to PostgreSQL query translation
- Statistical analysis and trend identification
- Player comparison and ranking operations
News Retrieval Agent
- Vector similarity search in Qdrant
- Contextual filtering and relevance scoring
- Injury and transfer intelligence aggregation
Reporter Agent
- Natural language response generation
- Data visualization recommendations
- Actionable insight synthesis
Reviewer Agent
- Response accuracy validation using source attribution
- Hallucination detection and quality scoring
- Final output approval and user safety
# Example agent interaction flow
user_query = "Who should I captain this gameweek?"
orchestrator = OrchestrationAgent()
context = orchestrator.parse_query(user_query)
stats_data = await stats_agent.analyze_captaincy_options(context)
news_context = await news_agent.get_injury_updates(context)
recommendation = await reporter_agent.generate_recommendation(
stats_data, news_context
)
final_response = await reviewer_agent.validate_and_approve(
recommendation, source_attribution=True
)| Metric | Target | Current |
|---|---|---|
| Concurrent Users | 100+ | 50 |
| API Response Time | <2s (95th percentile) | 1.2s |
| Agent Query Time | <5s (95th percentile) | 3.8s |
| Database Query Time | <500ms (95th percentile) | 280ms |
| System Uptime | 99.5% | 99.2% |
Minimum System Requirements
- CPU: 4 cores, 2.4GHz
- Memory: 8GB RAM
- Storage: 50GB SSD
- Network: 100Mbps
Recommended Production Setup
- CPU: 8 cores, 3.2GHz
- Memory: 16GB RAM
- Storage: 200GB NVMe SSD
- Network: 1Gbps
The project maintains strict test coverage standards:
- Unit Tests: >85% coverage across all modules
- Integration Tests: API endpoints and database interactions
- End-to-End Tests: Critical user workflows
- Load Tests: Performance validation under concurrent load
# Backend unit tests
cd backend
pytest --cov=app --cov-report=html
# Frontend tests
cd frontend
npm run test:coverage
# Integration tests
docker-compose -f docker-compose.test.yml up --abort-on-container-exit
# Load testing
cd load-tests
k6 run --vus 50 --duration 5m api-load-test.js# Generate test fixtures
python scripts/generate_test_data.py
# Database test migrations
alembic -c alembic.test.ini upgrade head
# Vector store test data
python scripts/populate_test_vectors.py| Variable | Description | Default | Required |
|---|---|---|---|
DATABASE_URL |
PostgreSQL connection string | - | β |
REDIS_URL |
Redis connection string | redis://localhost:6379 |
β |
QDRANT_URL |
Qdrant vector database URL | http://localhost:6333 |
β |
OPENAI_API_KEY |
OpenAI API key for agents | - | β |
FPL_API_BASE_URL |
FPL API endpoint | https://fantasy.premierleague.com/api/ |
β |
LOG_LEVEL |
Logging verbosity | INFO |
β |
ENVIRONMENT |
Deployment environment | development |
β |
Database Connection Pooling
# SQLAlchemy configuration
engine = create_engine(
DATABASE_URL,
pool_size=20,
max_overflow=30,
pool_pre_ping=True,
pool_recycle=3600
)Redis Configuration
# Redis connection with retry logic
redis_client = Redis.from_url(
REDIS_URL,
retry_on_timeout=True,
health_check_interval=30,
socket_keepalive=True
)Team Management
GET /api/v1/users/{user_id}/team
GET /api/v1/users/{user_id}/gameweeks
GET /api/v1/users/{user_id}/leaguesPlayer Data
GET /api/v1/players
GET /api/v1/players/{player_id}/stats
GET /api/v1/players/search?q={query}Conversational Interface
POST /api/v1/chat/query
GET /api/v1/chat/sessions/{session_id}
DELETE /api/v1/chat/sessions/{session_id}# Player analysis
curl -X POST "http://localhost:8000/api/v1/chat/query" \
-H "Content-Type: application/json" \
-d '{"query": "Who is the best value midfielder under Β£6m?"}'
# Transfer strategy
curl -X POST "http://localhost:8000/api/v1/chat/query" \
-H "Content-Type: application/json" \
-d '{"query": "Should I use my wildcard this gameweek?"}'# Service health validation
curl http://localhost:8000/api/v1/health
# Database connectivity
curl http://localhost:8000/api/v1/health/database
# Agent framework status
curl http://localhost:8000/api/v1/health/agentsThe system exports metrics in Prometheus format:
- Request Latency: API response time percentiles
- Error Rates: HTTP 4xx/5xx error frequencies
- Agent Performance: Query success rates and processing time
- Resource Utilization: CPU, memory, and database connection pools
# Structured logging with correlation IDs
logger.info(
"Agent query processed",
extra={
"correlation_id": "req_123456",
"user_id": "fpl_789",
"agent": "stats_agent",
"processing_time_ms": 1250,
"success": True
}
)We welcome contributions from the FPL and software development community! Please read our CONTRIBUTING.md for detailed guidelines on:
- Code style and formatting standards
- Testing requirements and procedures
- Pull request workflow and review process
- Issue reporting and feature requests
This project is licensed under the MIT License - see the LICENSE file for details.
- Fantasy Premier League: Official API and statistical data
- FBref: Advanced football statistics and analytics
- LangGraph: Multi-agent framework capabilities
- Qdrant: Vector database performance and reliability
- Documentation: Wiki
- Issues: GitHub Issues
- Discussions: GitHub Discussions
- Discord: Community Server
β‘ Built with modern AI and data engineering practices for the FPL community