Concise, modular system to generate weekly data reports directly from ClickHouse + dbt docs using OpenAI's Code Interpreter (Python tool).
The model receives raw tables (CSV) plus neutral context (schema, meta, optional dbt docs), then decides how to analyze, visualize, and summarize — no precomputed metrics. Each run produces:
- Per-metric HTML reports with BD-friendly narrative and plots.
- A unified weekly report HTML that synthesizes findings across all metrics.
- Cross-metric analysis discovering relationships and ecosystem patterns.
- Free-form analysis: The model runs Python in a sandbox (Responses API + `code_interpreter`); there is no fixed toolchain.
- Parallel processing: Multiple metrics are processed concurrently via the `--max-workers` flag for faster execution.
- Structured output + validation: The LLM returns structured JSON with key numbers and statistical evidence, automatically validated to ensure accuracy and prevent over-interpretation.
- Cross-metric analysis: Discovers correlations, patterns, and relationships across metrics to provide ecosystem-level insights.
- Enhanced significance detection: Strict criteria prevent false positives; only truly noteworthy changes are reported.
- Cost-efficient: No automatic retries, optimized prompts, and parallel processing minimize API costs.
- Raw inputs: Time series (date, value, optional label) and snapshots (value, optional change_pct, optional label).
- Neutral context: Attaches `*.schema.json` (dtypes/examples/roles), `*.meta.json` (coverage/kind), optional `*.docs.md`, and a model catalog for discovery.
- Evidence-first visuals: Weekly-total and WoW-movers plots are required by the prompt for time series.
- Static HTML: Dark-themed per-metric pages + a summary page, all assets referenced via relative paths.
```text
report_agent/
  cli/
    main.py                   # CLI entrypoint (installed as `report-agent`)
  connectors/
    db/
      clickhouse_connector.py # read-only ClickHouse client
    llm/
      base.py                 # abstract LLMConnector
      openai.py               # OpenAICodeInterpreterConnector (Responses API + CI)
  metrics/
    metrics.yml               # metric list + kind + history_days
    metrics_loader.py         # fetch_time_series() and fetch_snapshot()
    metrics_registry.py       # loads metrics.yml, helpers (list_models, get_kind, etc.)
  dbt_context/
    from_docs_json.py         # optional dbt manifest/docs ingestion
  nlg/
    prompt_builder.py         # builds CI prompts (time_series vs snapshot)
    html_report.py            # renders per-metric HTML
    report_service.py         # run CI -> download plots -> save CSV/text -> HTML
    summary_service.py        # unified weekly report synthesizing all metrics
    cross_metric_service.py   # cross-metric correlation analysis
  templates/
    ci_report_prompt.j2       # time-series CI prompt (weekly report + required plots)
    ci_snapshot_prompt.j2     # snapshot CI prompt (single KPI)
    report_page.html.j2       # per-metric HTML template (dark theme)
    summary_prompt.j2         # weekly report LLM prompt
    summary_page.html.j2      # weekly report HTML template
  utils/
    config_loader.py          # loads .env and returns config dict
```
Output structure after a run (default `reports/`):

```text
reports/
  2025-12-17_api_p2p_discv4_clients_daily.html
  2025-12-17_api_execution_transactions_active_accounts_by_sector_daily.html
  index.html                 # unified weekly report (main entry point)
  plots/
    api_p2p_discv4_clients_daily_headline_weekly.png
    api_p2p_discv4_clients_daily_top5_wow.png
    ...
  data/
    api_p2p_discv4_clients_daily.csv
    api_execution_transactions_active_accounts_by_sector_daily.csv
  text/
    api_p2p_discv4_clients_daily.txt
    api_execution_transactions_active_accounts_by_sector_daily.txt
  structured/
    api_p2p_discv4_clients_daily.json  # structured output with validation
  cross_metric_insights.json           # cross-metric analysis results (if multiple metrics)
```
- Python 3.10+
- Install package (from repo root):
```bash
python -m venv .venv
source .venv/bin/activate  # Windows: .venv\Scripts\activate
pip install -e .
```

Dependencies (via `pyproject.toml`) include: openai, clickhouse-connect, jinja2, markdown, httpx, pandas, etc.
Set credentials in `.env` (loaded by `utils/config_loader.py`):
```bash
# OpenAI
OPENAI_API_KEY=...
# Optional overrides
# OPENAI_MODEL=gpt-4.1
# OPENAI_SUMMARY_MODEL=gpt-4.1-mini

# ClickHouse
CLICKHOUSE_HOST=...
CLICKHOUSE_USER=...
CLICKHOUSE_PASSWORD=...
CLICKHOUSE_DB_READ=dbt
CLICKHOUSE_DB_WRITE=playground_max
CLICKHOUSE_SECURE=true

# dbt docs / manifest (optional)
DBT_MANIFEST_PATH=/path/to/manifest.json
# or
# DBT_DOCS_BASE_URL=https://your-dbt-docs-root/

# Custom base (optional; e.g., Azure/OpenAI-compatible)
# OPENAI_BASE_URL=https://.../v1
```
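For reference, `utils/config_loader.py` reads `.env` and returns a plain config dict. A minimal sketch of what that might look like (assuming python-dotenv, which is not explicitly listed among the dependencies; keys and defaults are illustrative):

```python
import os
from dotenv import load_dotenv  # assumption: python-dotenv parses the .env file

def load_configs() -> dict:
    """Hypothetical sketch of utils/config_loader.py."""
    load_dotenv()  # read .env into the process environment
    return {
        "openai_api_key": os.environ["OPENAI_API_KEY"],
        "openai_model": os.getenv("OPENAI_MODEL", "gpt-4.1"),
        "clickhouse_host": os.environ["CLICKHOUSE_HOST"],
        "clickhouse_user": os.environ["CLICKHOUSE_USER"],
        "clickhouse_password": os.environ["CLICKHOUSE_PASSWORD"],
        "clickhouse_db_read": os.getenv("CLICKHOUSE_DB_READ", "dbt"),
        "clickhouse_secure": os.getenv("CLICKHOUSE_SECURE", "true").lower() == "true",
    }
```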
Define metrics in `report_agent/metrics/metrics.yml`:

```yaml
metrics:
  # Time series with optional label dimension
  - model: api_p2p_discv4_clients_daily
    kind: time_series
    history_days: 180
  - model: api_execution_transactions_active_accounts_by_sector_daily
    kind: time_series
    history_days: 180
  # Snapshot / number display
  - model: api_execution_transactions_active_accounts_7d
    kind: snapshot
```

Conventions:
- `kind: time_series`: the table has a `date` column; `value` is the main measure; `label` (if present) is a dimension.
- `kind: snapshot`: no `date` required; typically `value`, optional `change_pct`, optional `label`.
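The registry helpers named in the layout above (`list_models`, `get_kind`) might be used like this; the import path and constructor signature are assumptions:

```python
from report_agent.metrics.metrics_registry import MetricsRegistry  # assumed path

registry = MetricsRegistry("report_agent/metrics/metrics.yml")  # assumed constructor
for model in registry.list_models():
    kind = registry.get_kind(model)  # "time_series" or "snapshot"
    print(model, kind)
```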
- CLI loads config + registry: `report-agent` uses `load_configs()` and `MetricsRegistry` to discover metrics, kinds, and history windows. Metrics can be processed sequentially or in parallel (default: 3 workers).
- Fetch data from ClickHouse (a hedged sketch follows this step):
  - Time series: `fetch_time_series(model, lookback_days)` queries by `date >= today() - INTERVAL ...`.
  - Snapshots: `fetch_snapshot(model)` selects the whole table (no date filter).
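A minimal sketch of the time-series fetch, assuming `clickhouse-connect` (the client named in the dependencies); the actual query in `metrics_loader.py` may differ:

```python
import clickhouse_connect
import pandas as pd

def fetch_time_series(client, model: str, lookback_days: int) -> pd.DataFrame:
    # Mirrors the convention above: filter by date >= today() - INTERVAL N DAY.
    query = f"""
        SELECT *
        FROM {model}
        WHERE date >= today() - INTERVAL {int(lookback_days)} DAY
        ORDER BY date
    """
    return client.query_df(query)

client = clickhouse_connect.get_client(
    host="...", username="...", password="...",
    database="dbt", secure=True,  # matches CLICKHOUSE_DB_READ / CLICKHOUSE_SECURE
)
df = fetch_time_series(client, "api_p2p_discv4_clients_daily", lookback_days=180)
```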
- Prepare CI inputs: For each metric, the connector writes the files below (a rough sketch follows):
  - `{model}.csv`: raw rows
  - `{model}.schema.json`: dtypes + sample values + simple roles (time/measure/dimension/delta)
  - `{model}.meta.json`: counts, (optional) date range, kind
  - `{model}.docs.md`: dbt model + column docs (if available)
  - `model_catalog.json`: full catalog of all available models (for discovery)
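A sketch of how those sidecar files could be produced from a fetched DataFrame; the helper name and the naive role inference are illustrative, not the connector's actual code:

```python
import json
import pandas as pd

def write_ci_inputs(df: pd.DataFrame, model: str, kind: str, out_dir: str = ".") -> None:
    df.to_csv(f"{out_dir}/{model}.csv", index=False)  # raw rows

    # {model}.schema.json: dtypes + sample values + simple role tags
    schema = {}
    for col in df.columns:
        role = ("time" if col == "date"
                else "delta" if col == "change_pct"
                else "measure" if col == "value"
                else "dimension")
        schema[col] = {
            "dtype": str(df[col].dtype),
            "examples": df[col].dropna().astype(str).head(3).tolist(),
            "role": role,
        }
    with open(f"{out_dir}/{model}.schema.json", "w") as f:
        json.dump(schema, f, indent=2)

    # {model}.meta.json: counts, optional date range, kind
    meta = {"kind": kind, "rows": int(len(df))}
    if "date" in df.columns:
        meta["date_range"] = [str(df["date"].min()), str(df["date"].max())]
    with open(f"{out_dir}/{model}.meta.json", "w") as f:
        json.dump(meta, f, indent=2)
```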
- Run Code Interpreter (per-metric analysis):
  - Files are uploaded to OpenAI as container files.
  - `build_ci_prompt()` selects the correct prompt template (time series vs snapshot) with strict significance criteria.
  - The Responses API runs with `code_interpreter` and returns:
    - structured JSON with significance, confidence, key numbers, and statistical evidence
    - BD-facing narrative text
  - Significance is validated against the actual data to prevent over-interpretation.
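The connector's actual call lives in `openai.py`; this is only a minimal sketch of the Responses API + `code_interpreter` pattern it describes (model name, file purpose, and the placeholder prompt are assumptions):

```python
from openai import OpenAI

client = OpenAI()  # reads OPENAI_API_KEY (and OPENAI_BASE_URL, if set)

# Upload a prepared input so the Code Interpreter container can read it.
uploaded = client.files.create(
    file=open("api_p2p_discv4_clients_daily.csv", "rb"),
    purpose="assistants",
)

prompt = "Analyze the attached CSV..."  # in the real flow, built by build_ci_prompt()

response = client.responses.create(
    model="gpt-4.1",
    tools=[{
        "type": "code_interpreter",
        "container": {"type": "auto", "file_ids": [uploaded.id]},
    }],
    input=prompt,
)
print(response.output_text)  # narrative + structured JSON, per the prompt
```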
- Persist outputs:
  - Narrative → `reports/text/<model>.txt`
  - Structured output → `reports/structured/<model>.json`
  - Plots → downloaded via `download_artifacts()` into `reports/plots/`
  - Data → `reports/data/<model>.csv`
  - Per-metric HTML → `html_report.render_html_report()` writes `YYYY-MM-DD_<model>.html`
- Cross-metric analysis (if multiple metrics):
  - `cross_metric_service.generate_cross_metric_analysis()` analyzes relationships between all processed metrics.
  - Discovers correlations, ecosystem patterns, and contradictions.
  - Uses the catalog to suggest related metrics for future analysis.
  - Results are saved to `reports/cross_metric_insights.json`.
- Weekly report synthesis:
  - Uses `summary_service.generate_weekly_report()` on the collected metric texts + structured data + cross-metric insights.
  - Filters out low-confidence findings automatically.
  - The LLM synthesizes findings across all metrics with ecosystem context.
  - The unified weekly report is saved as `reports/index.html`.
- Install and configure: `pip install -e .`, then create and fill `.env`.
- Define metrics in `report_agent/metrics/metrics.yml`.
- Run all metrics + the weekly summary (with parallel processing): `report-agent`
- Run with a custom number of parallel workers: `report-agent --max-workers 5`
- Run a single metric: `report-agent --metric api_p2p_discv4_clients_daily`
- Change the output directory / skip the summary: `report-agent --out-dir gnosis_reports --no-summary`
- Enable verbose logging: `report-agent --verbose`
Each metric analysis returns structured JSON with:
- Significance assessment: HIGH/MEDIUM/LOW/NONE based on strict criteria
- Confidence level: high/medium/low
- Key numbers: validated values (last week, previous week, change %, averages)
- Statistical evidence: standard deviations, normal variation checks, trend analysis
The system validates that significance claims are justified by the actual data, preventing false positives and over-interpretation.
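An illustrative shape for that JSON (field names and values here are hypothetical; the exact schema is set by the prompt templates and validation code):

```json
{
  "significance": "MEDIUM",
  "confidence": "high",
  "key_numbers": {
    "last_week": 1234,
    "previous_week": 1100,
    "change_pct": 12.2,
    "avg_4w": 1150
  },
  "statistical_evidence": {
    "std_devs_from_mean": 1.6,
    "within_normal_variation": false,
    "trend": "upward over the last 4 weeks"
  }
}
```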
After all metrics are analyzed, the system performs cross-metric correlation analysis:
- Discovers which metrics moved together
- Identifies ecosystem-wide patterns
- Detects contradictory signals
- Suggests related metrics for future analysis (using catalog for discovery)
Only strong correlations (>0.7) are reported to avoid noise.
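A sketch of the kind of check this implies: pairwise Pearson correlation on date-aligned daily series, keeping only |r| > 0.7. The real `cross_metric_service` logic may differ:

```python
import pandas as pd

def strong_correlations(series_by_metric: dict[str, pd.Series],
                        threshold: float = 0.7) -> list[dict]:
    # Align all metrics on their shared date index, then correlate pairwise.
    df = pd.DataFrame(series_by_metric)
    corr = df.corr()
    findings = []
    cols = list(corr.columns)
    for i, a in enumerate(cols):
        for b in cols[i + 1:]:
            r = float(corr.loc[a, b])
            if abs(r) > threshold:  # only strong correlations, to avoid noise
                findings.append({"metrics": [a, b], "correlation": round(r, 3)})
    return findings
```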
Strict criteria ensure only truly noteworthy changes are reported:
- HIGH: >15% change AND >2 std devs AND unusual pattern AND clear business impact
- MEDIUM: >10% change OR >1.5 std devs AND somewhat unusual
- LOW/NONE: Within normal variation or expected patterns
This prevents reporting normal fluctuations as significant events.
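One possible encoding of these criteria, reading MEDIUM as (>10% change OR >1.5 std devs) AND somewhat unusual; in practice the check is enforced via the prompt plus post-hoc validation, not necessarily one function:

```python
def classify_significance(change_pct: float, std_devs: float,
                          unusual: bool, business_impact: bool) -> str:
    """Hypothetical classifier mirroring the criteria above."""
    c, z = abs(change_pct), abs(std_devs)
    if c > 15 and z > 2 and unusual and business_impact:
        return "HIGH"
    if (c > 10 or z > 1.5) and unusual:
        return "MEDIUM"
    return "LOW/NONE"  # within normal variation or expected patterns
```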
- Parallel Processing: By default, metrics are processed with 3 parallel workers; use `--max-workers N` to control concurrency. With 3 workers, 3 metrics complete in roughly the time of the slowest single metric (a sketch follows below).
- No Automatic Retries: Retries are disabled (`max_retries=0`) to prevent unnecessary credit usage; failures are reported immediately.
- Prompt Optimization: Prompts have been optimized to reduce token usage while maintaining all functionality.
- Cost Efficiency: Optimized prompts, parallel processing, and no retries minimize API costs per report.
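A sketch of the worker-pool behavior, assuming a per-metric pipeline function; `process_metric` is hypothetical:

```python
from concurrent.futures import ThreadPoolExecutor, as_completed

def run_all(metrics: list[str], max_workers: int = 3) -> None:
    # Each metric's fetch -> analyze -> render pipeline runs in its own worker.
    # process_metric(model) is assumed to run that full pipeline for one metric.
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        futures = {pool.submit(process_metric, m): m for m in metrics}
        for fut in as_completed(futures):
            model = futures[fut]
            try:
                fut.result()
            except Exception as exc:
                # No automatic retries: surface the failure immediately.
                print(f"[FAIL] {model}: {exc}")
```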
- ClickHouse `UNKNOWN_IDENTIFIER` on `date`: Mark snapshot tables as `kind: snapshot` in `metrics.yml` so they don't get a `WHERE date` filter.
- Plots not generated: Plots are produced when possible but are not guaranteed, due to Code Interpreter limitations. Reports are complete and useful even without plots; a warning is printed when plots are missing for a metric.
- Plots not visible in HTML: Check that the PNGs exist in `reports/plots/` and that you open the HTML from the same directory tree (paths are relative).
- No weekly report: Ensure you didn't pass `--no-summary` and that at least one metric completed successfully.
- Parallel processing issues: Fall back to sequential execution with `--max-workers 1`.
- API connection errors: These are external API issues, not code bugs; the system reports the failure and does not retry automatically, to save credits.
- More metric kinds and templates (e.g. funnels, distributions).
- Slack / email delivery on schedule.
- Support for additional LLM providers (e.g. Gemini) behind the same connector interface.
- A better frontend.