This project uses CrewAI to convert natural language questions into SQL queries, execute them against the Chinook database, and generate visualizations using Streamlit.
- Natural Language to SQL: Ask questions in plain English, and the AI generates and executes SQL queries.
- Multi-Agent System: Two agents (SQL Analyst and BI Analyst) collaborate to generate queries and insights.
- Automated Visualizations: Generates bar or line charts based on query results.
- Modern UI: Built with Streamlit for a clean, user-friendly interface.
- Fast Setup: Uses uv for package management and Docker for deployment.
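The feature list says charts are either bar or line, but does not say how the choice is made. One plausible rule, sketched here as an assumption and not as the project's actual logic, is to draw a line chart when the first column looks like a time axis and a bar chart otherwise. The names `looks_temporal` and `choose_chart` are hypothetical, and the sample rows are illustrative values, not real query results.

```python
from datetime import date, datetime


def looks_temporal(value):
    """Return True for dates, datetimes, or integers that look like years."""
    if isinstance(value, (date, datetime)):
        return True
    if isinstance(value, int) and not isinstance(value, bool):
        return 1900 <= value <= 2100
    return False


def choose_chart(rows):
    """Pick 'line' or 'bar' for rows of (label, value) pairs.

    Hypothetical heuristic: a time-like first column suggests a trend,
    so use a line chart; categorical labels get a bar chart.
    """
    if not rows:
        return None  # nothing to plot
    if all(looks_temporal(label) for label, _ in rows):
        return "line"
    return "bar"


# Illustrative values only, not real Chinook results.
sales_per_year = [(2021, 100.0), (2022, 120.0), (2023, 110.0)]
sales_per_country = [("USA", 50.0), ("Canada", 30.0)]

print(choose_chart(sales_per_year))     # line
print(choose_chart(sales_per_country))  # bar
```

In a Streamlit app the result could select between `st.line_chart` and `st.bar_chart`; whether this project does so is not stated in the README.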
- Docker
- Python 3.10+
- uv
- PostgreSQL (for local testing) or a Render PostgreSQL instance
- Start a PostgreSQL Container:
  docker run --name postgres-chinook -p 5432:5432 -e POSTGRES_PASSWORD=your_password -d postgres:latest
- Load the Chinook Database: Download Chinook_PostgreSql.sql from this GitHub repo [https://github.com/lerocha/chinook-database/blob/master/ChinookDatabase/DataSources/Chinook_PostgreSql.sql], save it as chinook.sql, then run this command:
  docker exec -i postgres-chinook psql -U postgres -d postgres < chinook.sql
- Clone the Repository (if applicable) and navigate to the directory.
- Create the Environment File: Create a file named .env in the root directory and add the following, filling in your details:
  REPLICATE_API_TOKEN="r8_YourReplicateAPIToken"
  DB_HOST="localhost" # or Render PostgreSQL internal host
  DB_PORT="5432"
  DB_USER="postgres"
  DB_PASSWORD="your_password"
  DB_NAME="postgres"
- Install Dependencies: Use uv to install the required Python packages from the requirements.txt file.
  uv pip install -r requirements.txt
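To show how the values in the .env file from the setup steps fit together, here is a minimal standard-library sketch that parses the file's contents and assembles a PostgreSQL connection URL. It is not the project's own loader (which the README does not describe); `parse_env` and `build_dsn` are hypothetical helper names, and the sample text simply repeats the placeholder values above.

```python
import shlex
from urllib.parse import quote

ENV_TEXT = '''
REPLICATE_API_TOKEN="r8_YourReplicateAPIToken"
DB_HOST="localhost" # or Render PostgreSQL internal host
DB_PORT="5432"
DB_USER="postgres"
DB_PASSWORD="your_password"
DB_NAME="postgres"
'''


def parse_env(text):
    """Parse KEY="value" lines, skipping blank lines and # comments."""
    settings = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, raw = line.partition("=")
        # shlex strips the quotes and drops a trailing "# comment".
        tokens = shlex.split(raw, comments=True)
        settings[key.strip()] = tokens[0] if tokens else ""
    return settings


def build_dsn(settings):
    """Assemble a postgresql:// URL; the password is percent-encoded."""
    password = quote(settings["DB_PASSWORD"], safe="")
    return (
        f"postgresql://{settings['DB_USER']}:{password}"
        f"@{settings['DB_HOST']}:{settings['DB_PORT']}/{settings['DB_NAME']}"
    )


settings = parse_env(ENV_TEXT)
print(build_dsn(settings))
# postgresql://postgres:your_password@localhost:5432/postgres
```

Percent-encoding the password matters once a real password contains characters such as `@` or `/`, which would otherwise break the URL.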
Execute the following command from the project's root directory:
  streamlit run src/text2sql/main.py
Open your web browser and navigate to the URL provided by Streamlit (usually http://localhost:8501).
- Push the code to a GitHub repository.
- In the Render dashboard: Create a new Web Service, select your repository, and use the provided render.yaml. Set environment variables (DB_PASSWORD, REPLICATE_API_TOKEN) in the chinook-db-credentials group. Deploy the service and access it at https://.onrender.com.
text2sql/
├── .env
├── Dockerfile
├── pyproject.toml
├── README.md
├── render.yaml
├── requirements.txt
└── src/
    └── text2sql/
        ├── __init__.py
        ├── main.py
        ├── crew.py
        └── tools/
            ├── db_tools.py
            └── __init__.py
Here are some questions to test the system's capabilities, ranging from simple to complex:
Basic Queries:
- List all artists.
- How many tracks are in the database?
- Show all music genres.
- List all employees and their titles.
- What are the different media types available?
Intermediate Queries (Joins & Aggregations):
- Who are the top 5 artists by the number of tracks?
- Show the total sales for each country.
- Which 5 tracks have been purchased the most?
- List the top 10 longest tracks.
- How many albums does each artist have? List the top 5.
Advanced Queries (Complex Joins, Subqueries):
- Who are the top 5 customers by total spending?
- What is the most popular music genre in the USA? (most tracks sold)
- Show the total sales per year.
- Which employee is the best sales support agent based on the number of customers they handle?
- For customer 'Helena Holý', list all tracks she has purchased.
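As a reference point when checking the generated SQL, here is a sketch of the kind of query that would answer "Who are the top 5 artists by the number of tracks?". It runs against an in-memory SQLite database instead of PostgreSQL so it needs no server, and it uses a tiny hand-made subset of the schema with made-up rows. The snake_case table and column names are an assumption; the naming in the Chinook script you load may differ.

```python
import sqlite3

# In-memory stand-in for the real database (SQLite, not PostgreSQL).
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE artist (artist_id INTEGER PRIMARY KEY, name TEXT);
CREATE TABLE album  (album_id INTEGER PRIMARY KEY, title TEXT, artist_id INTEGER);
CREATE TABLE track  (track_id INTEGER PRIMARY KEY, name TEXT, album_id INTEGER);
INSERT INTO artist VALUES (1, 'Artist A'), (2, 'Artist B'), (3, 'Artist C');
INSERT INTO album  VALUES (1, 'Album 1', 1), (2, 'Album 2', 1), (3, 'Album 3', 2);
INSERT INTO track  VALUES (1, 'T1', 1), (2, 'T2', 1), (3, 'T3', 2), (4, 'T4', 3);
""")

# Tracks link to artists through albums, so two joins are needed.
TOP_ARTISTS_SQL = """
SELECT ar.name, COUNT(t.track_id) AS track_count
FROM artist AS ar
JOIN album AS al ON al.artist_id = ar.artist_id
JOIN track AS t  ON t.album_id = al.album_id
GROUP BY ar.artist_id, ar.name
ORDER BY track_count DESC, ar.name
LIMIT 5;
"""

rows = conn.execute(TOP_ARTISTS_SQL).fetchall()
print(rows)  # [('Artist A', 3), ('Artist B', 1)]
```

Artist C has no albums in this toy data, so the inner joins leave it out of the result; a generated query that uses LEFT JOIN would instead list it with a count of 0.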