pip install numpy pandas scipy ripser scikit-learn faker networkx
python main.py
===========================================================
Analyzing Dataset: E-commerce
===========================================================
1. Individual Table Analysis:
----------------------------------------
Analyzing table: customers
Shape: (1000, 6)
Dimension 0: 1 features
Strong column relationships: customer_id-email, signup_date-country
Analyzing table: orders
Shape: (5000, 5)
Dimension 0: 1 features
Dimension 1: 3 features
Strong column relationships: order_id-customer_id
2. Table Relationship Discovery:
----------------------------------------
customers <-> orders:
Confidence: 0.89
Type: one-to-many
Best join: customer_id = customer_id
Join type: one-to-many
orders <-> order_items:
Confidence: 0.95
Type: one-to-many
Best join: order_id = order_id
Join type: one-to-many
3. Suggested Joins:
----------------------------------------
Join 1:
Tables: customers + orders
Confidence: 0.89
SQL:
SELECT *
FROM customers t1
ONE_TO_MANY JOIN orders t2
ON t1.customer_id = t2.customer_id
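
Note that ONE_TO_MANY JOIN in the suggested SQL above is not a standard SQL join keyword; it appears to carry the detected relationship type rather than a join operator. The following is a minimal sketch, not the tool's own implementation, of how the same one-to-many check and join could be expressed in pandas. The helper key_cardinality and the tiny sample frames are hypothetical and exist only for illustration; only the table and column names (customers, orders, customer_id, order_id) come from the output above.

```python
import pandas as pd


def key_cardinality(left, right, key):
    """Classify the relationship between two frames on a shared key column.

    Hypothetical helper for illustration; not part of the analyzed tool.
    """
    left_unique = left[key].is_unique
    right_unique = right[key].is_unique
    if left_unique and right_unique:
        return "one-to-one"
    if left_unique:
        return "one-to-many"
    if right_unique:
        return "many-to-one"
    return "many-to-many"


# Made-up sample rows standing in for the generated E-commerce tables.
customers = pd.DataFrame(
    {"customer_id": [1, 2, 3], "country": ["DE", "FR", "US"]}
)
orders = pd.DataFrame(
    {"order_id": [10, 11, 12, 13], "customer_id": [1, 1, 2, 3]}
)

relationship = key_cardinality(customers, orders, "customer_id")
print(relationship)

# validate="one_to_many" makes pandas raise MergeError if customer_id
# is not unique on the left (customers) side.
joined = customers.merge(
    orders, on="customer_id", how="inner", validate="one_to_many"
)
print(joined)
```

In a SQL engine, the closest standard form would use INNER JOIN (or LEFT JOIN, to keep customers without orders) with the same ON clause; which of the two the tool intends is an assumption, since the output does not say.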