EddeCCC/simple-data-pipeline

Simple Data Pipeline

Showcase for data pipelines with DuckDB, dbt, Streamlit and Prefect.

The pipeline is scheduled to run every minute via Prefect.

Each run creates fake users, cleans them and loads them into a DuckDB file. dbt then transforms the loaded data, and a Streamlit dashboard shows the final user data.
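The create, clean and load steps could look roughly like the sketch below. It is an illustration, not the code in this repository: the function names, the `raw_users` table, the user fields and the cleaning rules are all assumptions. Only the warehouse path comes from the commands further down. The dbt transformation is not covered here.

```python
import random


def create_fake_users(count, seed=None):
    """Generate raw user records that are deliberately messy (hypothetical fields)."""
    rng = random.Random(seed)
    first_names = ["Ada", "Linus", "Grace", "Alan"]
    users = []
    for i in range(count):
        name = rng.choice(first_names)
        users.append({
            "id": i + 1,
            # Random casing and padding simulate dirty input.
            "name": f"  {name.upper() if rng.random() < 0.5 else name} ",
            "email": f"{name}@Example.com ",
            # Some users have no age, so the cleaning step has rows to drop.
            "age": rng.choice([None, rng.randint(18, 90)]),
        })
    return users


def clean_users(users):
    """Normalize names and emails, and drop users without an age (assumed rules)."""
    cleaned = []
    for user in users:
        if user["age"] is None:
            continue
        cleaned.append({
            "id": user["id"],
            "name": user["name"].strip().title(),
            "email": user["email"].strip().lower(),
            "age": user["age"],
        })
    return cleaned


def load_users(users, db_path="database/warehouse.duckdb"):
    """Append cleaned users to a DuckDB table; `raw_users` is a hypothetical name."""
    import duckdb  # third-party, imported lazily so the other steps run without it

    rows = [(u["id"], u["name"], u["email"], u["age"]) for u in users]
    con = duckdb.connect(db_path)
    try:
        con.execute(
            "CREATE TABLE IF NOT EXISTS raw_users "
            "(id INTEGER, name VARCHAR, email VARCHAR, age INTEGER)"
        )
        if rows:  # skip the insert when there is nothing to load
            con.executemany("INSERT INTO raw_users VALUES (?, ?, ?, ?)", rows)
    finally:
        con.close()
```

A single run would then be `load_users(clean_users(create_fake_users(100)))`, which requires the `duckdb` package to be installed.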

Prerequisites

pip install -r requirements.txt

Commands

Run pipeline:

./run-pipeline.sh
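The contents of `run-pipeline.sh` are not shown here. As a minimal sketch of the every-minute schedule described above, a Prefect flow can be served on a 60-second interval. The names `run_pipeline`, `serve_every_minute` and the deployment name are hypothetical, and the sketch assumes a Prefect version that provides `Flow.serve`.

```python
INTERVAL_SECONDS = 60  # "every minute", as stated in the description


def run_pipeline():
    """Placeholder for the create, clean, load and dbt steps."""
    print("pipeline run")


def serve_every_minute():
    """Wrap run_pipeline in a Prefect flow and serve it on a fixed interval."""
    # Prefect is third-party; imported here so this module loads without it.
    from prefect import flow

    # serve() blocks and triggers a flow run every INTERVAL_SECONDS seconds.
    flow(run_pipeline).serve(
        name="simple-data-pipeline",  # hypothetical deployment name
        interval=INTERVAL_SECONDS,
    )
```

Calling `serve_every_minute()` starts a long-running process that schedules the runs; stop it with Ctrl+C.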

Sub-Commands

Create warehouse:

database\duckdb.exe database\warehouse.duckdb  

Dashboard:

streamlit run scripts\dashboard.py
