
AutoFlux-lite

An end-to-end workflow that takes an ML model from data ingestion through pre-processing and training to deployment.

For more in-depth docs on particular services, see ML and Ingestion & Transformation below.

Download the repository

The easiest way to get the repository onto your local machine is to run:

curl https://www.arjunrao.space/templates > temp.sh && bash temp.sh AutoFlux && rm -rf temp.sh

This is the lite version of the setup and is in active development.

For the Spark version (not in active development), switch to the main branch.

• ML
• Ingestion & Transformation

A very basic architecture of the environment setup using AutoFlux-lite is shown in the AutoFlux Lite Architecture diagram.

Overview of the architecture

This is a lightweight version of a larger architecture that originally involved Spark, Hive, PostgreSQL, and Delta Lake. Instead of relying on these heavyweight components, this version leverages DuckDB—an embedded OLAP database—for efficient data storage and processing while keeping the system low on power consumption and compute requirements.
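For readers who have not used DuckDB before: it runs in-process (there is no server to manage), stores the whole database in a single file, and is queried with plain SQL. A tiny, self-contained illustration (the file and table names are arbitrary, not part of this project):

import duckdb

con = duckdb.connect("warehouse.duckdb")                        # one file, no server process
con.execute("CREATE TABLE IF NOT EXISTS events AS SELECT 1 AS id, 'demo' AS label")
print(con.execute("SELECT count(*) FROM events").fetchone())    # plain SQL, runs in-process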

How It Works (Refer to the Architecture Diagram)

1. Transformation & Ingestion:

• Raw data is ingested and processed inside the dbt container.
• After transformation, the cleaned data is stored in DuckDB, acting as the shared storage layer.

2. Machine Learning Pipeline:

• The ML container fetches the transformed data from DuckDB (sketched in code after this list).
• Data is further cleaned and preprocessed inside the ML container.
• MLflow is used to:
  • Track experiments.
  • Log metrics and artifacts.
  • Store model versions for reproducibility.

3. Outputs:

• A model accuracy and experiment dashboard for evaluation.
• A final trained model artifact ready for deployment.
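To make steps 2 and 3 concrete, here is a minimal sketch of what the ML container might do. The DuckDB path, table and column names, MLflow tracking URI, and the use of scikit-learn are illustrative assumptions, not the project's actual code.

# Hypothetical sketch: fetch the dbt-built table from the shared DuckDB file,
# train a simple model, and log everything to MLflow. All names are placeholders.
import duckdb
import mlflow
import mlflow.sklearn
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import accuracy_score
from sklearn.model_selection import train_test_split

con = duckdb.connect("/data/autoflux.duckdb", read_only=True)   # shared storage layer
df = con.execute("SELECT * FROM transformed_features").df()     # table produced by dbt

X, y = df.drop(columns=["target"]), df["target"]
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=42)

mlflow.set_tracking_uri("http://mlflow:5000")                   # assumed service name and port
with mlflow.start_run():
    model = LogisticRegression(max_iter=1000).fit(X_train, y_train)
    acc = accuracy_score(y_test, model.predict(X_test))
    mlflow.log_metric("accuracy", acc)                          # shows up on the experiment dashboard
    mlflow.sklearn.log_model(model, "model")                    # versioned artifact for deployment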

This setup is ideal for environments with limited compute resources, making it accessible for local development, edge devices, and low-power machines while maintaining an efficient ML pipeline.

Step-by-Step Usage

Bring everything up

bash compose_build.sh

This will build and run every container.

Verify the Setup

Check that all containers are running:

docker ps

You should see all of the project's containers up and running.

Run transformation

docker exec -it transformation bash

You'll get a shell inside the transformation container where you can run dbt commands. By default, the seed command runs automatically when the container starts, and you can trigger the transformation process with dbt build, which executes everything (models, seeds, and tests).
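If you want to spot-check what dbt wrote, you can open the DuckDB file from a Python shell inside the container. The database path below is an assumption; use the path configured in the project's dbt profile.

import duckdb

# Path is a placeholder; point it at the DuckDB file from profiles.yml.
con = duckdb.connect("dev.duckdb", read_only=True)
print(con.execute("SHOW TABLES").fetchall())                    # seeds and models that dbt built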

Stopping the Containers

To stop the entire setup:

docker-compose down

To remove all volumes and networks:

docker-compose down -v
