An end-to-end workflow that takes an ML model from data ingestion through pre-processing and training to deployment.
For more in-depth docs on particular services:
The easiest way to bring the repository onto your local machine is to run this command:
curl https://www.arjunrao.space/templates > temp.sh && bash temp.sh AutoFlux && rm -rf temp.sh
This is the lite version of the setup and is in active development.
For the Spark version (not in active development), switch to the main branch.
A very basic architecture of the environment setup using AutoFlux-lite

This is a lightweight version of a larger architecture that originally involved Spark, Hive, PostgreSQL, and Delta Lake. Instead of relying on these heavyweight components, this version uses DuckDB, an embedded OLAP database, for data storage and processing, which keeps power consumption and compute requirements low.
• Raw data is ingested and processed inside the dbt container.
• After transformation, the cleaned data is stored in DuckDB, acting as the shared storage layer.

• The ML container fetches the transformed data from DuckDB.
• Data is further cleaned and preprocessed inside the ML container.
• MLflow is used to:
  • Track experiments.
  • Log metrics and artifacts.
  • Store model versions for reproducibility.

• A model accuracy and experiment dashboard for evaluation.
• A final trained model artifact ready for deployment.
This setup is ideal for environments with limited compute resources, making it accessible for local development, edge devices, and low-power machines while maintaining an efficient ML pipeline.
bash compose_build.sh
This builds and runs every container.
Check that all containers are running:
docker ps
You should see the project's containers listed as running, including transformation.
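For scripted setups, the same check can be automated. The sketch below is only an illustration: `missing_containers` is a hypothetical helper, not part of the repository, and `transformation` is the only container name this README confirms. It takes the running container names as text and prints any expected name that is absent.

```shell
#!/usr/bin/env bash
# Sketch: list expected containers that are absent from the running set.
# Only "transformation" is named in this README; any other name you pass
# is whatever your own compose file defines.

# Usage: missing_containers "<newline-separated running names>" expected...
missing_containers() {
  local running="$1"
  shift
  local name
  for name in "$@"; do
    # -x: match the whole line, -F: treat the name as a literal string
    if ! grep -qxF -- "$name" <<<"$running"; then
      printf '%s\n' "$name"
    fi
  done
}

# With Docker available, feed it the live container names:
#   missing_containers "$(docker ps --format '{{.Names}}')" transformation
```

Empty output means every expected container is up. Matching whole lines avoids false positives from names that merely share a prefix.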
docker exec -it transformation bash
This opens a shell inside the transformation container, where you can run dbt commands. By default, the seed command runs automatically, and you can trigger the transformation process with dbt build, which executes everything.
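If you would rather not open an interactive shell, dbt can also be invoked directly through docker exec. This is a minimal sketch under two assumptions: the container is named `transformation`, as in the command above, and its default working directory holds the dbt project. `run_dbt` and the `DOCKER_BIN` override are hypothetical names introduced here for illustration.

```shell
#!/usr/bin/env bash
# Sketch: run a dbt subcommand in the transformation container without
# opening an interactive shell.
# Assumptions: the container is named "transformation" and its default
# working directory holds the dbt project.
# DOCKER_BIN is a hypothetical override, handy for dry runs.
run_dbt() {
  "${DOCKER_BIN:-docker}" exec transformation dbt "$@"
}

# Examples (the real calls require the containers to be running):
#   run_dbt build                  # run the full dbt build
#   run_dbt run --select my_model  # "my_model" is a placeholder model name
#   DOCKER_BIN=echo run_dbt build  # dry run: prints the arguments only
```

Wrapping the call in a function keeps the container name in one place, so it only needs changing once if your compose file names the service differently.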
To stop the entire setup:
docker-compose down
To remove the volumes as well as the containers and networks:
docker-compose down -v