This repository provides a robust, experiment-ready training script for fine-tuning the Flux Fill (dev) model using Hugging Face Diffusers, Accelerate, and Weights & Biases (wandb). The script supports modular training, validation, experiment tracking, checkpointing, and reproducibility.
- Modular training and validation loops
- Experiment tracking and image logging with wandb
- Distributed and mixed-precision training with Accelerate
- Full checkpointing and resuming
- Configurable random seed for reproducibility
- CLI for all major training and experiment parameters
```
flux-fill-finetuning/
├── main.py              # Main training script
├── requirements.txt     # Python dependencies
├── checkpoints/         # Directory for saving model checkpoints
├── data/
│   ├── training/
│   │   ├── images/
│   │   ├── masks/
│   │   └── prompts/
│   └── validation/
│       ├── images/
│       ├── masks/
│       └── prompts/
```
Install the Python dependencies:

```bash
pip install -r requirements.txt
```

Set up Weights & Biases:

- Create a free account at wandb.ai.
- Log in from the command line (only needed once per machine):

  ```bash
  wandb login
  ```

- Paste your API key when prompted.

Configure Accelerate:

- Run the Accelerate config wizard (only needed once per machine):

  ```bash
  accelerate config
  ```

- Answer the prompts to match your hardware and preferences.
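Inside the script, distributed and mixed-precision handling follow the usual Accelerate pattern. The sketch below is illustrative only (the placeholder model, optimizer, and loader stand in for what `main.py` actually builds), and the actual precision mode normally comes from your `accelerate config` answers:

```python
import torch
import torch.nn as nn
from torch.utils.data import DataLoader, TensorDataset
from accelerate import Accelerator

# Tiny placeholders so this snippet runs stand-alone; main.py builds the real
# Flux Fill model, optimizer, and data loaders.
model = nn.Linear(8, 8)
optimizer = torch.optim.AdamW(model.parameters(), lr=1e-4)
loader = DataLoader(TensorDataset(torch.randn(16, 8)), batch_size=4)

# Mixed precision ("bf16"/"fp16"/"no") is usually taken from `accelerate config`,
# but can be set explicitly as shown here.
accelerator = Accelerator(mixed_precision="no")
model, optimizer, loader = accelerator.prepare(model, optimizer, loader)

for (batch,) in loader:
    loss = model(batch).mean()
    accelerator.backward(loss)  # replaces loss.backward() under Accelerate
    optimizer.step()
    optimizer.zero_grad()
```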
Prepare your data:

- Organize your data as follows:
  - `data/training/images/`, `data/training/masks/`, `data/training/prompts/`
  - `data/validation/images/`, `data/validation/masks/`, `data/validation/prompts/`
- Each image/mask should have a matching filename (e.g., `0001.png` and `0001.txt` for the prompt).
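The dataset class in `main.py` defines the actual pairing logic, but matching image, mask, and prompt files by filename stem generally looks like this minimal sketch (class name, folder layout defaults, and `.png` extension are assumptions):

```python
from pathlib import Path
from PIL import Image
from torch.utils.data import Dataset


class FillDataset(Dataset):
    """Pairs image, mask, and prompt files that share a filename stem (illustrative)."""

    def __init__(self, root="data/training"):
        self.root = Path(root)
        # One sample per image; the mask and prompt are looked up by the same stem.
        self.stems = sorted(p.stem for p in (self.root / "images").glob("*.png"))

    def __len__(self):
        return len(self.stems)

    def __getitem__(self, idx):
        stem = self.stems[idx]
        image = Image.open(self.root / "images" / f"{stem}.png").convert("RGB")
        mask = Image.open(self.root / "masks" / f"{stem}.png").convert("L")
        prompt = (self.root / "prompts" / f"{stem}.txt").read_text().strip()
        return {"image": image, "mask": mask, "prompt": prompt}
```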
Run the training script with your desired parameters:
```bash
python main.py \
  --wandb_project flux-fill-finetuning \
  --wandb_name my-experiment-1 \
  --epochs 10 \
  --batch_size 8 \
  --lr 1e-4 \
  --validation_epochs 1 \
  --save-epochs 2 \
  --seed 42
```

Arguments:
- `--wandb_project` (required): Name of the wandb project (umbrella for all runs)
- `--wandb_name` (required): Name for this specific experiment/run
- `--epochs`: Number of training epochs
- `--batch_size`: Batch size for training
- `--lr`: Learning rate
- `--validation_epochs`: How often (in epochs) to run validation
- `--save-epochs`: How many epochs between checkpoints (0 = only final)
- `--seed`: Random seed (optional; if not set, a random one is generated and printed)
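These flags map onto a standard `argparse` CLI. A sketch of how such a parser could be declared follows; the defaults shown simply mirror the example command and are not necessarily the script's own:

```python
import argparse


def parse_args():
    parser = argparse.ArgumentParser(description="Fine-tune Flux Fill (dev)")
    parser.add_argument("--wandb_project", required=True, help="wandb project (umbrella for all runs)")
    parser.add_argument("--wandb_name", required=True, help="Name for this specific experiment/run")
    parser.add_argument("--epochs", type=int, default=10, help="Number of training epochs")
    parser.add_argument("--batch_size", type=int, default=8, help="Training batch size")
    parser.add_argument("--lr", type=float, default=1e-4, help="Learning rate")
    parser.add_argument("--validation_epochs", type=int, default=1, help="Run validation every N epochs")
    parser.add_argument("--save-epochs", type=int, default=2,
                        help="Epochs between checkpoints (0 = only final)")
    parser.add_argument("--seed", type=int, default=None,
                        help="Random seed (a random one is generated if omitted)")
    return parser.parse_args()  # argparse exposes --save-epochs as args.save_epochs
```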
- Checkpoints are saved in the `checkpoints/` directory at the specified interval and always at the end.
- To resume from a checkpoint, you can extend the script to load from a saved `.pt` file using the provided `load_checkpoint` function.
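The exact signature of `load_checkpoint` is defined in `main.py`; a typical resume step built on the same idea looks roughly like the following. The checkpoint path, dict keys, and placeholder model/optimizer are assumptions, not the script's actual layout:

```python
import torch
import torch.nn as nn

# Placeholders so the sketch is self-contained; in practice these are the
# Flux Fill model and optimizer constructed in main.py.
model = nn.Linear(4, 4)
optimizer = torch.optim.AdamW(model.parameters(), lr=1e-4)

# Assumed checkpoint layout: a dict with "model", "optimizer", and "epoch" keys.
state = torch.load("checkpoints/epoch_8.pt", map_location="cpu")
model.load_state_dict(state["model"])
optimizer.load_state_dict(state["optimizer"])
start_epoch = state.get("epoch", 0) + 1  # resume training from the next epoch
```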
- All training/validation metrics and sample images are logged to wandb.
- You can view and compare runs at wandb.ai.
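For reference, metric and image logging with wandb generally follows this pattern; the project/run names are taken from the example command above, and the metric keys are illustrative rather than the exact ones `main.py` uses:

```python
import wandb
from PIL import Image

wandb.init(project="flux-fill-finetuning", name="my-experiment-1")

# Scalar metrics are logged as a dict keyed by metric name.
wandb.log({"train/loss": 0.123, "epoch": 1})

# Validation samples can be logged as wandb.Image objects with captions.
sample = Image.new("RGB", (64, 64))
wandb.log({"validation/sample": wandb.Image(sample, caption="prompt: a red chair")})
```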
- The script uses `drop_last=False` for DataLoaders by default. If your model requires fixed batch sizes, you can set `drop_last=True` in the script.
- For best reproducibility, always set a seed with `--seed`.
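Both knobs correspond to standard PyTorch and Accelerate usage; a minimal sketch (the toy dataset and values are placeholders):

```python
import torch
from torch.utils.data import DataLoader, TensorDataset
from accelerate.utils import set_seed

# set_seed seeds Python, NumPy, and torch in one call (Accelerate utility).
set_seed(42)

# drop_last=True discards the final, smaller batch when the dataset size
# is not divisible by the batch size.
dataset = TensorDataset(torch.randn(10, 3))
loader = DataLoader(dataset, batch_size=8, shuffle=True, drop_last=True)
for (batch,) in loader:
    print(batch.shape)  # only full batches of 8 are yielded
```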
This project is intended for research and educational purposes.