GenAI codebase practice from Sheng Wang ([email protected])
Implementations of various GenAI models. Most of them have only been trained for one epoch.
Currently includes:
- Diffusion models (DDPM, DDIM, latent diffusion, CFG, DiT)
- Reinforcement learning (REINFORCE, A2C, GAE, PPO, DPO, GRPO)
- Multi-modal models (LLaVA, BLIP, BLIP-2, CLIP)
- Pre-training (ViT, MAE, BYOL, BEiT, iBOT, SimMIM, Transformer, MoE, Swin Transformer)
- Generative models (VAE, UNet, beta-VAE, GAN, VQ-VAE)
- Refactor trainer.py
- Prepare tiny shakespeare: /scripts/data/prepare_shakespeare.py. GPT2 tiktoken, 50257 vocab size, 334646 train tokens, 3380 val tokens (see the data-preparation sketch after this list)
- Implement text_datasets.py
- Implement GPT2 and transformer block
- Add Wandb
- Add early stop
- Test GPT2 implementation (check char-level BPE)
- Add unified generator
- Add cosine scheduler (sketch after this list)
- Wrap up GPT2 on tiny shakespeare
- Build LLaVA model
- Build COCO data loader
- Re-organize LLaVA codebase
- Re-split COCO validation data (the full dataset is too large, so we only use the validation set)
- Finish COCO data loader
- Merge LLaVA and GPT2 so they share the same trainer
- Start LLaVA stage 1 training
- Checkpoint only saves the projector and skips the optimizer state (sketch after this list)
- Debug and test LLaVA stage 1 (loss and generation are both reasonable)
- Manually check stage 1 generation for checkpoints with different val losses
- LLaVA stage 2: only unfreeze the last 8 layers of the LLM (sketch after this list)
- LLaVA stage 2 data loader and loss
- Train and wrap up stage 2 on COCO
- Download TinyImageNet for ViT and SwinTransformer
- LoRA (sketch after this list)
- Build TinyImageNet dataloader (dict instead of tuple/list)
- Build ViT pipeline
- Take label/loss/metrics out of all forward functions
- Use torchmetrics instead of sklearn (compute epoch-level accuracy instead of batch-level; sketch after this list)
- Set up pre-commit to avoid uploading very large files
- Add strong augmentation for ViT
- Add mixup collate and loss (in the dataloader, not the trainer; sketch after this list)
- Accelerate data augmentation (Kornia TODO)
- Train and debug ViT on TinyImageNet
- Wrap up ViT on TinyImageNet
- Implement CutMix (sketch after this list)
- Refactor and test LLaVA codebase
- Refactor and test GPT2 codebase
- Prepare Imagenet100
- Prepare MAE codebase
- Debug and test MAE on TinyImageNet
- Kick off MAE training on ImageNet100
- Add KV cache (sketch after this list)
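
The sketches below correspond to the items marked "(sketch after this list)" above. They are minimal, unverified illustrations, not the repo's exact code.

A possible shape for /scripts/data/prepare_shakespeare.py. The GPT2 tiktoken encoding and 50257 vocab size come from the log above; the download URL, output paths, and ~99/1 split ratio are assumptions:

```python
# Sketch of /scripts/data/prepare_shakespeare.py (URL, paths, and split ratio are assumptions)
import os
import numpy as np
import requests
import tiktoken

DATA_DIR = "data/shakespeare"                    # assumed output directory
URL = ("https://raw.githubusercontent.com/karpathy/char-rnn/"
       "master/data/tinyshakespeare/input.txt")  # commonly used mirror, assumed

os.makedirs(DATA_DIR, exist_ok=True)
text = requests.get(URL, timeout=30).text

enc = tiktoken.get_encoding("gpt2")              # GPT-2 BPE, vocab size 50257
ids = enc.encode_ordinary(text)                  # encode without special tokens

split = int(0.99 * len(ids))                     # assumed ~99/1 train/val split
train_ids = np.array(ids[:split], dtype=np.uint16)   # 50257 fits in uint16
val_ids = np.array(ids[split:], dtype=np.uint16)

train_ids.tofile(os.path.join(DATA_DIR, "train.bin"))
val_ids.tofile(os.path.join(DATA_DIR, "val.bin"))
print(f"train tokens: {len(train_ids):,}, val tokens: {len(val_ids):,}")
```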
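
For the cosine scheduler item, a generic warmup-plus-cosine-decay schedule; the warmup length, LR bounds, and how it is wired into the trainer are assumptions:

```python
import math

def cosine_lr(step, *, max_lr, min_lr, warmup_steps, total_steps):
    """Linear warmup followed by cosine decay from max_lr to min_lr."""
    if step < warmup_steps:
        return max_lr * (step + 1) / warmup_steps
    if step >= total_steps:
        return min_lr
    progress = (step - warmup_steps) / (total_steps - warmup_steps)
    coeff = 0.5 * (1.0 + math.cos(math.pi * progress))   # decays from 1 to 0
    return min_lr + coeff * (max_lr - min_lr)

# Hypothetical usage inside the training loop:
# lr = cosine_lr(step, max_lr=3e-4, min_lr=3e-5, warmup_steps=200, total_steps=5000)
# for group in optimizer.param_groups:
#     group["lr"] = lr
```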
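
For "checkpoint only saves the projector, skip optimizer", a sketch of the save/load logic; the `projector.` key prefix and checkpoint layout are assumptions:

```python
import torch

def save_stage1_checkpoint(model, path, step):
    # Keep only the multimodal projector weights; the frozen vision tower and LLM,
    # and the optimizer state, are intentionally skipped to keep checkpoints small.
    projector_state = {
        k: v.cpu() for k, v in model.state_dict().items() if k.startswith("projector.")
    }
    torch.save({"step": step, "projector": projector_state}, path)

def load_stage1_checkpoint(model, path):
    ckpt = torch.load(path, map_location="cpu")
    model.load_state_dict(ckpt["projector"], strict=False)  # only projector keys present
    return ckpt["step"]
```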
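
For stage 2, a sketch of unfreezing only the last 8 LLM layers; the attribute path `model.llm.transformer.h` (GPT-2-style block list) and keeping the projector trainable are assumptions:

```python
def configure_stage2_trainable(model, num_layers=8):
    # Freeze the whole LLM first.
    for p in model.llm.parameters():
        p.requires_grad = False
    # Unfreeze only the last `num_layers` transformer blocks.
    for block in model.llm.transformer.h[-num_layers:]:
        for p in block.parameters():
            p.requires_grad = True
    # Projector stays trainable in stage 2 (assumption).
    for p in model.projector.parameters():
        p.requires_grad = True
```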
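
For the LoRA item, a sketch of a LoRA-wrapped linear layer (frozen base weight plus a trainable low-rank update); the rank, alpha, and which layers it wraps are assumptions:

```python
import math
import torch
import torch.nn as nn

class LoRALinear(nn.Module):
    """y = W x + (alpha / r) * B A x, with W frozen and A, B trainable."""
    def __init__(self, base: nn.Linear, r: int = 8, alpha: int = 16):
        super().__init__()
        self.base = base
        for p in self.base.parameters():
            p.requires_grad = False
        self.lora_A = nn.Parameter(torch.empty(r, base.in_features))
        self.lora_B = nn.Parameter(torch.zeros(base.out_features, r))  # zero init: no update at start
        nn.init.kaiming_uniform_(self.lora_A, a=math.sqrt(5))
        self.scaling = alpha / r

    def forward(self, x):
        return self.base(x) + (x @ self.lora_A.T @ self.lora_B.T) * self.scaling
```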
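
For the torchmetrics item, a sketch of epoch-level accuracy: the metric accumulates over all batches and is computed once per epoch, instead of averaging per-batch accuracies. The dict batch format and 200 TinyImageNet classes follow the items above; the rest is assumed:

```python
import torch
import torchmetrics

@torch.no_grad()
def evaluate(model, val_loader, device, num_classes=200):
    acc = torchmetrics.Accuracy(task="multiclass", num_classes=num_classes).to(device)
    model.eval()
    for batch in val_loader:                           # batches are dicts (see dataloader item)
        logits = model(batch["image"].to(device))
        acc.update(logits, batch["label"].to(device))  # accumulate counts across batches
    return acc.compute().item()                        # one epoch-level accuracy
```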
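
For "mixup collate and loss (in the dataloader, not the trainer)", a sketch of a collate_fn that mixes the batch plus the matching loss; the dict keys and the Beta(alpha, alpha) parameter are assumptions:

```python
import numpy as np
import torch
import torch.nn.functional as F

def mixup_collate(batch, alpha=0.2):
    # batch: list of dicts {"image": Tensor[C, H, W], "label": int} (assumed format)
    images = torch.stack([b["image"] for b in batch])
    labels = torch.tensor([b["label"] for b in batch])
    lam = float(np.random.beta(alpha, alpha))
    perm = torch.randperm(images.size(0))
    mixed = lam * images + (1.0 - lam) * images[perm]
    return {"image": mixed, "label_a": labels, "label_b": labels[perm], "lam": lam}

def mixup_loss(logits, batch):
    # Convex combination of the two cross-entropies, weighted by the mixing coefficient.
    return (batch["lam"] * F.cross_entropy(logits, batch["label_a"])
            + (1.0 - batch["lam"]) * F.cross_entropy(logits, batch["label_b"]))
```

The same mixup_loss can be reused for the CutMix sketch below, since both produce (label_a, label_b, lam) batches.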
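
For the CutMix item, a matching collate sketch that pastes a random box from a shuffled copy of the batch and rescales lambda to the actual box area; same assumed dict format as above:

```python
import numpy as np
import torch

def cutmix_collate(batch, alpha=1.0):
    images = torch.stack([b["image"] for b in batch])
    labels = torch.tensor([b["label"] for b in batch])
    lam = float(np.random.beta(alpha, alpha))
    perm = torch.randperm(images.size(0))
    _, _, h, w = images.shape
    cut_h, cut_w = int(h * (1 - lam) ** 0.5), int(w * (1 - lam) ** 0.5)
    cy, cx = np.random.randint(h), np.random.randint(w)
    y1, y2 = max(cy - cut_h // 2, 0), min(cy + cut_h // 2, h)
    x1, x2 = max(cx - cut_w // 2, 0), min(cx + cut_w // 2, w)
    images[:, :, y1:y2, x1:x2] = images[perm, :, y1:y2, x1:x2]
    lam = 1.0 - (y2 - y1) * (x2 - x1) / (h * w)   # fraction of pixels kept from the original
    return {"image": images, "label_a": labels, "label_b": labels[perm], "lam": lam}
```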
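
For the KV-cache item, a sketch of how cached keys/values can be threaded through one attention call during autoregressive decoding; the tensor layout and the one-new-token-per-step decoding assumption are not from the repo:

```python
import torch
import torch.nn.functional as F

def attend_with_cache(q, k, v, past_kv=None):
    """q, k, v: [batch, heads, new_tokens, head_dim]; past_kv: (k_cache, v_cache) or None."""
    if past_kv is not None:
        k = torch.cat([past_kv[0], k], dim=2)   # append new keys to the cache
        v = torch.cat([past_kv[1], v], dim=2)   # append new values to the cache
    # Prefill (no cache) needs a causal mask; incremental decoding feeds one new
    # query token at a time, so attending over all cached keys is already causal.
    out = F.scaled_dot_product_attention(q, k, v, is_causal=past_kv is None)
    return out, (k, v)                          # updated cache for the next step
```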