Full fine-tuning now supports gpt-oss models. This release also includes minor bug fixes that keep loss calculations correct when training with a larger number of gradient accumulation steps.
What's Changed
- Disable workflow runs on forks by default by @fynnsu in #632
- Adding GPT OSS Support by @Maxusmusti in #646
- Update numpy from <2.0 to <2.3 by @Maxusmusti in #656
- Add kernels>0.9.0 to CUDA requirements by @Maxusmusti in #658
Full Changelog: v0.11.1...v0.12.0