Nikki gutonosa

Hi there, I'm Guton Osa

About This Repository

This repository consolidates my research and academic projects into a single place.

Projects

Awesome_SFT-RLVR_Mechanism

A curated, regularly updated paper list on the roles and mechanisms of SFT (Supervised Fine-Tuning) and RL (Reinforcement Learning) in training LLMs to reason. Topics covered include:

Comparison of mechanisms between RLVR (Reinforcement Learning with Verifiable Rewards) and SFT
The entropy mechanism in RLVR
GRPO (Group Relative Policy Optimization) in RLVR: flaws and corrections
Exploration-exploitation optimization in GRPO
Hybrid SFT-RL training

Academic-Homepage

A personal academic portfolio website built with Jekyll and GitHub Pages. Contains publication records, talks, teaching materials, and other academic information.
