gutonosa/README.md

Hi there, I'm Guton Osa

About This Repository

This repository consolidates my research and academic projects into a single place.

Projects

A curated, regularly updated list of papers on the roles and mechanisms of supervised fine-tuning (SFT) and reinforcement learning (RL) in training LLMs to reason. Topics covered include:

  • How the mechanisms of RLVR (RL with verifiable rewards) and SFT compare
  • The entropy mechanism in RLVR
  • GRPO (Group Relative Policy Optimization) in RLVR: flaws and corrections
  • Exploration-exploitation optimization in GRPO
  • Hybrid SFT-RL training

A personal academic portfolio website built with Jekyll and GitHub Pages. It contains publication records, talks, teaching materials, and other academic information.

Popular repositories

  1. gutonosa (Public)

    Config files for my GitHub profile.

    HTML

  2. k6 (Public)

    Forked from grafana/k6

    A modern load testing tool, using Go and JavaScript - https://k6.io

    Go

  3. foundry (Public)

    Forked from foundry-rs/foundry

    Foundry is a blazing fast, portable and modular toolkit for Ethereum application development written in Rust.

    Rust