Release plan for the 1.7k CodeContest SFT data? #57

@huchenz1

Hi team,

Great work on TraceRL and the TraDo models!

In Appendix B.5, you mentioned using 1.7k random SFT samples from CodeContest (generated by Qwen2.5-32B-Instruct) for the cold start.

Are there any plans to open-source this specific 1.7k SFT dataset? I'm very interested in this data, particularly to see exactly how the EOS token was formatted and appended to stabilize the subsequent RL training.

It would be super helpful for reproducing the cold start and RL pipeline.

Thanks!
