torch.compile for training #4

@parthchadha

Description

Is your feature request related to a problem? Please describe.

We should leverage torch.compile for both training and the forward pass (fprop).

Describe the solution you'd like

Ideally, we should be able to apply torch.compile to all calls into the HF model (train, get_logprobs). The compilation cost should be paid once, at init.
FYI: there is a known issue with torch.compile + FSDP + BF16; torch.compile was failing with this combination on torch 2.5.1.
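
A minimal sketch of the "compile once at init" idea, to make the request concrete. It is not the project's implementation: `TinyLM` is a toy stand-in for an HF causal LM, `CompiledPolicy` and its `backend` and `example_ids` parameters are hypothetical names, and it does not involve FSDP or BF16. Since torch.compile is lazy and only compiles on the first call, the sketch runs a warm-up in `__init__` for both the gradient-enabled and the no-grad path.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F


class TinyLM(nn.Module):
    """Toy stand-in for an HF causal LM (assumption): token ids -> logits."""

    def __init__(self, vocab_size: int = 32, hidden: int = 16):
        super().__init__()
        self.embed = nn.Embedding(vocab_size, hidden)
        self.head = nn.Linear(hidden, vocab_size)

    def forward(self, input_ids: torch.Tensor) -> torch.Tensor:
        return self.head(self.embed(input_ids))


class CompiledPolicy:
    """Hypothetical wrapper: compile in __init__, reuse for train and get_logprobs."""

    def __init__(self, model, example_ids, backend="inductor", lr=1e-3):
        self.model = model
        self.optimizer = torch.optim.SGD(model.parameters(), lr=lr)
        # torch.compile returns a lazy wrapper; compilation happens on first call.
        self.compiled = torch.compile(model, backend=backend)
        # Trigger compilation now so the cost is paid at init, not mid-training.
        self._warmup(example_ids)

    def _warmup(self, example_ids):
        # Exercise the training path (forward + backward) ...
        self._token_logprobs(example_ids).sum().backward()
        self.optimizer.zero_grad(set_to_none=True)
        # ... and the no-grad path used by get_logprobs.
        with torch.no_grad():
            self._token_logprobs(example_ids)

    def _token_logprobs(self, input_ids):
        # Log-probability of each next token given the preceding tokens.
        logits = self.compiled(input_ids[:, :-1])
        logprobs = F.log_softmax(logits, dim=-1)
        targets = input_ids[:, 1:].unsqueeze(-1)
        return logprobs.gather(-1, targets).squeeze(-1)

    def get_logprobs(self, input_ids):
        with torch.no_grad():
            return self._token_logprobs(input_ids)

    def train(self, input_ids):
        # One optimizer step on the mean negative log-likelihood.
        loss = -self._token_logprobs(input_ids).mean()
        self.optimizer.zero_grad(set_to_none=True)
        loss.backward()
        self.optimizer.step()
        return loss.item()
```

The warm-up only covers the shape of `example_ids`; inputs with different shapes may still trigger recompilation later, so a real setup would likely need representative warm-up batches or dynamic-shape handling.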

Describe alternatives you've considered

Thunder compiler: https://github.com/Lightning-AI/lightning-thunder/ (we should investigate whether Thunder offers performance benefits over native torch.compile)