This repository was archived by the owner on Oct 9, 2024. It is now read-only.

Sharding a model checkpoint for DeepSpeed usage #39

@CoderPat

Hey!
I'm using a custom version of this repo to run BLOOM-176B with DeepSpeed, and it works great. Thank you for this!
I'm now thinking of exploring other large models (such as OPT-175B) and was wondering what the process is for creating a pre-sharded, int8 DeepSpeed checkpoint for them, similar to https://huggingface.co/microsoft/bloom-deepspeed-inference-int8.
Is there any documentation or are there example scripts for this?
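
For reference, here is a minimal sketch of what I assume the process looks like, based on DeepSpeed-Inference's `save_mp_checkpoint_path` option to `init_inference`. The model name, shard count, and output path are placeholders, and I'm not sure this is exactly how the official int8 checkpoint was produced:

```python
# Sketch (unverified): pre-sharding a HF checkpoint with DeepSpeed-Inference.
# Model name, mp_size, and output path are placeholders, not a confirmed recipe.
import deepspeed
import torch
from transformers import AutoModelForCausalLM

# Load the fp16 weights once; DeepSpeed shards them during init_inference.
model = AutoModelForCausalLM.from_pretrained(
    "facebook/opt-66b", torch_dtype=torch.float16
)

# With save_mp_checkpoint_path set, init_inference writes the tensor-parallel
# shards to disk; later runs can load the pre-sharded checkpoint directly
# instead of re-sharding the full model every time.
engine = deepspeed.init_inference(
    model,
    mp_size=8,                        # number of tensor-parallel shards
    dtype=torch.int8,                 # quantized inference weights
    replace_with_kernel_inject=True,  # use DeepSpeed's fused inference kernels
    save_mp_checkpoint_path="/path/to/opt-presharded-int8",
)
```

I'd expect this to be launched under the DeepSpeed launcher (e.g. `deepspeed --num_gpus 8 preshard.py`) so that each rank writes its own shard, but please correct me if the actual workflow is different.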
