Safetensors conversion #2290
base: master
Conversation
Thanks for the PR, will take a look in a bit :)
Thanks! Just left some initial comments.
```python
# Save model
hf_model.save_pretrained(path, safe_serialization=True)
print(f"Model and tokenizer saved to {path}")
```
Let's try to make this look more like a library util (which is the eventual intent). No print statements. Just expose `export_to_hf` in this file. Make a separate test file that does what is happening below, in a unit test annotated with `pytest.mark.large`, that converts and compares outputs.
Let's add a unit test that calls this util and tries loading the result with transformers to see if it works. OK to add transformers to our CI environment here: https://github.com/keras-team/keras-hub/blob/master/requirements-common.txt
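A minimal sketch of what that test could look like (the `export_to_hf` import path, preset name, and assertions are assumptions for illustration, not the PR's final code):

```python
import pytest
import torch
from transformers import AutoModelForCausalLM

import keras_hub
from keras_hub.src.tests.test_case import TestCase

# Hypothetical import path; the util's final home is still under discussion.
from keras_hub.src.utils.transformers.export import export_to_hf


class TestGemmaExport(TestCase):
    @pytest.mark.large
    def test_export_and_reload_with_transformers(self):
        keras_model = keras_hub.models.GemmaCausalLM.from_preset("gemma_2b_en")
        path = self.get_temp_dir()
        export_to_hf(keras_model, path)

        # Loading succeeds only if the safetensors file, config, and
        # tokenizer written by the util are all well-formed.
        hf_model = AutoModelForCausalLM.from_pretrained(path)

        # Sanity-check the converted weights with a forward pass.
        token_ids = torch.tensor([[2, 651, 6037]])  # arbitrary token ids
        logits = hf_model(token_ids).logits
        self.assertEqual(logits.shape[-1], hf_model.config.vocab_size)
```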
```python
from safetensors.torch import save_file
```

```python
# Set the Keras backend to jax
os.environ["KERAS_BACKEND"] = "jax"
```
Let's not do this; this is something we are going to export as part of the library. We actually need this to be able to run on all backends.
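A backend-agnostic way to pull weights out (a sketch, not the PR's code) is `keras.ops.convert_to_numpy`, which behaves the same under JAX, TensorFlow, and PyTorch, so the util never has to pin `KERAS_BACKEND`:

```python
import keras


def get_numpy_weights(keras_model):
    # Variable.path gives a stable per-weight key; convert_to_numpy works
    # on all three backends, so no backend needs to be forced here.
    return {
        v.path: keras.ops.convert_to_numpy(v) for v in keras_model.weights
    }
```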
```python
import os

import torch
from safetensors.torch import save_file
```
Does this work on all backends? Or do we need to flip between versions depending on the backend? Worth testing out.
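If the torch flavor turns out to be backend-sensitive, `safetensors` also ships a NumPy API that avoids torch entirely; a sketch under that assumption (`weights_dict_numpy` holding plain NumPy arrays):

```python
from safetensors.numpy import save_file

# No torch dependency: works the same no matter which Keras backend is
# active, as long as the values are NumPy arrays.
save_file(weights_dict_numpy, "model.safetensors")
```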
… into safetensors_conversion: merge updated branch
Nice! Please address the changes from the earlier PR as well
```python
class TestGemmaExport(TestCase):
    @pytest.fixture(autouse=True)
```
You can remove this and use `self.get_temp_dir()` in the test.
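For example (a sketch; `export_to_hf` and `self.keras_model` assumed from the surrounding test):

```python
def test_export(self):
    # get_temp_dir() returns a per-test temporary directory, so no
    # autouse fixture is needed for setup or teardown.
    path = self.get_temp_dir()
    export_to_hf(self.keras_model, path)
```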
```python
    keras_model: The Keras Gemma model (e.g., GemmaCausalLM) to convert.
    path (str): Path to save the model.safetensors, config, and tokenizer.

```
Remove extra newline.
```python
    path (str): Path to save the model.safetensors, config, and tokenizer.


    This function converts a Keras Gemma model to Hugging Face format by:
```
This description should live above the args section.
Thanks, nice work!
```python
    Args:
        keras_model: The Keras Gemma model (e.g., GemmaCausalLM) to convert.
        path (str): Path to save the model.safetensors, config, and tokenizer.
```
Following the format we use for the rest of the library:
`path: str. Path of the directory to which the safetensors file, config and tokenizer will be saved.`
```python
# Tie lm_head.weight to embedding weights
weights_dict["lm_head.weight"] = weights_dict[
    "model.embed_tokens.weight"
].clone()
```
This doesn't need to be transposed?
Weights are tied for Gemma.
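For context (my reading, not stated in the thread): Gemma's token embedding has shape `(vocab_size, hidden_dim)`, and `torch.nn.Linear` stores its weight as `(out_features, in_features)`, which for `lm_head` is also `(vocab_size, hidden_dim)`, so the tied tensor copies over without a transpose:

```python
import torch

vocab_size, hidden_dim = 256000, 2048  # Gemma-2B-like sizes, for illustration
embed = torch.nn.Embedding(vocab_size, hidden_dim)
lm_head = torch.nn.Linear(hidden_dim, vocab_size, bias=False)

# Both weights are (vocab_size, hidden_dim): tying is a plain copy.
assert embed.weight.shape == lm_head.weight.shape
lm_head.weight = embed.weight
```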
```python
# Make tensors contiguous before saving
weights_dict_contiguous = {
    k: v.contiguous() for k, v in weights_dict.items()
}
```
Is this necessary? Won't this potentially 2x the memory?
Yes, it will use more memory, but making the tensors contiguous is necessary; it throws a ValueError while saving otherwise:

ValueError: You are trying to save a non contiguous tensor: `model.layers.0.self_attn.q_proj.weight` which is not allowed. It either means you are trying to save tensors which are reference of each other in which case it's recommended to save only the full tensors, and reslice at load time, or simply call `.contiguous()` on your tensor to pack it before saving.
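Worth noting (not from the thread): `Tensor.contiguous()` returns `self` for tensors that are already contiguous, so only the non-contiguous entries are actually copied. Rebinding in place avoids holding a second full dict of references (a sketch):

```python
# Only non-contiguous tensors get copied; contiguous ones are returned
# as-is. Rebinding in place lets the old references be dropped early.
for k, v in weights_dict.items():
    weights_dict[k] = v.contiguous()
```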
```python
    return hf_config


def export_to_hf(keras_model, path):
```
We should add the API export decorator here, similar to this: https://github.com/keras-team/keras-hub/blob/master/keras_hub/src/models/bloom/bloom_backbone.py#L15-L16
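Following the linked example, that could look like the sketch below (the exported symbol path `keras_hub.utils.export_to_hf` is a guess, not settled API):

```python
from keras_hub.src.api_export import keras_hub_export


# The namespace below is an assumption; the maintainers may prefer another.
@keras_hub_export("keras_hub.utils.export_to_hf")
def export_to_hf(keras_model, path):
    ...
```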
Also, do you think we should refactor some of the common code across models to a separate file? We can then expose that as the API. This is how the directory `keras_hub/src/utils/transformers/convert_to_safetensor/` would look:

- `export.py`: this will have the common code. We will expose this as the API. It will also check whether we support safetensors conversion for a given model yet.
- `gemma.py`: this will just have a way to create the weight dictionary for Gemma. Inside `export.py`, we will call the weight conversion function specific to the passed model (see the sketch below).

Pinging @mattdangerw to confirm if we should do this now or at a later point.
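A minimal sketch of that dispatch (module, registry, and function names are assumptions for illustration):

```python
# export.py: common entry point that dispatches to per-model converters.
from keras_hub.src.utils.transformers.convert_to_safetensor import gemma

# Hypothetical registry mapping backbone class names to converters.
_CONVERTERS = {
    "GemmaBackbone": gemma.get_weights_dict,
}


def export_to_hf(keras_model, path):
    name = keras_model.backbone.__class__.__name__
    if name not in _CONVERTERS:
        raise ValueError(f"Safetensors conversion not supported for {name}.")
    weights_dict = _CONVERTERS[name](keras_model)
    # ... write model.safetensors, config, and tokenizer into `path`.
```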
I think we could land this and do the API bit at a later point. Though I agree it's an important concern. I'm not sure if we want a method like `model.save_to_preset()` or a function like `some_export(model)`. Any thoughts?
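For concreteness, the two shapes being weighed (both calls hypothetical):

```python
# Option A: a method on the model, mirroring save_to_preset().
model.export_to_hf("path/to/export")

# Option B: a free function exposed from a utils namespace.
keras_hub.utils.export_to_hf(model, "path/to/export")
```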
Description of the change
Reference
Colab Notebook: https://colab.research.google.com/drive/1naqf0sO2J40skndWbVMeQismjL7MuEjd?usp=sharing
Checklist