Safetensors conversion #2290
base: master
Conversation
Thanks for the PR, will take a look in a bit :)
Thanks! Just left some initial comments.
```python
# Save model
hf_model.save_pretrained(path, safe_serialization=True)
print(f"Model and tokenizer saved to {path}")
```
Let's try to make this look more like a library util (which is the eventual intent). No print statements. Just expose `export_to_hf` in this file. Make a separate test file that does what is happening below, in a unit test annotated with `pytest.mark.large`, that converts and compares outputs.
Let's add a unit test that calls this util and tries loading the result with transformers to see if it works. OK to add transformers to our CI environment here: https://github.com/keras-team/keras-hub/blob/master/requirements-common.txt
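A minimal sketch of what that test could look like (the `export_to_hf` import path, preset name, and assertions are assumptions for illustration, not the PR's final code):

```python
import pytest
import torch
from transformers import AutoModelForCausalLM

import keras_hub
from keras_hub.src.tests.test_case import TestCase

# Hypothetical import path; the util's final home is still under discussion.
from keras_hub.src.utils.transformers.export import export_to_hf


class TestGemmaExport(TestCase):
    @pytest.mark.large
    def test_export_and_reload_with_transformers(self):
        keras_model = keras_hub.models.GemmaCausalLM.from_preset("gemma_2b_en")
        path = self.get_temp_dir()
        export_to_hf(keras_model, path)

        # Loading succeeds only if the safetensors file, config, and
        # tokenizer written by the util are all well-formed.
        hf_model = AutoModelForCausalLM.from_pretrained(path)

        # Sanity-check the converted weights with a forward pass.
        token_ids = torch.tensor([[2, 651, 6037]])  # arbitrary token ids
        logits = hf_model(token_ids).logits
        self.assertEqual(logits.shape[-1], hf_model.config.vocab_size)
```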
```python
from safetensors.torch import save_file
```

```python
# Set the Keras backend to jax
os.environ["KERAS_BACKEND"] = "jax"
```
Let's not do this; this is something we are going to export as part of the library. We actually need this to be able to run on all backends.
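A backend-agnostic way to pull weights out (a sketch, not the PR's code) is `keras.ops.convert_to_numpy`, which behaves the same under JAX, TensorFlow, and PyTorch, so the util never has to pin `KERAS_BACKEND`:

```python
import keras


def get_numpy_weights(keras_model):
    # Variable.path gives a stable per-weight key; convert_to_numpy works
    # on all three backends, so no backend needs to be forced here.
    return {
        v.path: keras.ops.convert_to_numpy(v) for v in keras_model.weights
    }
```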
```python
import os

import torch
from safetensors.torch import save_file
```
Does this work on all backends? Or do we need to flip between versions depending on the backend? Worth testing out.
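If the torch flavor turns out to be backend-sensitive, `safetensors` also ships a NumPy API that avoids torch entirely; a sketch under that assumption (`weights_dict_numpy` holding plain NumPy arrays):

```python
from safetensors.numpy import save_file

# No torch dependency: works the same no matter which Keras backend is
# active, as long as the values are NumPy arrays.
save_file(weights_dict_numpy, "model.safetensors")
```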
… into safetensors_conversion: merge updated branch
Nice! Please address the changes from the earlier PR as well
```python
class TestGemmaExport(TestCase):
    @pytest.fixture(autouse=True)
```
You can remove this and use `self.get_temp_dir()` in the test.
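For example (a sketch; `export_to_hf` and `self.keras_model` assumed from the surrounding test):

```python
def test_export(self):
    # get_temp_dir() returns a per-test temporary directory, so no
    # autouse fixture is needed for setup or teardown.
    path = self.get_temp_dir()
    export_to_hf(self.keras_model, path)
```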
```python
    keras_model: The Keras Gemma model (e.g., GemmaCausalLM) to convert.
    path (str): Path to save the model.safetensors, config, and tokenizer.

```
Remove extra newline.
```python
    path (str): Path to save the model.safetensors, config, and tokenizer.


    This function converts a Keras Gemma model to Hugging Face format by:
```
This description should live above the args section.
Thanks, nice work!
```python
    Args:
        keras_model: The Keras Gemma model (e.g., GemmaCausalLM) to convert.
        path (str): Path to save the model.safetensors, config, and tokenizer.
```
Following the format we use for the rest of the library:
`path: str. Path of the directory to which the safetensors file, config and tokenizer will be saved.`
```python
# Tie lm_head.weight to embedding weights
weights_dict["lm_head.weight"] = weights_dict[
    "model.embed_tokens.weight"
].clone()
```
This doesn't need to be transposed?
Weights are tied for Gemma.
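For context (my reading, not stated in the thread): Gemma's token embedding has shape `(vocab_size, hidden_dim)`, and `torch.nn.Linear` stores its weight as `(out_features, in_features)`, which for `lm_head` is also `(vocab_size, hidden_dim)`, so the tied tensor copies over without a transpose:

```python
import torch

vocab_size, hidden_dim = 256000, 2048  # Gemma-2B-like sizes, for illustration
embed = torch.nn.Embedding(vocab_size, hidden_dim)
lm_head = torch.nn.Linear(hidden_dim, vocab_size, bias=False)

# Both weights are (vocab_size, hidden_dim): tying is a plain copy.
assert embed.weight.shape == lm_head.weight.shape
lm_head.weight = embed.weight
```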
```python
# Make tensors contiguous before saving
weights_dict_contiguous = {
    k: v.contiguous() for k, v in weights_dict.items()
}
```
Is this necessary? Won't this potentially 2x the memory?
Yes, it will use more memory, but making the tensors contiguous is necessary; it throws a ValueError while saving otherwise:

ValueError: You are trying to save a non contiguous tensor: `model.layers.0.self_attn.q_proj.weight` which is not allowed. It either means you are trying to save tensors which are reference of each other in which case it's recommended to save only the full tensors, and reslice at load time, or simply call `.contiguous()` on your tensor to pack it before saving.
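Worth noting (not from the thread): `Tensor.contiguous()` returns `self` for tensors that are already contiguous, so only the non-contiguous entries are actually copied. Rebinding in place avoids holding a second full dict of references (a sketch):

```python
# Only non-contiguous tensors get copied; contiguous ones are returned
# as-is. Rebinding in place lets the old references be dropped early.
for k, v in weights_dict.items():
    weights_dict[k] = v.contiguous()
```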
```python
    return hf_config


def export_to_hf(keras_model, path):
```
We should add the API export decorator here, similar to this: https://github.com/keras-team/keras-hub/blob/master/keras_hub/src/models/bloom/bloom_backbone.py#L15-L16
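Following the linked example, that could look like the sketch below (the exported symbol path `keras_hub.utils.export_to_hf` is a guess, not settled API):

```python
from keras_hub.src.api_export import keras_hub_export


# The namespace below is an assumption; the maintainers may prefer another.
@keras_hub_export("keras_hub.utils.export_to_hf")
def export_to_hf(keras_model, path):
    ...
```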
Also, do you think we should refactor some of the common code across models to a separate file? We can then expose that as the API. This is how the directory `keras_hub/src/utils/transformers/convert_to_safetensor/` would look:

- `export.py`: this will have the common code. We will expose this as the API. It will also check whether we support safetensors conversion for a given model yet.
- `gemma.py`: this will just have a way to create the weight dictionary for Gemma. Inside `export.py`, we will call the weight conversion function specific to the passed model (see the sketch below).

Pinging @mattdangerw to confirm if we should do this now or at a later point.
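A minimal sketch of that dispatch (module, registry, and function names are assumptions for illustration):

```python
# export.py: common entry point that dispatches to per-model converters.
from keras_hub.src.utils.transformers.convert_to_safetensor import gemma

# Hypothetical registry mapping backbone class names to converters.
_CONVERTERS = {
    "GemmaBackbone": gemma.get_weights_dict,
}


def export_to_hf(keras_model, path):
    name = keras_model.backbone.__class__.__name__
    if name not in _CONVERTERS:
        raise ValueError(f"Safetensors conversion not supported for {name}.")
    weights_dict = _CONVERTERS[name](keras_model)
    # ... write model.safetensors, config, and tokenizer into `path`.
```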
I think we could land this and do the API bit at a later point. Though I agree it's an important concern. I'm not sure if we want a method like `model.save_to_preset()` or a function like `some_export(model)`. Any thoughts?
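For concreteness, the two shapes being weighed (both calls hypothetical):

```python
# Option A: a method on the model, mirroring save_to_preset().
model.export_to_hf("path/to/export")

# Option B: a free function exposed from a utils namespace.
keras_hub.utils.export_to_hf(model, "path/to/export")
```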
Description of the change
Reference
Colab Notebook: https://colab.research.google.com/drive/1naqf0sO2J40skndWbVMeQismjL7MuEjd?usp=sharing
Checklist