Support Loading Quantized Models with from_preset()
#2367
fd28a15 to 1b07517
Thanks!
88e2cec to 430d7b9
430d7b9 to 58dfab9
Resolved comments
/gemini review
Code Review
This pull request effectively addresses an issue with loading quantized models from presets by introducing a `_resolve_dtype` utility function and ensuring `dtype` policies are correctly serialized. The changes are logical and well-tested. I have a couple of minor suggestions to fix a test assertion message and improve docstring formatting to align with the style guide.
lgtm! just a couple nits
0161fb9 to 7eb8f1e
Thanks!
Description of the change
This change resolves an issue with loading quantized models from presets. Previously, the model's serialized `DTypePolicyMap` was not correctly passed to the backbone during loading, which caused failures during initialization of quantized layers.

The fix introduces a new `_resolve_dtype` utility function that determines the correct `dtype` for the model based on the following rules:

- User-specified `dtype`: If a user explicitly provides a `dtype` in the `from_preset` call (e.g., `from_preset("bert_tiny_en_uncased", num_classes=2, dtype="float32")`), that value is used.
- Float type casting: If no user `dtype` is provided and the saved `dtype` is a floating-point type (e.g., "float32"), the model will be loaded using the current Keras default `dtype` policy. This allows for safe casting between different floating-point precisions.
- `DTypePolicyMap`: If no user `dtype` is provided and the saved `dtype` is a complex object (like a `DTypePolicyMap` for quantization), the saved type is used as is. This ensures that quantization configurations are preserved during loading.

Colab Notebook
Checklist