Model.quantize() increases memory allocated. #21518

@pass-lin

The code is:

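import os

# Select the backend before importing Keras or KerasHub.
os.environ["KERAS_BACKEND"] = "torch"

import torch
from keras_hub.models import Qwen3Backbone

# Load the Qwen3-8B backbone from its ModelScope preset.
model_name = "Qwen/Qwen3-8B"
model = Qwen3Backbone.from_preset("modelscope://" + model_name)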
Image

Next, after calling Model.quantize():

Image

Why does Model.quantize() increase GPU memory usage, and is this the expected behavior?
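Since the screenshots are not reproduced here, the following is a minimal sketch of how the before/after numbers could be collected in text form. It reports both the memory held by live tensors (`torch.cuda.memory_allocated`) and the memory held by PyTorch's caching allocator (`torch.cuda.memory_reserved`), because tools such as `nvidia-smi` show roughly the latter. The helper names `cuda_memory_mib` and `measure` are hypothetical, and the `"int8"` mode in the usage comment is an assumption: the issue does not say which mode was passed to `quantize()`.

```python
import gc

try:
    import torch
except ImportError:  # keeps the helpers importable on machines without PyTorch
    torch = None


def cuda_memory_mib():
    """Return (allocated, reserved) CUDA memory in MiB, or None without a GPU."""
    if torch is None or not torch.cuda.is_available():
        return None
    gc.collect()                # drop unreachable Python references to tensors
    torch.cuda.empty_cache()    # release cached blocks that are no longer in use
    torch.cuda.synchronize()    # wait for pending kernels before reading counters
    mib = 1024 ** 2
    return (
        torch.cuda.memory_allocated() / mib,  # memory occupied by live tensors
        torch.cuda.memory_reserved() / mib,   # memory held by the caching allocator
    )


def measure(step):
    """Run step() and return its result plus CUDA memory before and after."""
    before = cuda_memory_mib()
    result = step()
    after = cuda_memory_mib()
    return result, before, after


# Hypothetical usage with the model loaded above (the mode is an assumption):
# _, before, after = measure(lambda: model.quantize("int8"))
# print("before (allocated, reserved) MiB:", before)
# print("after  (allocated, reserved) MiB:", after)
```

If the allocated figure drops after quantization while the reserved figure stays high, the increase seen in an external monitor may come from cached blocks rather than from the quantized weights themselves. If the allocated figure itself grows, that points at tensors still being referenced after the call.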
