Model.quantize() increases memory allocated. #21518

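The code is:
```python
import os

# The backend must be selected before Keras is imported.
os.environ["KERAS_BACKEND"] = "torch"

import torch
from keras_hub.models import Qwen3Backbone

model_name = "Qwen/Qwen3-8B"
# Load the Qwen3-8B backbone from ModelScope.
model = Qwen3Backbone.from_preset("modelscope://" + model_name)
```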

<img width="413" height="108" alt="Image" src="https://github.com/user-attachments/assets/4024ab0e-d199-4b26-87a1-ba8abd5adbd4" />
  
Next:

<img width="459" height="511" alt="Image" src="https://github.com/user-attachments/assets/5fa7a64a-caf1-4bba-ae04-77d252ccd095" />

Why does `Model.quantize()` increase GPU memory usage, and is this the expected behavior?
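
For anyone reproducing this, here is a minimal sketch of how the before/after comparison could be made repeatable. The helper name `measure_step` and its `read_bytes` parameter are hypothetical and not part of Keras or PyTorch; the helper only wraps whatever memory probe is passed in, so it assumes that probe returns a byte count.

```python
import gc
from typing import Callable, Dict


def measure_step(
    step: Callable[[], object],
    read_bytes: Callable[[], int],
) -> Dict[str, int]:
    """Run `step` once and report the memory reading before and after it.

    `read_bytes` is any zero-argument callable returning a byte count
    (hypothetical parameter; on a CUDA device this could be
    `torch.cuda.memory_allocated`).
    """
    gc.collect()  # drop unreachable Python objects before the first reading
    before = read_bytes()
    step()
    gc.collect()  # release temporaries created by `step` before re-reading
    after = read_bytes()
    return {"before": before, "after": after, "delta": after - before}


# Assumed usage with the model from this issue (requires a CUDA device):
#   import torch
#   stats = measure_step(
#       lambda: model.quantize("int8"),
#       torch.cuda.memory_allocated,
#   )
#   print(stats)
```

Note that `torch.cuda.memory_allocated()` counts only memory held by live tensors, while `torch.cuda.memory_reserved()` also includes blocks kept by PyTorch's caching allocator, so the two probes can give different pictures of the same step.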
