Why extract multiple zero_point / scale values from the same quantization group? #214

frankxyy · 2023-12-06T08:25:55Z

Hi, I was reading the GPTQ reconstruct code and found this line:

It seems that four zero_point values are read from the matrix for the same group. According to the theory of GPTQ quantization, isn't there only one zero_point and one scale value per group?

turboderp · 2023-12-06T12:24:59Z

Maintainer

Each thread does four columns, and each column has its own scale and offset. It's faster this way: it loads four offsets as one uint16, four scales as two half2s, and 4 x 8 weights as an int4.
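To make the layout above concrete, here is a minimal Python sketch of what one thread's work might look like for a single group. It is not the actual kernel: the nibble order (lowest four bits first), the helper names `unpack_nibbles`, `pack_nibbles` and `dequantize_four_columns`, and the formula `scale * (q - zero)` are all assumptions made for illustration. The point is only that the four zero points and four scales belong to four different columns, and that four 4-bit offsets fit in 16 bits while 4 x 8 4-bit weights fit in 128 bits.

```python
# Illustrative sketch only. Packing order and the dequantization formula are
# assumptions, not taken from the real CUDA kernel.

def unpack_nibbles(packed, count):
    """Split an integer into `count` 4-bit values, lowest nibble first."""
    return [(packed >> (4 * i)) & 0xF for i in range(count)]

def pack_nibbles(values):
    """Inverse of unpack_nibbles: pack 4-bit values into one integer."""
    packed = 0
    for i, v in enumerate(values):
        packed |= (v & 0xF) << (4 * i)
    return packed

def dequantize_four_columns(packed_zeros, scales, packed_weights):
    """
    packed_zeros:   one 16-bit value holding four 4-bit zero points,
                    one per column (the "uint16" load).
    scales:         four floats, one per column (the "two half2s" load).
    packed_weights: four 32-bit values, each holding eight 4-bit weights
                    of one column (together, the 128-bit "int4" load).
    Returns four lists of eight dequantized weights, one list per column.
    """
    zeros = unpack_nibbles(packed_zeros, 4)
    out = []
    for col in range(4):
        q = unpack_nibbles(packed_weights[col], 8)
        out.append([scales[col] * (w - zeros[col]) for w in q])
    return out

# Hypothetical data: four columns that share one group but have
# different zero points and scales.
zeros = [8, 7, 9, 8]
scales = [0.5, 0.25, 1.0, 2.0]
weights = [[(row + col) % 16 for row in range(8)] for col in range(4)]

result = dequantize_four_columns(
    pack_nibbles(zeros),
    scales,
    [pack_nibbles(w) for w in weights],
)
```

Reading the three packed values once per thread, instead of one column at a time, is what the wider loads buy; the per-column arithmetic is unchanged.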


frankxyy Dec 6, 2023
Author

Got it! Thanks a lot
