
DQ_int4-to-bf16_dequant

INT4 dequantization to BF16 for models like moonshotai/Kimi-K2-Thinking

Inspired by and based on the DeepSeek V3 FP8-to-BF16 dequantizer: https://huggingface.co/deepseek-ai/DeepSeek-V3-Base/blob/main/inference/fp8_cast_bf16.py
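
As a rough illustration of what the converter does per tensor, here is a minimal sketch of INT4-to-BF16 dequantization. It assumes weights packed two signed 4-bit values per byte with a per-group scale and no zero-point tensor; the nibble order, group size, and function name are illustrative assumptions, not the script's actual layout:

```python
import torch

def dequant_int4_to_bf16(packed: torch.Tensor, scale: torch.Tensor,
                         group_size: int = 32) -> torch.Tensor:
    """Dequantize uint8-packed int4 weights to bfloat16.

    packed: uint8 tensor, two 4-bit values per byte.
    scale:  one scale per contiguous group of `group_size` values.
    """
    # Split each byte into its low and high 4-bit halves.
    low = (packed & 0x0F).to(torch.int16)
    high = (packed >> 4).to(torch.int16)
    # Map the unsigned nibble range [0, 15] onto signed int4 [-8, 7]
    # (assumes two's-complement int4 encoding).
    low = torch.where(low >= 8, low - 16, low)
    high = torch.where(high >= 8, high - 16, high)
    # Assume the low nibble comes first; interleave back to full width.
    ints = torch.stack((low, high), dim=-1).flatten(start_dim=-2)
    # Apply one scale per contiguous group of `group_size` values.
    groups = ints.reshape(*ints.shape[:-1], -1, group_size).to(torch.float32)
    out = groups * scale.to(torch.float32).unsqueeze(-1)
    return out.flatten(start_dim=-2).to(torch.bfloat16)
```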

Usage

```
usage: int4_cast_bf16_fixed.py [-h] --input-int4-hf-path INPUT_INT4_HF_PATH
                               --output-bf16-hf-path OUTPUT_BF16_HF_PATH
int4_cast_bf16_fixed.py: error: the following arguments are required: --input-int4-hf-path, --output-bf16-hf-path
```
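
A typical invocation looks like this (both paths are placeholders):

```
python int4_cast_bf16_fixed.py \
  --input-int4-hf-path /path/to/Kimi-K2-Thinking \
  --output-bf16-hf-path /path/to/Kimi-K2-Thinking-BF16
```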

NOTE: generate_index.py was added as a temporary workaround because the first version did not generate the safetensors index JSON file. The conversion script now generates it itself.
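
For reference, generating that index amounts to mapping every tensor name to the shard file that holds it. A minimal sketch using the safetensors public API (the function name is illustrative, and using file size as a stand-in for summed tensor bytes is a simplifying assumption):

```python
import json
import os
from glob import glob

from safetensors import safe_open

def write_index(model_dir: str) -> None:
    """Write model.safetensors.index.json for a sharded checkpoint."""
    weight_map, total_size = {}, 0
    for shard in sorted(glob(os.path.join(model_dir, "*.safetensors"))):
        # File size approximates the summed tensor bytes that the
        # Hugging Face index records in metadata.total_size.
        total_size += os.path.getsize(shard)
        with safe_open(shard, framework="pt") as f:
            for name in f.keys():
                weight_map[name] = os.path.basename(shard)
    index = {"metadata": {"total_size": total_size}, "weight_map": weight_map}
    with open(os.path.join(model_dir, "model.safetensors.index.json"), "w") as fp:
        json.dump(index, fp, indent=2)
```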

safetensors_diff.py

Debug utility I've used to compare the original and converted safetensors side by side

Usage

```
python safetensors_diff.py <file>             # Show file contents
python safetensors_diff.py <file1> <file2>    # Diff two files
```
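
The heart of such a diff is comparing tensor names, dtypes, and shapes between the two files. A minimal sketch of that comparison (the actual script may report more detail than this):

```python
import sys

from safetensors import safe_open

def summarize(path: str) -> dict:
    """Map tensor name -> (dtype, shape) without loading tensor data."""
    with safe_open(path, framework="pt") as f:
        return {name: (f.get_slice(name).get_dtype(),
                       tuple(f.get_slice(name).get_shape()))
                for name in f.keys()}

a, b = summarize(sys.argv[1]), summarize(sys.argv[2])
for name in sorted(set(a) | set(b)):
    if a.get(name) != b.get(name):
        print(f"{name}: {a.get(name)} vs {b.get(name)}")
```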

TEST-PROOF

Converted moonshotai/Kimi-K2-Thinking to BF16, then converted that to GGUF and quantized it to Q3; the Q3 GGUF seems to work: kimi-think-proof
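
For reference, that downstream step follows the usual llama.cpp flow, roughly like this (file names are placeholders, and the exact Q3 preset used for the test is an assumption):

```
python convert_hf_to_gguf.py /path/to/Kimi-K2-Thinking-BF16 \
  --outfile kimi-k2-thinking-bf16.gguf --outtype bf16
llama-quantize kimi-k2-thinking-bf16.gguf kimi-k2-thinking-q3.gguf Q3_K_M
```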

Zero-shot Hexa-Ball test with Kimi-K2-Thinking Q3: Kimi-Think_Hexa-Ball_test
