Quantization Specification

Version: v1.2.3 | Status: Active | Last Updated: March 2026

Overview

Provides Int8 and FP4 (4-bit floating-point) quantization for neural network weights. Int8 quantization supports symmetric and asymmetric schemes; FP4 values are stored nibble-packed, two per byte. Error metrics are included for assessing quantization quality.
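
To illustrate the difference between the symmetric and asymmetric Int8 schemes, here is a minimal pure-Python sketch of per-tensor affine quantization. It is not the module's implementation: the helper names (`sketch_int8_params`, `sketch_quantize`, `sketch_dequantize`), the target range of [-128, 127], and the rounding behavior are all assumptions made for illustration.

```python
# Illustrative sketch only; these helpers are hypothetical and are not part
# of codomyrmex.quantization.

def sketch_int8_params(values, scheme="asymmetric"):
    """Return (scale, zero_point) for mapping floats onto signed 8-bit integers."""
    lo, hi = min(values), max(values)
    if scheme == "symmetric":
        # Symmetric: the float value 0.0 maps to integer 0, and the range
        # [-max|x|, +max|x|] is spread over [-127, 127].
        max_abs = max(abs(lo), abs(hi))
        scale = max_abs / 127 if max_abs > 0 else 1.0
        return scale, 0
    # Asymmetric: spread [lo, hi] (widened to include 0.0) over [-128, 127].
    lo, hi = min(lo, 0.0), max(hi, 0.0)
    scale = (hi - lo) / 255 if hi > lo else 1.0
    zero_point = round(-128 - lo / scale)
    return scale, zero_point


def sketch_quantize(values, scale, zero_point):
    """Scale, shift by the zero-point, round, and clamp to the int8 range."""
    return [max(-128, min(127, round(v / scale) + zero_point)) for v in values]


def sketch_dequantize(quantized, scale, zero_point):
    """Invert the affine mapping; the result approximates the original floats."""
    return [(q - zero_point) * scale for q in quantized]


weights = [-1.0, 0.0, 0.5, 3.0]
scale, zero_point = sketch_int8_params(weights, scheme="asymmetric")
restored = sketch_dequantize(
    sketch_quantize(weights, scale, zero_point), scale, zero_point
)
```

The asymmetric scheme uses the full integer range when the weights are skewed toward one sign, at the cost of storing a zero-point; the symmetric scheme needs only the scale.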

Functional Requirements

Symmetric and asymmetric Int8 quantization with per-tensor scale and zero-point
FP4 quantization with nibble-packed storage (8x compression ratio relative to 32-bit floats)
Quantization error metrics: mean squared error (MSE), maximum absolute error, and signal-to-quantization-noise ratio (SQNR)
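
The nibble-packed storage named in the requirements above can be sketched as follows. This example covers only the packing of 4-bit codes into bytes; how the module maps floats to FP4 codes is not specified here, so that step is left out. The function names and the low-nibble-first byte layout are assumptions.

```python
# Illustrative sketch only; hypothetical helpers, not the module's FP4Quantizer.

def pack_nibbles(codes):
    """Pack 4-bit codes (ints in 0..15) two per byte, low nibble first."""
    if any(not 0 <= c <= 15 for c in codes):
        raise ValueError("each code must fit in 4 bits")
    # Pad odd-length input with a zero code so every byte holds two nibbles.
    padded = list(codes) + [0] * (len(codes) % 2)
    return bytes(lo | (hi << 4) for lo, hi in zip(padded[0::2], padded[1::2]))


def unpack_nibbles(packed, count):
    """Recover the first `count` 4-bit codes from the packed bytes."""
    codes = []
    for byte in packed:
        codes.append(byte & 0x0F)  # low nibble
        codes.append(byte >> 4)    # high nibble
    return codes[:count]


codes = [1, 15, 7, 0, 9]
packed = pack_nibbles(codes)
assert unpack_nibbles(packed, len(codes)) == codes
```

With an even number of values, n codes occupy n/2 bytes, compared with 4n bytes for 32-bit floats, which is where the 8x ratio comes from. The original element count has to be stored alongside the bytes so that padding can be dropped on unpacking.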

Interface

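from codomyrmex.quantization import quantize_int8, dequantize_int8, quantize_fp4, dequantize_fp4, quantization_error

# `weights` holds the float weights to quantize and is supplied by the caller.
qt = quantize_int8(weights, scheme="asymmetric")  # quantize to Int8
reconstructed = dequantize_int8(qt)  # map the quantized values back to floats
error = quantization_error(weights, reconstructed)  # compare original and reconstruction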
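
For reference, the three error metrics can be computed as in the sketch below. It is a hedged illustration, not the body of `quantization_error`: the function name, the dictionary return shape, and the choice to report SQNR in decibels as 10 * log10(signal power / noise power) are assumptions.

```python
import math

# Illustrative sketch only; the real quantization_error may return a
# different structure or use a different SQNR convention.

def sketch_error_metrics(original, reconstructed):
    """Return MSE, max absolute error, and SQNR (dB) for two float sequences."""
    if not original or len(original) != len(reconstructed):
        raise ValueError("inputs must be non-empty and of equal length")
    diffs = [o - r for o, r in zip(original, reconstructed)]
    mse = sum(d * d for d in diffs) / len(diffs)
    max_abs_error = max(abs(d) for d in diffs)
    signal_power = sum(o * o for o in original) / len(original)
    if mse == 0:
        sqnr_db = math.inf   # perfect reconstruction
    elif signal_power == 0:
        sqnr_db = -math.inf  # all-zero signal with nonzero noise
    else:
        sqnr_db = 10 * math.log10(signal_power / mse)
    return {"mse": mse, "max_abs_error": max_abs_error, "sqnr_db": sqnr_db}


metrics = sketch_error_metrics([1.0, 2.0], [1.5, 2.0])
```

A higher SQNR means less quantization noise relative to the signal; MSE and max absolute error complement it by showing average and worst-case deviation.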

Exports

Int8Quantizer, QuantizedTensor, quantize_int8, dequantize_int8, FP4Quantizer, FP4Tensor, quantize_fp4, dequantize_fp4, compute_scale_zero_point, per_channel_scale, quantization_error

Navigation

Source README | AGENTS.md
