@@ -43,26 +43,28 @@

 class QuantizationMixin(HooksMixin):
     """
-    Mixin which enables a Modifier to act as a quantization config, attching observers,
+    Mixin which enables a Modifier to act as a quantization config, attaching observers,
     calibration hooks, and compression wrappers to modifiers

     Lifecycle:
-    - on_initialize: QuantizationMixin.initialize_quantization
-        - Attach schemes to modules
-        - Attach observers to modules
-        - Disable quantization until calibration starts/finishes
-    - on_start: QuantizationMixin.start_calibration
-        - Attach calibration hooks
-        - Apply calibration status
-        - Enable quantization during calibration
-    - on_end: QuantizationMixin.end_calibration
-        - Remove calibration hooks
-        - Apply freeze status
-        - Keep quantization enabled for future steps
-    NOTE: QuantizationMixin does not update scales and zero-points on its own,
-    as this is not desired for all Modifiers inheriting from it. Modifier must
-    explicitly call `update_weight_zp_scale`.
-    See QuantizationModifier.on_start method for example
+
+    - on_initialize: QuantizationMixin.initialize_quantization
+        - Attach schemes to modules
+        - Attach observers to modules
+        - Disable quantization until calibration starts/finishes
+    - on_start: QuantizationMixin.start_calibration
+        - Attach calibration hooks
+        - Apply calibration status
+        - Enable quantization during calibration
+    - on_end: QuantizationMixin.end_calibration
+        - Remove calibration hooks
+        - Apply freeze status
+        - Keep quantization enabled for future steps
+
+    NOTE: QuantizationMixin does not update scales and zero-points on its own,
+    as this is not desired for all Modifiers inheriting from it. Modifier must
+    explicitly call `update_weight_zp_scale`.
+    See QuantizationModifier.on_start method for example

     :param config_groups: dictionary specifying quantization schemes to apply to target
         modules. Modules not matching a scheme target will NOT be quantized.
@@ -85,7 +87,7 @@ class QuantizationMixin(HooksMixin):
         the kv_cache_scheme gets converted into a QuantizationScheme that:
         - targets the `q_proj` and `k_proj` modules of the model. The outputs
           of those modules are the keys and values that might be cached
-        - quantizes the outputs of the aformentioned layers, so that
+        - quantizes the outputs of the aforementioned layers, so that
           keys and values are compressed before storing them in the cache
         There is an explicit assumption that the model contains modules with
         `k_proj` and `v_proj` in their names. If this is not the case
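The lifecycle documented above can be illustrated with a minimal stand-in. This is a hypothetical sketch, not the real mixin: `FakeModule` and `QuantizationMixinSketch` are invented names, and the string attributes merely stand in for the actual schemes, observers, and hooks that the library attaches to `torch.nn.Module` objects. It only demonstrates the ordering and the state each phase leaves behind (note that quantization stays enabled after `end_calibration`, matching the "Keep quantization enabled for future steps" bullet).

```python
class FakeModule:
    """Stand-in for a torch.nn.Module carrying quantization state (hypothetical)."""

    def __init__(self):
        self.scheme = None
        self.observer = None
        self.status = "uninitialized"
        self.quantization_enabled = False
        self.calibration_hooks = []


class QuantizationMixinSketch:
    """Hypothetical sketch of the three lifecycle phases from the docstring."""

    def initialize_quantization(self, module):
        # on_initialize: attach scheme and observer, keep quantization off
        module.scheme = "W8A8"        # placeholder for a real QuantizationScheme
        module.observer = "minmax"    # placeholder for a real observer
        module.quantization_enabled = False

    def start_calibration(self, module):
        # on_start: attach calibration hooks, mark calibration, enable quantization
        module.calibration_hooks.append("input_observer_hook")
        module.status = "calibration"
        module.quantization_enabled = True

    def end_calibration(self, module):
        # on_end: remove hooks and freeze; quantization stays enabled
        module.calibration_hooks.clear()
        module.status = "frozen"


m = FakeModule()
mixin = QuantizationMixinSketch()
mixin.initialize_quantization(m)
mixin.start_calibration(m)
mixin.end_calibration(m)
```

Per the NOTE in the docstring, a real Modifier built on this mixin would additionally have to call `update_weight_zp_scale` itself (e.g. during its `on_start`), since the mixin never updates scales and zero-points on its own.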