finalize: rename --qat-convert to --quantize (unified QAT + PTQ) #43
Merged
The convert pipeline in finalize re-installs fake quantizers on the loaded float weights before running torchao's convert step, which means it works equally well on plain bf16 models (standard PTQ) and on QAT-trained models (the full QAT round-trip). The flag name no longer matched its behaviour.

Rename to `--quantize <recipe>` and document both modes under the unified flag. Doc updates promote the previous "Behavior on Models Without QAT" footnote to a first-class "PTQ Mode" section in qat-training.md and spell out the AMP / PTQ / QAT comparison setup.

No backwards-compat shim for the old flag name.

Closes #40.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
- Quick Start no longer references --safetensors, so the parenthetical pointing at Save Format reads as orphaned. Replace with a direct forward-reference to the Save Format section.
- "(none — skip quantize)" reads awkwardly because quantize is a verb. "(none — no quantization)" scans cleanly.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
This was referenced May 14, 2026
Review pass complete
Two review agents ran in parallel. Both said no blockers.
Applied in commit ca80b4e
Filed as follow-ups
Skipped with reasoning
Three subsections walking users through the memory / compute / accuracy tradeoffs, per-recipe QAT-vs-PTQ expectations, and a workflow ordering that minimizes wasted training cycles (bf16 baseline -> PTQ -> QAT only if PTQ proved insufficient).

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Summary
- `forgather finalize --qat-convert <recipe>` → `--quantize <recipe>`. The convert pipeline already re-installs fake quantizers on the loaded float weights, so it works on any source; the old name no longer matched the behaviour.
- Both modes are documented under the unified flag: QAT round-trip (source trained with `--qat-recipe`, keeps the QAT accuracy benefit) and PTQ (plain bf16 source, standard post-training quantization).
- The "Behavior on Models Without QAT" footnote in `docs/trainers/qat-training.md` is promoted to a first-class PTQ Mode section with its own example.

Closes #40.
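A minimal sketch of what the renamed flag could look like at the argument-parsing level, assuming the CLI is built on `argparse`. Only the flag name `--quantize`, the `quantize_recipe` dest, the `--dry-run` flag, the recipe name and the dry-run message come from this PR; `build_parser`, `describe`, and the one-element recipe list are hypothetical and are not Forgather's actual code.

```python
import argparse

# Hypothetical recipe list: only this one recipe name appears in the PR text.
QUANTIZE_RECIPES = ["int8-dynamic-act-int4-weight"]


def build_parser() -> argparse.ArgumentParser:
    """Build a stand-in parser for `forgather finalize` (illustrative only)."""
    parser = argparse.ArgumentParser(prog="forgather finalize", allow_abbrev=False)
    parser.add_argument(
        "--quantize",
        dest="quantize_recipe",  # dest name taken from the rename surface
        metavar="RECIPE",
        choices=QUANTIZE_RECIPES,
        default=None,
        help="Quantize the finalized model (QAT round-trip or PTQ).",
    )
    parser.add_argument("--dry-run", action="store_true")
    # Deliberately no --qat-convert alias: the PR ships without a compat shim.
    return parser


def describe(args: argparse.Namespace) -> str:
    """Return the message a dry run would print for the parsed arguments."""
    if args.quantize_recipe is None:
        return "No quantization requested"
    return f"Would run quantize step with recipe '{args.quantize_recipe}'"


if __name__ == "__main__":
    parsed = build_parser().parse_args(
        ["--dry-run", "--quantize", "int8-dynamic-act-int4-weight"]
    )
    print(describe(parsed))
```

Because no alias is registered, passing `--qat-convert` to this parser fails with an "unrecognized arguments" error, which matches the "no backwards-compat shim" decision.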
Why this is a rename, not a new flag
PR #39 verification surfaced that Forgather's sharded checkpoint saver returns plain float weights from `FakeQuantizedLinear` (scale/zp inner state is non-persistent). The fix in #39 made `_apply_qat_convert` always run `prepare → convert` on the loaded weights, which incidentally also produces a valid PTQ artifact when the source was plain bf16. Issue #40 was reframed from "enable PTQ" to "fix the flag name". Single flag, two modes, documented as such.

Marker-file infrastructure (proposed in #41 for load-time autodetection) is deferred: it is not needed for this rename and would couple #40 to the eval/inference issues.
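To illustrate why running prepare and then convert on loaded float weights works for either source, here is a toy, pure-Python sketch. It is not torchao's API and not Forgather's code: `prepare`, `convert`, `dequantize`, and the symmetric int4 scheme with a single scale are assumptions chosen for brevity. The point it shows is that the scale is derived from whatever float weights are loaded, so nothing from training needs to be persisted.

```python
from dataclasses import dataclass
from typing import List

QMIN, QMAX = -8, 7  # signed int4 range (illustrative choice)


@dataclass
class FakeQuantized:
    """Float weights plus a scale derived from them (the 'prepare' state)."""
    weights: List[float]
    scale: float


@dataclass
class Quantized:
    """Integer codes plus the scale needed to dequantize (the 'convert' result)."""
    codes: List[int]
    scale: float


def prepare(weights: List[float]) -> FakeQuantized:
    # Recompute the scale from the loaded floats; no saved quantizer state is used.
    max_abs = max(abs(w) for w in weights)
    scale = max_abs / QMAX if max_abs > 0 else 1.0
    return FakeQuantized(list(weights), scale)


def convert(fq: FakeQuantized) -> Quantized:
    # Round each weight to the nearest integer code and clamp to the int4 range.
    codes = [max(QMIN, min(QMAX, round(w / fq.scale))) for w in fq.weights]
    return Quantized(codes, fq.scale)


def dequantize(q: Quantized) -> List[float]:
    return [c * q.scale for c in q.codes]


def quantize_loaded_weights(weights: List[float]) -> Quantized:
    """Same path for QAT-trained and plain float weights: prepare, then convert."""
    return convert(prepare(weights))


if __name__ == "__main__":
    q = quantize_loaded_weights([0.7, -0.35, 0.1, 0.0])
    print(q.codes, dequantize(q))
```

In this toy model the only difference between the two modes is where the float weights came from: QAT-trained weights have already adapted to the rounding, so they are expected to lose less accuracy, while plain weights take the full rounding error.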
Rename surface
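| Surface | Old | New |
|---|---|---|
| CLI flag | `--qat-convert` | `--quantize` |
| `dest` | `qat_convert` | `quantize_recipe` |
| `job_params` key | `qat_convert` | `quantize` |
| | `qat_convert` | `quantize` |
| `finalize_model.py` fn | `_apply_qat_convert` | `_apply_quantize` |
| | `qatConvert` / `QAT_CONVERT_RECIPES` / "QAT Convert" | `quantize` / `QUANTIZE_RECIPES` / "Quantize" |

Test plan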
- `forgather finalize --help` shows `--quantize`; `--qat-convert` is gone.
- `grep -rn "qat[-_]convert\|qatConvert\|QAT_CONVERT" src/ tools/ templatelib/ docs/` returns no hits.
- Dry run (`--dry-run --quantize int8-dynamic-act-int4-weight`) on a real model prints "Would run quantize step with recipe 'int8-dynamic-act-int4-weight'".
- … `IntxUnpackedToInt8Tensor` weights on disk (the new documented PTQ path).
- `./build-webui.sh` succeeds; FinalizeModal dropdown labeled "Quantize".
- … `--quantize <recipe>` (user-side check).

🤖 Generated with Claude Code
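The stale-name grep from the test plan can also be expressed as a small script, for example to run as a regression check. This is a sketch under assumptions: the function name `find_stale_names` is hypothetical, and only the pattern and the four directory names are taken from the test plan.

```python
import re
from pathlib import Path
from typing import Iterable, List, Tuple

# Same alternatives as the grep pattern in the test plan.
STALE = re.compile(r"qat[-_]convert|qatConvert|QAT_CONVERT")


def find_stale_names(roots: Iterable[str]) -> List[Tuple[str, int, str]]:
    """Return (path, line number, line) for every line that still uses an old name."""
    hits = []
    for root in roots:
        base = Path(root)
        if not base.is_dir():
            continue  # skip directories that do not exist in this checkout
        for path in sorted(base.rglob("*")):
            if not path.is_file():
                continue
            try:
                text = path.read_text(encoding="utf-8")
            except (UnicodeDecodeError, OSError):
                continue  # skip binary or unreadable files
            for lineno, line in enumerate(text.splitlines(), start=1):
                if STALE.search(line):
                    hits.append((str(path), lineno, line.strip()))
    return hits


if __name__ == "__main__":
    for path, lineno, line in find_stale_names(["src", "tools", "templatelib", "docs"]):
        print(f"{path}:{lineno}: {line}")
```

An empty result corresponds to the "returns no hits" expectation in the test plan.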