finalize: rename --qat-convert to --quantize (unified QAT + PTQ) by jdinalt · Pull Request #43 · jdinalt/forgather

jdinalt · 2026-05-14T22:17:59Z

Summary

Rename forgather finalize --qat-convert <recipe> → --quantize <recipe>. The convert pipeline already re-installs fake quantizers on the loaded float weights, so it works on any source — the old name no longer matched the behaviour.
The same flag now documents two modes: QAT round-trip (source trained with --qat-recipe, keeps the QAT accuracy benefit) and PTQ (plain bf16 source, standard post-training quantization).
Promotes the previous "Behavior on Models Without QAT" footnote in docs/trainers/qat-training.md to a first-class PTQ Mode section with its own example.
No backwards-compat shim for the old flag.

Closes #40.

Why this is a rename, not a new flag

Verification of PR #39 revealed that Forgather's sharded checkpoint saver returns plain float weights from FakeQuantizedLinear (the scale/zp inner state is non-persistent). The fix in #39 made _apply_qat_convert always run prepare → convert on the loaded weights, which incidentally also produces a valid PTQ artifact when the source is plain bf16. Issue #40 was therefore reframed from "enable PTQ" to "fix the flag name": a single flag with two modes, documented as such.
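To illustrate why running prepare → convert on loaded float weights works for any source, here is a toy sketch in plain Python. It assumes symmetric per-tensor int4 quantization and is not torchao's implementation; `prepare`, `convert`, `dequantize`, and `QuantizedTensor` are hypothetical names chosen for this example. The point it shows is that the scale can be rebuilt from the float weights alone, so nothing from training time needs to survive the checkpoint.

```python
from dataclasses import dataclass


@dataclass
class QuantizedTensor:
    # Toy stand-in for an integer weight tensor plus its scale.
    values: list[int]
    scale: float


def prepare(weights: list[float], bits: int = 4) -> float:
    """Derive the quantizer scale from the float weights alone (symmetric, per-tensor)."""
    qmax = 2 ** (bits - 1) - 1
    max_abs = max(abs(w) for w in weights)
    return max_abs / qmax if max_abs > 0 else 1.0


def convert(weights: list[float], scale: float, bits: int = 4) -> QuantizedTensor:
    """Replace float weights with clamped integers using the scale from prepare()."""
    qmin, qmax = -(2 ** (bits - 1)), 2 ** (bits - 1) - 1
    values = [max(qmin, min(qmax, round(w / scale))) for w in weights]
    return QuantizedTensor(values, scale)


def dequantize(q: QuantizedTensor) -> list[float]:
    """Map the integers back to floats for inspection."""
    return [v * q.scale for v in q.values]


# Plain float weights, as a checkpoint would hold them after save and load.
loaded = [0.7, -0.32, 0.1, 0.0]
quantized = convert(loaded, prepare(loaded))
print(quantized.values, dequantize(quantized))
```

The same two calls apply whether `loaded` came from a QAT run or a plain bf16 run; only the resulting accuracy differs, which is why one flag can cover both modes.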

Marker-file infrastructure (proposed in #41 for load-time autodetection) is deferred — not needed for this rename and would couple #40 to the eval/inference issues.

Rename surface

| Layer | Old | New |
| --- | --- | --- |
| CLI flag | `--qat-convert` | `--quantize` |
| argparse `dest` | `qat_convert` | `quantize_recipe` |
| `job_params` key | `qat_convert` | `quantize` |
| Server passthrough param | `qat_convert` | `quantize` |
| `finalize_model.py` fn | `_apply_qat_convert` | `_apply_quantize` |
| TSX state / constant / label | `qatConvert` / `QAT_CONVERT_RECIPES` / "QAT Convert" | `quantize` / `QUANTIZE_RECIPES` / "Quantize" |
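As a rough illustration of the first three rows, the sketch below registers `--quantize` with `dest="quantize_recipe"` and copies the value into a `job_params`-style dict under the `quantize` key. The helpers `build_parser` and `to_job_params` are hypothetical and only mirror the names in the table; this is not the actual Forgather parser.

```python
import argparse


def build_parser() -> argparse.ArgumentParser:
    # Hypothetical stand-in for the `forgather finalize` parser.
    parser = argparse.ArgumentParser(prog="forgather finalize", allow_abbrev=False)
    parser.add_argument(
        "--quantize",
        dest="quantize_recipe",
        metavar="RECIPE",
        default=None,
        help="Quantize the finalized model with RECIPE (QAT round-trip or PTQ).",
    )
    return parser


def to_job_params(args: argparse.Namespace) -> dict:
    # The job_params key is `quantize`; the argparse dest is `quantize_recipe`.
    params = {}
    if args.quantize_recipe:
        params["quantize"] = args.quantize_recipe
    return params


args = build_parser().parse_args(["--quantize", "int8-dynamic-act-int4-weight"])
job_params = to_job_params(args)
print(job_params)
```

Because there is no backwards-compat shim, a parser built this way rejects `--qat-convert` as an unrecognized argument rather than silently accepting it.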

Test plan

forgather finalize --help shows --quantize; --qat-convert is gone.
grep -rn "qat[-_]convert\|qatConvert\|QAT_CONVERT" src/ tools/ templatelib/ docs/ returns no hits.
Dry-run smoke (--dry-run --quantize int8-dynamic-act-int4-weight) on a real model prints "Would run quantize step with recipe 'int8-dynamic-act-int4-weight'".
PTQ end-to-end on the bf16 chinchilla baseline: 29 IntxUnpackedToInt8Tensor weights on disk (the new documented PTQ path).
QAT round-trip path is structurally unchanged; the only diff is the user-facing flag name. (Already verified in #39, "torchao QAT support: --qat-recipe (trainer) + --qat-convert (finalize)".)
./build-webui.sh succeeds; FinalizeModal dropdown labeled "Quantize".
Webui submit through the rebuilt FinalizeModal lands a job whose argv includes --quantize <recipe> (user-side check).
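For anyone who prefers to run the stale-name check from Python rather than grep, a minimal sketch follows. The regex is a direct translation of the grep pattern in the test plan; the helper names `stale_lines` and `scan_tree` are hypothetical, and skipping unreadable or binary files is an assumption of this sketch.

```python
import re
from pathlib import Path

# Python spelling of the grep pattern from the test plan.
STALE_NAME = re.compile(r"qat[-_]convert|qatConvert|QAT_CONVERT")


def stale_lines(text: str) -> list[tuple[int, str]]:
    """Return (line number, line) for every line that still uses an old name."""
    return [
        (lineno, line.strip())
        for lineno, line in enumerate(text.splitlines(), start=1)
        if STALE_NAME.search(line)
    ]


def scan_tree(root: Path) -> list[tuple[str, int, str]]:
    """Apply stale_lines() to every readable text file under root."""
    hits = []
    for path in sorted(root.rglob("*")):
        if not path.is_file():
            continue
        try:
            text = path.read_text(encoding="utf-8")
        except (UnicodeDecodeError, OSError):
            continue  # skip binary or unreadable files
        hits.extend((str(path), n, line) for n, line in stale_lines(text))
    return hits
```

Calling `scan_tree(Path("src"))` and the same for `tools`, `templatelib`, and `docs` should return empty lists after the rename. The pattern does not match `--qat-recipe` or `QAT_RECIPES`, both of which are intentionally kept.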

🤖 Generated with Claude Code

The convert pipeline in finalize re-installs fake quantizers on the loaded float weights before running torchao's convert step, which means it works equally well on plain bf16 models (standard PTQ) and on QAT-trained models (the full QAT round-trip). The flag name no longer matched its behaviour.

Rename to `--quantize <recipe>` and document both modes under the unified flag.

Doc updates promote the previous "Behavior on Models Without QAT" footnote to a first-class "PTQ Mode" section in qat-training.md and spell out the AMP / PTQ / QAT comparison setup.

No backwards-compat shim for the old flag name.

Closes #40.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

- Quick Start no longer references --safetensors, so the parenthetical pointing at Save Format reads as orphaned. Replace with a direct forward-reference to the Save Format section.
- "(none — skip quantize)" reads awkwardly because quantize is a verb. "(none — no quantization)" scans cleanly.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

jdinalt · 2026-05-14T22:21:19Z

Review pass complete

Two review agents ran in parallel. Both said no blockers.

Applied in commit `ca80b4e`

Dropped the stale `--safetensors` parenthetical from the Quick Start (the example no longer mentions `--safetensors`, so the note was orphaned). Replaced with a direct forward-reference to the Save Format section.
FinalizeModal dropdown placeholder: `(none — skip quantize)` → `(none — no quantization)` (the verb form scanned awkwardly).

Filed as follow-ups

#44 (finalize: warn when QAT-trained source is finalized without --quantize): finalize should warn when run on a `--qat-recipe`-trained source without `--quantize`. Today the saver strips fake-quant inner state, so the default finalize silently throws away the QAT training-time benefit. This is the marker-file mechanism originally proposed in #41 (forgather eval: load and evaluate torchao-quantized models) and #42 (inference server: load torchao-quantized models), applied to the finalize side.
#45 (Quantization: namespace recipe strings for future non-torchao backends): namespace recipe strings (`torchao:int4-weight-only` or `--quantize-backend torchao`) before users script against bare names. Cheap to lock in now while the flag is new.
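One possible shape for the namespaced form proposed in #45 is sketched below. Nothing here exists in Forgather today: `parse_recipe`, `DEFAULT_BACKEND`, and `KNOWN_BACKENDS` are hypothetical, and treating a bare name as a `torchao` recipe is an assumption made for this example, not a decision recorded in the issue.

```python
DEFAULT_BACKEND = "torchao"   # assumed default for bare recipe names
KNOWN_BACKENDS = {"torchao"}  # hypothetical registry of backends


def parse_recipe(spec: str) -> tuple[str, str]:
    """Split 'backend:recipe' into (backend, recipe).

    Bare names fall back to DEFAULT_BACKEND.
    """
    backend, sep, recipe = spec.partition(":")
    if not sep:
        # No namespace given: treat the whole string as the recipe name.
        backend, recipe = DEFAULT_BACKEND, spec
    if backend not in KNOWN_BACKENDS:
        raise ValueError(f"unknown quantization backend: {backend!r}")
    if not recipe:
        raise ValueError("recipe name is empty")
    return backend, recipe


print(parse_recipe("torchao:int4-weight-only"))
print(parse_recipe("int4-weight-only"))
```

Rejecting unknown prefixes up front keeps a typo such as `torcha0:int4-weight-only` from being mistaken for a recipe name.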

Skipped with reasoning

TSX localStorage migration of the old `qatConvert` key: reviewer #1 explicitly said "not worth holding the PR", and the field falls back to `""` cleanly.
PTQ accuracy-claim hedge: one reviewer flagged it, the other approved the existing hedging ("how much depends on the model and recipe"). Calling the tie in favour of the existing wording.
`QAT_RECIPES` → `QUANTIZE_RECIPES` Python rename: the module docstring at `src/forgather/ml/qat_recipes.py` already clarifies both modes. Reviewer #2 said the docstring is cheaper than the rename.
`--quantize` table-cell length in `finalize-model.md`: reviewer #2 said "not blocking; content is correct".

Three subsections walking users through the memory / compute / accuracy tradeoffs, per-recipe QAT-vs-PTQ expectations, and a workflow ordering that minimizes wasted training cycles (bf16 baseline -> PTQ -> QAT only if PTQ proved insufficient). Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

jdinalt and others added 2 commits May 14, 2026 22:02

This was referenced May 14, 2026

finalize: warn when QAT-trained source is finalized without --quantize #44

Open

Quantization: namespace recipe strings for future non-torchao backends #45

Open

jdinalt merged commit 5e0cf6d into dev May 15, 2026
1 check passed

jdinalt deleted the feature/finalize-quantize-rename branch May 15, 2026 02:10

jdinalt mentioned this pull request May 15, 2026

finalize: post-training quantization (PTQ) for non-QAT models #40

Closed

jdinalt merged 3 commits into dev from feature/finalize-quantize-rename
