Commit b93b892

Merge branch 'main' into inference_tutorial
2 parents 6a96697 + c561d26 commit b93b892

66 files changed: +3392 -3253 lines changed


benchmarks/bench_galore_fused_kernels.py

Lines changed: 0 additions & 65 deletions
This file was deleted.

benchmarks/float8/training/README.md

Lines changed: 1 addition & 1 deletion
@@ -12,7 +12,7 @@ Training parameters can be configured via environment variables.
 - `TORCHTITAN_ROOT`: Root directory of torchtitan in your local filesystem
 - Optional:
 - `FLOAT8_RECIPE_WITH_BEST_SETTINGS`: "rowwise" or "tensorwise". Applies float8 training with the specified scaling recipe, as well as additional training configs which are optimal for that scaling recipe. See `torchtitan_benchmark.sh` for more details.
-- `BATCH_SIZE`: Defaults to 1.
+- `LOCAL_BATCH_SIZE`: Defaults to 1.
 - `STEPS`: Defaults to 100.
 - `EXTRA_ARGS`: Extra arguments to pass to torchtitan training script. See [torchtitan](https://github.com/pytorch/torchtitan) docs for the full list of options.

benchmarks/float8/training/torchtitan_benchmark.sh

Lines changed: 3 additions & 3 deletions
Original file line numberDiff line numberDiff line change
@@ -8,7 +8,7 @@
88
# with the given parameters,
99

1010
# script arguments
11-
BATCH_SIZE=${BATCH_SIZE:-1}
11+
LOCAL_BATCH_SIZE=${LOCAL_BATCH_SIZE:-1}
1212
STEPS=${STEPS:-100}
1313

1414
# temporary log file which is deleted after performance data is parsed out and metrics are calculated.
@@ -20,7 +20,7 @@ if [ -z "${TORCHTITAN_ROOT}" ]; then
2020
echo "Usage: TORCHTITAN_ROOT=<directory> ./float8_training_benchmark.sh"
2121
echo "Optional parameters configurable via environment variables:"
2222
echo " * FLOAT8_RECIPE_WITH_BEST_SETTINGS: "rowwise" or "tensorwise". if set, use float8 training in torchtitan with the specified recipe, including the additional settings which are optimal for that recipe. otherwise, use bf16 mixed precision training."
23-
echo " * BATCH_SIZE: defaults to 1."
23+
echo " * LOCAL_BATCH_SIZE: defaults to 1."
2424
echo " * STEPS: defaults to 100."
2525
echo " * EXTRA_ARGS: additional arguments to pass to the torchtitan training script."
2626
exit 1
@@ -45,7 +45,7 @@ cd ${TORCHTITAN_ROOT}
4545
echo "float8 args: ${FLOAT8_ARGS}"
4646

4747
# run the command with the specified arguments
48-
CONFIG_FILE="./torchtitan/models/llama3/train_configs/llama3_8b.toml" ${TORCHTITAN_ROOT}/run_train.sh --training.steps=${STEPS} --training.batch_size=${BATCH_SIZE} --training.compile ${FLOAT8_ARGS} ${EXTRA_ARGS} 2>&1 | tee ${LOG_FILE}
48+
CONFIG_FILE="./torchtitan/models/llama3/train_configs/llama3_8b.toml" ${TORCHTITAN_ROOT}/run_train.sh --training.steps=${STEPS} --training.local-batch-size=${LOCAL_BATCH_SIZE} --training.compile ${FLOAT8_ARGS} ${EXTRA_ARGS} 2>&1 | tee ${LOG_FILE}
4949

5050
# return to original working directory
5151
cd $original_dir
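
Reviewer note: the rename relies on the shell's `${VAR:-default}` expansion, so callers who still export only `BATCH_SIZE` would silently get the default of 1, at least as far as the hunks shown here go. The sketch below illustrates that behavior in isolation. `build_train_args` is a hypothetical helper written for this note, not a function in `torchtitan_benchmark.sh`, and it only assembles the two flags that appear in the diff.

```shell
#!/bin/sh
# Sketch only: mirrors the parameter-default pattern from the diff.
# build_train_args is a hypothetical helper, not part of the real script.
build_train_args() {
    # ${VAR:-default} expands to the default when VAR is unset or empty.
    echo "--training.steps=${STEPS:-100} --training.local-batch-size=${LOCAL_BATCH_SIZE:-1}"
}

# Start from a clean slate so the defaults are visible.
unset LOCAL_BATCH_SIZE BATCH_SIZE STEPS

# With nothing set, both defaults apply.
default_args="$(build_train_args)"

# A caller-provided value overrides the default.
override_args="$(LOCAL_BATCH_SIZE=4 build_train_args)"

# The old variable name is not read, so setting it alone changes nothing.
legacy_args="$(BATCH_SIZE=8 build_train_args)"

echo "${default_args}"
echo "${override_args}"
echo "${legacy_args}"
```

If backward compatibility matters, one option (an assumption on my part, not something this commit does) would be to warn when `BATCH_SIZE` is set and `LOCAL_BATCH_SIZE` is not.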

benchmarks/fused_benchmark_utils.py

Lines changed: 0 additions & 261 deletions
This file was deleted.

docs/source/api_ref_quantization.rst

Lines changed: 0 additions & 6 deletions
@@ -63,14 +63,8 @@ Quantization Primitives

 choose_qparams_affine
 choose_qparams_affine_with_min_max
-choose_qparams_affine_floatx
 quantize_affine
-quantize_affine_floatx
 dequantize_affine
-dequantize_affine_floatx
-choose_qparams_and_quantize_affine_hqq
-fake_quantize_affine
-fake_quantize_affine_cachemask
 safe_int_mm
 int_scaled_matmul
 MappingType
