Commit df98314

Transformers v4.12.x compatible (#107)
Update to support Huggingface Transformers v4.12.0 - v4.12.5. Adds compatibility with ProphetNet.
1 parent 8b29b4d commit df98314


50 files changed: +2500 −3143 lines

README.md

Lines changed: 9 additions & 8 deletions
```diff
@@ -16,17 +16,18 @@ Below shows the generation speed gain by using FastSeq.
 
 | Model | W/O FastSeq (in samples/s) | W/ FastSeq (in samples/s) | Speedup |
 |------------------|:--------------------------:|:-------------------------:|:-----:|
-| [ProphetNet](examples/prophetnet/README.md) | 2.8 | 11.9 | 4.3 |
+| [ProphetNet (`fs`)](examples/prophetnet/README.md) | 2.8 | 11.9 | 4.3 |
 | [Bart (`fs`)](examples/bart/README.md) | 3.3 | 25.1 | 7.7x |
-| [Bart (`hf`)](examples/bart/README.md#speedup-bart-huggingface-transformers-version-by-using-fastseq) | 2.5 | 12.4 | 5.0x |
-| [DistilBart (`hf`)](examples/distilbart/README.md) | 3.4 | 18.5 | 5.4x |
-| [T5 (`hf`)](examples/t5/README.md) | 8.7 | 31.3 | 3.6x |
+| [Bart (`hf`)](examples/bart/README.md#speedup-bart-huggingface-transformers-version-by-using-fastseq) | 4.5 | 12.4 | 2.8x |
+| [DistilBart (`hf`)](examples/distilbart/README.md) | 5.5 | 19.1 | 3.5x |
+| [T5 (`hf`)](examples/t5/README.md) | 9.5 | 31.7 | 3.3x |
 | [WMT16 En-De (`fs`)](examples/wmt/README.md) | 144.5 | 422.8 | 2.9x |
-| [GPT2 (`hf`)](examples/gpt2/README.md) | 3.0 | 16.7 | 5.5x |
-| [UniLM (`hf`)](examples/unilm/README.md) | 1.7 | 16.4 | 9.6x |
+| [GPT2 (`hf`)](examples/gpt2/README.md) | 3.9 | 21.8 | 5.6x |
+| [ProphetNet (`hf`)](examples/prophetnet/README.md) | 3.4 | 6.2 | 1.8x |
 
 - All benchmarking experiments run on NVIDIA-V100-16GB with [docker](docker/Dockerfile). Highest speed recorded for each model by tuning batch size. For parameter setting details, click link of corresponding model.
-- `fs` stands for [Fairseq](https://github.com/pytorch/fairseq) 0.10.2 version, `hf` stands for [Huggingface Transformers](https://github.com/huggingface/transformers) 3.0.2 version.
+- The baseline (W/O Fastseq) for [ProphetNet (`fs`)](examples/prophetnet/README.md) is run with fairseq 0.9.0, as it has not yet been updated for compatibility with version 0.10.2
+- `fs` stands for [Fairseq](https://github.com/pytorch/fairseq) 0.10.2 version, `hf` stands for [Huggingface Transformers](https://github.com/huggingface/transformers) 4.12.0 version.
 - Optimizations were automatically applied to all generation/sequence models in Fairseq & Huggingface Transformers. Above only lists a subset of them.
 
 ## How it works?
@@ -39,7 +40,7 @@ FastSeq develops multiple speedup techniques, including an attention cache optim
 - Python version >= 3.6
 - [torch](http://pytorch.org/) >= 1.4.0
 - [fairseq](https://github.com/pytorch/fairseq) >= 0.10.0
-- [transformers](https://github.com/huggingface/transformers) == 3.0.2
+- [transformers](https://github.com/huggingface/transformers) >= 4.12.0
 - [requests](https://pypi.org/project/requests/) >= 2.24.0
 - [absl-py](https://pypi.org/project/absl-py/) >= 0.9.0
 - [rouge-score](https://pypi.org/project/rouge-score/) >= 0.0.4
```
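The bumped requirement floors can be sanity-checked locally before running FastSeq. A minimal sketch (the helper name `version_ge` is illustrative, not part of FastSeq) using `sort -V` to compare version strings:

```shell
# version_ge A B: succeeds when version A >= version B.
# sort -V orders version strings numerically, so B must sort first (or equal).
version_ge() {
  [ "$(printf '%s\n' "$2" "$1" | sort -V | head -n1)" = "$2" ]
}

# Check an installed package against a README floor, e.g.:
#   version_ge "$(pip show transformers | awk '/^Version:/{print $2}')" 4.12.0
version_ge 4.12.5 4.12.0 && echo "transformers version OK"
```

`sort -V` handles multi-digit components correctly (e.g. 4.2 < 4.12), which a plain lexical comparison would get wrong.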

azure-pipelines.yml

Lines changed: 12 additions & 11 deletions
```diff
@@ -15,21 +15,21 @@ jobs:
 demands:
 - agent.name -equals gpu3
 container:
-image: adsbrainwestus2.azurecr.io/fastseq:dev-py3
-endpoint: fastseq-acr
+image: huggingface/transformers-pytorch-gpu:latest
 options: --gpus device=3
 steps:
 - script: |
 #install fastseq
-which pip
-which python
+pip install --upgrade pip
+pip install sentencepiece==0.1.96
+pip install torch==1.10.0
 
 echo "******* Installing fairseq *******"
 pip install fairseq==0.10.2
 pip show fairseq
 
 echo "******* Installing transformers *******"
-pip install transformers
+pip install transformers==4.12.0
 pip show transformers
 
 echo "******* Installing fastseq *******"
@@ -39,10 +39,6 @@ jobs:
 echo "******* Adding local bin to path *******"
 export PATH="$HOME/bin:$HOME/.local/bin:$PATH"
 
-echo "******* Running fastseq unittests *******"
-pip install pytorch-transformers==1.0.0
-bash tests/run_fastseq_tests.sh
-
 #cd benchmarks/
 #bash run_all_benchmarks.sh
@@ -53,11 +49,16 @@ jobs:
 python -c "import torch; print('torch:', torch.__version__, torch)"
 python -c "import torch; print('CUDA available:', torch.cuda.is_available())"
 
+echo "******* Running transformers unittests *******"
+bash tests/run_transformers_tests.sh
+
 echo "******* Running fairseq unittests *******"
+pip install apex==0.9.10.dev0
 bash tests/run_fairseq_tests.sh
 
-echo "******* Running transformers unittests *******"
-bash tests/run_transformers_tests.sh
+echo "******* Running fastseq unittests *******"
+pip install pytorch-transformers==1.0.0
+bash tests/run_fastseq_tests.sh
 
 displayName: 'run fastseq unit tests'
 - task: PublishTestResults@2
```

benchmarks/hf.sh

Lines changed: 3 additions & 4 deletions
```diff
@@ -1,10 +1,9 @@
 #!/bin/bash
 source utils.sh
 if [[ $SKIP_BASELINE -eq 0 ]]; then
-export BASELINE_REPO=$CACHE_DIR/transformers_v3.0.2
-#https://github.com/huggingface/transformers.git \
+export BASELINE_REPO=$CACHE_DIR/transformers_v4.12.0
 git_clone_if_not_in_cache \
-https://github.com/JiushengChen/transformers.git \
+https://github.com/huggingface/transformers.git \
 $BASELINE_REPO \
-v3.0.2-ngram
+v4.12.0
 fi
```
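The `git_clone_if_not_in_cache` helper invoked above is defined in `utils.sh`, which is not part of this diff. A hypothetical sketch of its likely contract, for readers following the baseline setup: clone a specific tag into the cache directory unless a previous run already populated it.

```shell
# Hypothetical sketch only; the real helper lives in utils.sh (not shown here).
# Usage: git_clone_if_not_in_cache <url> <dest-dir> <tag-or-branch>
git_clone_if_not_in_cache() {
  local url="$1" dest="$2" ref="$3"
  # Skip the clone when the cache directory already holds a checkout.
  if [ ! -d "$dest/.git" ]; then
    git clone --depth 1 --branch "$ref" "$url" "$dest"
  fi
}
```

With `BASELINE_REPO` pointing into `$CACHE_DIR`, repeated benchmark runs then skip the network entirely.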

benchmarks/models/hf_bart.sh

Lines changed: 7 additions & 7 deletions
```diff
@@ -15,7 +15,7 @@ source hf.sh
 facebook/bart-large-cnn \
 cnn_dm/raw \
 val \
-32 \
+32/64 \
 --task summarization \
 --no_repeat_ngram_size 3
 ./benchmark.sh \
@@ -33,16 +33,16 @@ grep "facebook/bart-large-cnn cnn_dm/raw val " perf \
 | awk -F'|' '{if($1!="NA"){c+=1;s+=$1}}END{print s/c}' \
 | ./range.sh 0.447 0.448
 # Speed on V100 16GB 250W
-grep -E "transformers_v3.0.2 facebook/bart-large-cnn cnn_dm/raw val 32 " perf \
+grep -E "transformers_v4.12.0 facebook/bart-large-cnn cnn_dm/raw val 64 " perf \
 | awk '{s+=$13}END{if(NR==0) print -1; else print s/NR}' \
-| ./range.sh 2 3
-grep -E "transformers_v3.0.2\+fastseq_v.* facebook/bart-large-cnn cnn_dm/raw val 32 " perf \
+| ./range.sh 4 5
+grep -E "transformers_v4.12.0\+fastseq_v.* facebook/bart-large-cnn cnn_dm/raw val 32 " perf \
 | awk '{s+=$13}END{print s/NR}' \
-| ./range.sh 7 100
-grep -E "transformers_v3.0.2\+fastseq_v.* facebook/bart-large-cnn cnn_dm/raw val 64 " perf \
+| ./range.sh 10 100
+grep -E "transformers_v4.12.0\+fastseq_v.* facebook/bart-large-cnn cnn_dm/raw val 64 " perf \
 | awk '{s+=$13}END{print s/NR}' \
 | ./range.sh 11 100
-grep -E "transformers_v3.0.2\+fastseq_v.* facebook/bart-large-cnn cnn_dm/raw val 128 " perf \
+grep -E "transformers_v4.12.0\+fastseq_v.* facebook/bart-large-cnn cnn_dm/raw val 128 " perf \
 | awk '{s+=$13}END{print s/NR}' \
 | ./range.sh 12 100
```
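Each check above pipes an averaged throughput into `./range.sh lo hi`, which passes or fails depending on whether the value lands inside the expected window. That script is part of the repo but not part of this diff; a hypothetical equivalent, written as a function to make the contract concrete:

```shell
# Hypothetical stand-in for ./range.sh (the real script is not shown in this
# diff). Reads one number from stdin and succeeds only if LO <= value <= HI.
# Usage: echo 12.4 | range_check 10 100
range_check() {
  local lo="$1" hi="$2" val
  read -r val
  # Force numeric comparison with awk so floats like "12.4" work.
  awk -v v="$val" -v lo="$lo" -v hi="$hi" \
    'BEGIN { exit !(v + 0 >= lo + 0 && v + 0 <= hi + 0) }'
}
```

Used this way, a regression that drops the measured samples/s below the lower bound fails the benchmark pipeline instead of silently passing.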

benchmarks/models/hf_distibart.sh

Lines changed: 6 additions & 6 deletions
```diff
@@ -32,12 +32,12 @@ grep "hf.sshleifer.distilbart-cnn-12-6.tar.gz cnn_dm/raw val " perf \
 | awk -F'|' '{if($1!="NA"){c+=1;s+=$1}}END{print s/c}' \
 | ./range.sh 0.45 0.452
 # Speed on V100 16GB 250W
-grep -E "transformers_v3.0.2 hf.sshleifer.distilbart-cnn-12-6.tar.gz cnn_dm/raw val 64 " perf \
+grep -E "transformers_v4.12.0 hf.sshleifer.distilbart-cnn-12-6.tar.gz cnn_dm/raw val 128 " perf \
 | awk '{s+=$13}END{if(NR==0) print -1; else print s/NR}' \
-| ./range.sh 3 4
-grep -E "transformers_v3.0.2\+fastseq_v.* hf.sshleifer.distilbart-cnn-12-6.tar.gz cnn_dm/raw val 64 " perf \
+| ./range.sh 5 6
+grep -E "transformers_v4.12.0\+fastseq_v.* hf.sshleifer.distilbart-cnn-12-6.tar.gz cnn_dm/raw val 64 " perf \
 | awk '{s+=$13}END{print s/NR}' \
-| ./range.sh 16.5 100
-grep -E "transformers_v3.0.2\+fastseq_v.* hf.sshleifer.distilbart-cnn-12-6.tar.gz cnn_dm/raw val 128 " perf \
+| ./range.sh 17 100
+grep -E "transformers_v4.12.0\+fastseq_v.* hf.sshleifer.distilbart-cnn-12-6.tar.gz cnn_dm/raw val 128 " perf \
 | awk '{s+=$13}END{print s/NR}' \
-| ./range.sh 18.3 100
+| ./range.sh 18 100
```

benchmarks/models/hf_gpt2.sh

Lines changed: 13 additions & 10 deletions
```diff
@@ -7,7 +7,6 @@
 # <split> # train/val/test (text) or train/valid/test (binary)
 # <batch-sizes>
 source hf.sh
-
 # MODEL - bart large cnn from transformer
 # TASK - cnn dm val full set
@@ -16,7 +15,7 @@ source hf.sh
 gpt2 \
 cnn_dm/raw \
 val \
-64/128 \
+64/128/256 \
 --task summarization \
 --no_repeat_ngram_size 3 \
 --max_tokenizer_length 512 \
@@ -27,7 +26,7 @@ source hf.sh
 gpt2 \
 cnn_dm/raw \
 val \
-64 \
+64/128 \
 --task summarization \
 --no_repeat_ngram_size 3 \
 --max_tokenizer_length 512 \
@@ -37,14 +36,18 @@ source hf.sh
 grep "gpt2 cnn_dm/raw val " perf \
 | awk '{print $9}' \
 | awk -F'|' '{if($1!="NA"){c+=1;s+=$1}}END{print s/c}' \
-| ./range.sh 0.155 0.156
+| ./range.sh 0.160 0.162
 # Speed on V100 16GB 250W
-grep -E "transformers_v3.0.2 gpt2 cnn_dm/raw val 64 " perf \
+grep -E "transformers_v4.12.0 gpt2 cnn_dm/raw val 64 " perf \
 | awk '{s+=$13}END{if(NR==0) print -1; else print s/NR}' \
-| ./range.sh 2.9 3.2
-grep -E "transformers_v3.0.2\+fastseq_v.* gpt2 cnn_dm/raw val 64 " perf \
+| ./range.sh 3.5 4.5
+grep -E "transformers_v4.12.0\+fastseq_v.* gpt2 cnn_dm/raw val 64 " perf \
+| awk '{s+=$13}END{print s/NR}' \
+| ./range.sh 16 100
+grep -E "transformers_v4.12.0\+fastseq_v.* gpt2 cnn_dm/raw val 128 " perf \
 | awk '{s+=$13}END{print s/NR}' \
-| ./range.sh 10.8 11.3
-grep -E "transformers_v3.0.2\+fastseq_v.* gpt2 cnn_dm/raw val 128 " perf \
+| ./range.sh 20 100
+grep -E "transformers_v4.12.0\+fastseq_v.* gpt2 cnn_dm/raw val 256 " perf \
 | awk '{s+=$13}END{print s/NR}' \
-| ./range.sh 16.4 16.8
+| ./range.sh 21 100
+
```
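The speed assertions in these benchmark scripts all share one awk idiom: read field 13 of each matching `perf` line (the measured samples/s), average it, and print `-1` as a sentinel when `grep` matched nothing. The same idiom on toy input, to make the column arithmetic visible:

```shell
# Average field 13 across input lines; print -1 if there were no lines.
# The 'x' fields stand in for the 12 metadata columns of a real perf line.
printf '%s\n' \
  'x x x x x x x x x x x x 10' \
  'x x x x x x x x x x x x 20' \
  | awk '{s+=$13}END{if(NR==0) print -1; else print s/NR}'
# prints 15
```

The `NR==0` guard matters: without it, an empty grep result would make awk print nothing and `./range.sh` would hang waiting on stdin rather than fail loudly.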
benchmarks/models/hf_mbart.sh

Lines changed: 3 additions & 3 deletions
```diff
@@ -26,11 +26,11 @@ source hf.sh
 # Accuracy
 grep "facebook/mbart-large-en-ro wmt_en_ro/raw val " perf \
 | awk '{if($8!="NA"){c+=1;s+=$8}}END{print s/c}' \
-| ./range.sh 56.1 56.3
+| ./range.sh 56.1 56.4
 # Speed on V100 16GB 250W
-grep -E "transformers_v3.0.2 facebook/mbart-large-en-ro wmt_en_ro/raw val 64 " perf \
+grep -E "transformers_v4.12.0 facebook/mbart-large-en-ro wmt_en_ro/raw val 64 " perf \
 | awk '{s+=$13}END{if(NR==0) print -1; else print s/NR}' \
 | ./range.sh 6.0 100
-grep -E "transformers_v3.0.2\+fastseq_v.* facebook/mbart-large-en-ro wmt_en_ro/raw val 64 " perf \
+grep -E "transformers_v4.12.0\+fastseq_v.* facebook/mbart-large-en-ro wmt_en_ro/raw val 64 " perf \
 | awk '{s+=$13}END{print s/NR}' \
 | ./range.sh 9 100
```

benchmarks/models/hf_prophetnet.sh

Lines changed: 41 additions & 0 deletions
```diff
@@ -0,0 +1,41 @@
+#!/bin/bash
+# Run it at its parent folder, and check result at ../perf.
+# USAGE - ./benchmark.sh
+# [fairseq|fairseq+fastseq|transformers|transformers+fastseq]
+# <model>
+# <task>
+# <split> # train/val/test (text) or train/valid/test (binary)
+# <batch-sizes>
+source hf.sh
+
+# MODEL - prophetnet from transformer
+# TASK - cnn dm val full set
+./benchmark.sh \
+transformers \
+microsoft/prophetnet-large-uncased \
+cnn_dm_bert/raw \
+val \
+128 \
+--task summarization \
+--no_repeat_ngram_size 3
+./benchmark.sh \
+transformers+fastseq \
+microsoft/prophetnet-large-uncased \
+cnn_dm_bert/raw \
+val \
+128 \
+--task summarization \
+--no_repeat_ngram_size 3
+
+# Accuracy
+grep "microsoft/prophetnet-large-uncased cnn_dm_bert/raw val " perf \
+| awk '{print $9}' \
+| awk -F'|' '{if($1!="NA"){c+=1;s+=$1}}END{print s/c}' \
+| ./range.sh 0.230 0.232
+# Speed on V100 16GB 250W
+grep -E "transformers_v4.12.0 microsoft/prophetnet-large-uncased cnn_dm_bert/raw val 128 " perf \
+| awk '{s+=$13}END{if(NR==0) print -1; else print s/NR}' \
+| ./range.sh 3 4
+grep -E "transformers_v4.12.0\+fastseq_v.* microsoft/prophetnet-large-uncased cnn_dm_bert/raw val 128 " perf \
+| awk '{s+=$13}END{if(NR==0) print -1; else print s/NR}' \
+| ./range.sh 6 100
```
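One pitfall worth noting in these `grep -E` checks: in an extended regex an unescaped `+` is a quantifier, so matching the literal plus in labels like `transformers_v4.12.0+fastseq_v...` requires writing it as `\+`. A quick illustration of the failure mode:

```shell
label='transformers_v4.12.0+fastseq_v0.1 microsoft/prophetnet-large-uncased'

# Escaped: matches the literal '+' in the label.
echo "$label" | grep -cE 'v4\.12\.0\+fastseq'        # prints 1

# Unescaped: '0+' means "one or more zeros", so the regex expects 'fastseq'
# right after the zeros and the literal '+' in the label breaks the match.
echo "$label" | grep -cE 'v4.12.0+fastseq' || true   # prints 0
```

The `|| true` is only there because `grep -c` exits nonzero when it counts zero matches.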

benchmarks/models/hf_t5.sh

Lines changed: 13 additions & 11 deletions
```diff
@@ -14,28 +14,30 @@ source hf.sh
 wmt_en_ro/raw \
 val \
 64 \
---task translation_en_to_ro
-# --no_repeat_ngram_size 3 # baseline don't support this arg now.
+--task translation_en_to_ro \
+--no_repeat_ngram_size 3
+
 ./benchmark.sh \
 transformers+fastseq \
 t5-base \
 wmt_en_ro/raw \
 val \
 64/128 \
 --task translation_en_to_ro \
---postprocess_workers 3
-# --no_repeat_ngram_size 3
-# Accuracy
+--postprocess_workers 3 \
+--no_repeat_ngram_size 3
+
+# # Accuracy
 grep "t5-base wmt_en_ro/raw val " perf \
 | awk '{if($8!="NA"){c+=1;s+=$8}}END{print s/c}' \
-| ./range.sh 57.8 57.9
+| ./range.sh 58.0 59.0
 # Speed on V100 16GB 250W
-grep -E "transformers_v3.0.2 t5-base wmt_en_ro/raw val 64 " perf \
+grep -E "transformers_v4.12.0 t5-base wmt_en_ro/raw val 64 " perf \
 | awk '{s+=$13}END{if(NR==0) print -1; else print s/NR}' \
-| ./range.sh 8 10
-grep -E "transformers_v3.0.2\+fastseq_v.* t5-base wmt_en_ro/raw val 64 " perf \
+| ./range.sh 12 17
+grep -E "transformers_v4.12.0\+fastseq_v.* t5-base wmt_en_ro/raw val 64 " perf \
 | awk '{s+=$13}END{print s/NR}' \
-| ./range.sh 19 100
-grep -E "transformers_v3.0.2\+fastseq_v.* t5-base wmt_en_ro/raw val 128 " perf \
+| ./range.sh 23 100
+grep -E "transformers_v4.12.0\+fastseq_v.* t5-base wmt_en_ro/raw val 128 " perf \
 | awk '{s+=$13}END{print s/NR}' \
 | ./range.sh 30 100
```

benchmarks/models/hf_unilm.sh

Lines changed: 0 additions & 39 deletions
This file was deleted.
