Commit 7540bba

Merge pull request #279 from aarora79/main
Deepseek and uv changes
2 parents 803abcc + 0dde127

File tree

336 files changed: +5754 -2757 lines changed


.gitignore

Lines changed: 3 additions & 0 deletions
```diff
@@ -1,4 +1,7 @@
 __pycache__/
+EC2_system_metrics.csv
+.fmbench_python311
+fmbench.egg-info/
 analytics/comparison.json
 site
 *.zip
```

MANIFEST.in

Lines changed: 3 additions & 0 deletions
```diff
@@ -0,0 +1,3 @@
+recursive-include fmbench *
+include LICENSE
+include README.md
```
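
One way to sanity-check these new `MANIFEST.in` directives is to build the distributions (this commit's `debug.sh` now uses `uv build`) and list the sdist contents. A minimal sketch, assuming the project's build backend honors `MANIFEST.in`; the grep pattern is illustrative:

```{.bash}
# Build sdist and wheel into dist/ (same command debug.sh now uses).
uv build

# Inspect the sdist and confirm LICENSE, README.md and the fmbench
# package files were included (pattern is illustrative, not exhaustive).
tar tzf dist/fmbench-*.tar.gz | grep -E 'LICENSE|README\.md|fmbench/'
```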

README.md

Lines changed: 13 additions & 26 deletions
````diff
@@ -14,7 +14,7 @@
 </h4>
 
 
-🚨 **What's new**: Benchmark the [Qwen2.5-72b](https://huggingface.co/Qwen/Qwen2.5-72B) on Amazon EC2 and the newest [llama3-3-70b](https://aws.amazon.com/about-aws/whats-new/2024/12/metas-llama-3-3-70b-model-amazon-bedrock/) model on Amazon Bedrock. Use more simplified versions of configuration files, view more [here](https://github.com/aws-samples/foundation-model-benchmarking-tool/blob/main/docs/simplified_config_files.md).
+🚨 **What's new**: Benchmarks for [Deepseek-R1](https://github.com/deepseek-ai/DeepSeek-R1) models on Amazon EC2 and Amazon SageMaker. Faster setup with `uv` for Python venv and dependency installation. 🚨
 
 `FMBench` is a Python package for running performance benchmarks and accuracy for **any Foundation Model (FM)** deployed on **any AWS Generative AI service**, be it **Amazon SageMaker**, **Amazon Bedrock**, **Amazon EKS**, or **Amazon EC2**. The FMs could be deployed on these platforms either directly through `FMbench`, or, if they are already deployed then also they could be benchmarked through the **Bring your own endpoint** mode supported by `FMBench`.
 
@@ -48,19 +48,12 @@ Use `FMBench` to determine model accuracy using a panel of LLM evaluators (PoLL
 
 Configuration files are available in the [configs](./src/fmbench/configs) folder for the following models in this repo.
 
-### Llama3 on Amazon SageMaker
-
-Llama3 is now available on SageMaker (read [blog post](https://aws.amazon.com/blogs/machine-learning/meta-llama-3-models-are-now-available-in-amazon-sagemaker-jumpstart/)), and you can now benchmark it using `FMBench`. Here are the config files for benchmarking `Llama3-8b-instruct` and `Llama3-70b-instruct` on `ml.p4d.24xlarge`, `ml.inf2.24xlarge` and `ml.g5.12xlarge` instances.
-
-- [Config file](https://github.com/aws-samples/foundation-model-benchmarking-tool/blob/main/src/fmbench/configs/llama3/8b/config-llama3-8b-instruct-g5-p4d.yml) for `Llama3-8b-instruct` on `ml.p4d.24xlarge` and `ml.g5.12xlarge`.
-- [Config file](https://github.com/aws-samples/foundation-model-benchmarking-tool/blob/main/src/fmbench/configs/llama3/70b/config-llama3-70b-instruct-g5-p4d.yml) for `Llama3-70b-instruct` on `ml.p4d.24xlarge` and `ml.g5.48xlarge`.
-- [Config file](https://github.com/aws-samples/foundation-model-benchmarking-tool/blob/main/src/fmbench/configs/llama3/8b/config-llama3-8b-inf2-g5.yml) for `Llama3-8b-instruct` on `ml.inf2.24xlarge` and `ml.g5.12xlarge`.
-
 ### Full list of benchmarked models
 
 
 | Model | Amazon EC2 | Amazon SageMaker | Amazon Bedrock |
 |:--------------------------------|:-------------------------------|:-------------------------------------------|:-----------------------------------|
+| **Deepseek-R1 distilled** | g6e | g6e | |
 | **Llama3.3-70b instruct** | | | On-demand |
 | **Qwen2.5-72b** | g5, g6e | | |
 | **Amazon Nova** | | | On-demand |
@@ -90,6 +83,12 @@ Llama3 is now available on SageMaker (read [blog post](https://aws.amazon.com/bl
 
 ## New in this release
 
+## 2.1.0
+
+1. Deepseek-R1 distilled model support using [`vllm`](https://github.com/vllm-project/vllm).
+1. Evaluate Deepseek performance with `LongBench`, `OpenOrca`, `Dolly` and [`ConvFinQA`](https://huggingface.co/datasets/AdaptLLM/ConvFinQA) datasets.
+1. Replace `conda` with [`uv`](https://docs.astral.sh/uv/) for faster installs.
+
 ## 2.0.27
 
 1. Ollama end to end support
@@ -99,20 +98,6 @@ Llama3 is now available on SageMaker (read [blog post](https://aws.amazon.com/bl
 1. Bug fix for missing HuggingFace token file.
 1. Config file enhancements
 
-## 2.0.25
-
-1. Fix bug with an alternate VariantName for SageMaker BYOE.
-
-## 2.0.24
-
-1. ARM benchmarking support (AWS Graviton 4).
-1. Relax IAM permission requirements for Amazon SageMaker bring your own endpoint.
-
-## 2.0.23
-
-1. Bug fixes for Amazon SageMaker BYOE.
-1. Additional config files.
-
 
 [Release history](./release_history.md)
 
@@ -148,9 +133,11 @@ You can run `FMBench` on either a SageMaker notebook or on an EC2 VM. Both optio
 1. On the `fmbench-notebook` open a Terminal and run the following commands.
 
 ```{.bash}
-conda create --name fmbench_python311 -y python=3.11 ipykernel
-source activate fmbench_python311;
-pip install -U fmbench
+curl -LsSf https://astral.sh/uv/install.sh | sh
+export PATH="$HOME/.local/bin:$PATH"
+uv venv .fmbench_python311 --python 3.11
+source .fmbench_python311/bin/activate
+uv pip install -U fmbench
 ```
 
 1. Now you are ready to `fmbench` with the following command line. We will use a sample config file placed in the S3 bucket by the CloudFormation stack for a quick first run.
````
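
As a concrete template for that first `fmbench` run, the invocation in this commit's `debug.sh` can be reused. A minimal sketch, assuming the `uv` venv created above is active; the config file path is the one `debug.sh` points at, so substitute the sample config from your S3 bucket:

```{.bash}
# Config path and log redirection mirror debug.sh in this commit;
# replace the config file with the sample one from your S3 bucket.
fmbench --config-file src/fmbench/configs/bedrock/config-nova-all-models.yml \
  --local-mode yes --write-bucket placeholder > fmbench.log 2>&1
```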

debug.sh

Lines changed: 5 additions & 44 deletions
```diff
@@ -1,51 +1,12 @@
 # script for a debug/developer workflow
-# 1. Deletes the existing pfmbench package from the conda env
-# 2. Builds and installs a new one
-# 3. Runs fmbench as usual
+# 1. Builds and install a local wheel
+# 2. There is no step 2 :)
 
-CONDA_ENV_PATH=$CONDA_PREFIX/lib/python3.11/site-packages
-CONFIG_FILE_PATH=src/fmbench/configs/bedrock/config-bedrock-llama3-1.yml
 CONFIG_FILE_PATH=src/fmbench/configs/bedrock/config-nova-all-models.yml
-#src/fmbench/configs/generic/ec2/djl.yml
-#src/fmbench/configs/llama3.1/8b/config-llama3.1-8b-g5-ec2.yml
-#src/fmbench/configs/multimodal/bedrock/config-llama-3-2-11b-vision-instruct-scienceqa.yml
-#src/fmbench/configs/multimodal/bedrock/config-llama-3-2-11b-vision-instruct-image-dataset.yml
-#src/fmbench/configs/llama3.1/8b/config-llama3.1-8b-g6e.2xl-tp-1-mc-max-djl.yml
-#config-ec2-llama3-1-8b-g6e-2xlarge-byoe-ollama.yml
-#src/fmbench/configs/bedrock/config-bedrock-llama3-2.yml
-#src/fmbench/configs/llama3.1/8b/config-llama3.1-8b-trn1-32xl-deploy-tp-8-ec2.yml
-#src/fmbench/configs/llama3/8b/config-llama3-8b-trn1-32xlarge-triton-vllm.yml
-#src/fmbench/configs/llama3/8b/config-llama3-8b-trn1-32xlarge-triton-vllm.yml
-#src/fmbench/configs/llama3/8b/config-llama3-8b-trn1-32xlarge-triton-djl.yml
-#src/fmbench/configs/llama3/8b/config-llama3-8b-trn1-32xlarge-triton-djl.yml
-#src/fmbench/configs/llama3/8b/config-llama3-8b-trn1-32xlarge-triton-vllm.yml
-#src/fmbench/configs/llama3/8b/config-llama3-8b-g5.12xl-tp-4-mc-max-djl-ec2.yml
-#src/fmbench/configs/llama3.1/8b/config-llama3.1-8b-trn32xl-triton.yml
-#config-llama3-8b-g5.12xl-tp-4-mc-max-djl-ec2.yml
-#config-llama3-8b-g5.12xl-tp-4-mc-max-triton-ec2.yml
-#src/fmbench/configs/llama3/8b/config-llama3-8b-g5.12xl-tp-2-mc-max-triton-ec2.yml
-#src/fmbench/configs/llama3/8b/config-llama3-8b-g5.12xl-tp-2-mc-max-triton-ec2.yml
-#config-llama3.1-8b-g5.24xl-tp-4-mc-max-ec2.yml
-#config-llama3.1-8b-g5.12xl-tp-4-mc-max-ec2.yml
-#config-ec2-llama3-1-8b-p5-tp-2-mc-max.yml
-#config-llama3.1-8b-trn1-32xl-deploy-tp-8-ec2.yml
-#src/fmbench/configs/llama3.1/8b/config-ec2-llama3-1-8b-p5.yml
-#src/fmbench/configs/mistral/config-mistral-v3-inf2-48xl-deploy-ec2-tp24.yml
-#src/fmbench/configs/llama3.1/8b/config-llama3.1-8b-g5.yml
-#src/fmbench/configs/llama3/8b/config-ec2-llama3-8b-m7a-16xlarge.yml
-#src/fmbench/configs/mistral/config-mistral-v3-inf2-48xl-deploy-ec2-tp24.yml
-#bedrock/config-bedrock-llama3-1-no-streaming.yml
-#src/fmbench/configs/bedrock/config-bedrock.yml
-#src/fmbench/configs/llama3/8b/config-llama3-8b-g5-streaming.yml
-#config-bedrock-llama3-streaming.yml #config-llama3-8b-g5-stream.yml
 LOGFILE=fmbench.log
 
-# delete existing install
-rm -rf $CONDA_ENV_PATH/fmbench*
-
-# build a new version
-poetry build
-pip install -U dist/*.whl
+uv build
+uv pip install -U dist/*.whl
 
 # run the newly installed version
 echo "going to run fmbench now"
@@ -56,4 +17,4 @@ fmbench --config-file $CONFIG_FILE_PATH --local-mode yes --write-bucket placeho
 # a custom tmp directory by setting the '--tmp-dir' argument followed by the path to that custom tmp directory. If '--tmp-dir' is not
 # provided, the default 'tmp' directory will be used.
 #fmbench --config-file $CONFIG_FILE_PATH --local-mode yes --write-bucket placeholder --tmp-dir /path/to/your_tmp_directory > $LOGFILE 2>&1
-echo "all done"
+echo "all done"
```

docs/announcement.md

Lines changed: 9 additions & 0 deletions
```diff
@@ -1,3 +1,12 @@
+# Release 2.1 announcement
+
+We are excited to announce some major new enhancements for `FMBench`.
+
+**Deepseek-R1 support**: The distilled version of Deepseek-R1 models are now supported for both performance benchmarking and model evaluations 🎉. You can use built in support for 4 different datasets: [`LongBench`](https://huggingface.co/datasets/THUDM/LongBench), [`Dolly`](https://huggingface.co/datasets/databricks/databricks-dolly-15k), [`OpenOrca`](https://huggingface.co/datasets/Open-Orca/OpenOrca) and [`ConvFinQA`](https://huggingface.co/datasets/AdaptLLM/finance-tasks/tree/refs%2Fconvert%2Fparquet/ConvFinQA). You can deploy the Deepseek-R1 distilled models on Amazon EC2, Amazon Bedrock or Amazon SageMaker.
+
+**Faster installs with `uv`**: We now use `uv` instead of `conda` for creating a Python environment and installing dependencies for `FMBench`.
+
+
 # Release 2.0 announcement
 
 We are excited to share news about a major FMBench release, we now have release 2.0 for FMBench that supports model evaluations through a panel of LLM evaluators🎉. With the recent feature additions to FMBench we are already seeing increased interest from customers and hope to reach even more customers and have an even greater impact. Check out all the latest and greatest features from FMBench on the FMBench website.
```
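
The notes above say the Deepseek-R1 distilled checkpoints are benchmarked via `vllm`. As a rough illustration of that serving layer only (this is not FMBench's internal deployment code; the model id and port are assumptions), such a checkpoint can be stood up with vllm's OpenAI-compatible server:

```{.bash}
# Illustrative sketch, not FMBench's deployment path: serve a
# Deepseek-R1 distilled checkpoint behind vllm's OpenAI-compatible API.
uv pip install vllm
python -m vllm.entrypoints.openai.api_server \
    --model deepseek-ai/DeepSeek-R1-Distill-Llama-8B \
    --port 8000
```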
