Commit 7540bba

Merge pull request #279 from aarora79/main
Deepseek and uv changes
2 parents 803abcc + 0dde127

File tree

336 files changed: +5754 -2757 lines changed


.gitignore

Lines changed: 3 additions & 0 deletions
```diff
@@ -1,4 +1,7 @@
 __pycache__/
+EC2_system_metrics.csv
+.fmbench_python311
+fmbench.egg-info/
 analytics/comparison.json
 site
 *.zip
```

MANIFEST.in

Lines changed: 3 additions & 0 deletions
```diff
@@ -0,0 +1,3 @@
+recursive-include fmbench *
+include LICENSE
+include README.md
```
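
One way to sanity-check these new `MANIFEST.in` directives is to build the distributions (this commit's `debug.sh` now uses `uv build`) and list the sdist contents. A minimal sketch, assuming the project's build backend honors `MANIFEST.in`; the grep pattern is illustrative:

```{.bash}
# Build sdist and wheel into dist/ (same command debug.sh now uses).
uv build

# Inspect the sdist and confirm LICENSE, README.md and the fmbench
# package files were included (pattern is illustrative, not exhaustive).
tar tzf dist/fmbench-*.tar.gz | grep -E 'LICENSE|README\.md|fmbench/'
```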

README.md

Lines changed: 13 additions & 26 deletions
````diff
@@ -14,7 +14,7 @@
 </h4>
 
 
-🚨 **What's new**: Benchmark the [Qwen2.5-72b](https://huggingface.co/Qwen/Qwen2.5-72B) on Amazon EC2 and the newest [llama3-3-70b](https://aws.amazon.com/about-aws/whats-new/2024/12/metas-llama-3-3-70b-model-amazon-bedrock/) model on Amazon Bedrock. Use more simplified versions of configuration files, view more [here](https://github.com/aws-samples/foundation-model-benchmarking-tool/blob/main/docs/simplified_config_files.md).
+🚨 **What's new**: Benchmarks for [Deepseek-R1](https://github.com/deepseek-ai/DeepSeek-R1) models on Amazon EC2 and Amazon SageMaker. Faster setup with `uv` for Python venv and dependency installation. 🚨
 
 `FMBench` is a Python package for running performance benchmarks and accuracy for **any Foundation Model (FM)** deployed on **any AWS Generative AI service**, be it **Amazon SageMaker**, **Amazon Bedrock**, **Amazon EKS**, or **Amazon EC2**. The FMs could be deployed on these platforms either directly through `FMbench`, or, if they are already deployed then also they could be benchmarked through the **Bring your own endpoint** mode supported by `FMBench`.
 
@@ -48,19 +48,12 @@ Use `FMBench` to determine model accuracy using a panel of LLM evaluators (PoLL
 
 Configuration files are available in the [configs](./src/fmbench/configs) folder for the following models in this repo.
 
-### Llama3 on Amazon SageMaker
-
-Llama3 is now available on SageMaker (read [blog post](https://aws.amazon.com/blogs/machine-learning/meta-llama-3-models-are-now-available-in-amazon-sagemaker-jumpstart/)), and you can now benchmark it using `FMBench`. Here are the config files for benchmarking `Llama3-8b-instruct` and `Llama3-70b-instruct` on `ml.p4d.24xlarge`, `ml.inf2.24xlarge` and `ml.g5.12xlarge` instances.
-
-- [Config file](https://github.com/aws-samples/foundation-model-benchmarking-tool/blob/main/src/fmbench/configs/llama3/8b/config-llama3-8b-instruct-g5-p4d.yml) for `Llama3-8b-instruct` on `ml.p4d.24xlarge` and `ml.g5.12xlarge`.
-- [Config file](https://github.com/aws-samples/foundation-model-benchmarking-tool/blob/main/src/fmbench/configs/llama3/70b/config-llama3-70b-instruct-g5-p4d.yml) for `Llama3-70b-instruct` on `ml.p4d.24xlarge` and `ml.g5.48xlarge`.
-- [Config file](https://github.com/aws-samples/foundation-model-benchmarking-tool/blob/main/src/fmbench/configs/llama3/8b/config-llama3-8b-inf2-g5.yml) for `Llama3-8b-instruct` on `ml.inf2.24xlarge` and `ml.g5.12xlarge`.
-
 ### Full list of benchmarked models
 
 
 | Model | Amazon EC2 | Amazon SageMaker | Amazon Bedrock |
 |:--------------------------------|:-------------------------------|:-------------------------------------------|:-----------------------------------|
+| **Deepseek-R1 distilled** | g6e | g6e | |
 | **Llama3.3-70b instruct** | | | On-demand |
 | **Qwen2.5-72b** | g5, g6e | | |
 | **Amazon Nova** | | | On-demand |
@@ -90,6 +83,12 @@ Llama3 is now available on SageMaker (read [blog post](https://aws.amazon.com/bl
 
 ## New in this release
 
+## 2.1.0
+
+1. Deepseek-R1 distilled model support using [`vllm`](https://github.com/vllm-project/vllm).
+1. Evaluate Deepseek performance with `LongBench`, `OpenOrca`, `Dolly` and [`ConvFinQA`](https://huggingface.co/datasets/AdaptLLM/ConvFinQA) datasets.
+1. Replace `conda` with [`uv`](https://docs.astral.sh/uv/) for faster installs.
+
 ## 2.0.27
 
 1. Ollama end to end support
@@ -99,20 +98,6 @@ Llama3 is now available on SageMaker (read [blog post](https://aws.amazon.com/bl
 1. Bug fix for missing HuggingFace token file.
 1. Config file enhancements
 
-## 2.0.25
-
-1. Fix bug with an alternate VariantName for SageMaker BYOE.
-
-## 2.0.24
-
-1. ARM benchmarking support (AWS Graviton 4).
-1. Relax IAM permission requirements for Amazon SageMaker bring your own endpoint.
-
-## 2.0.23
-
-1. Bug fixes for Amazon SageMaker BYOE.
-1. Additional config files.
-
 
 [Release history](./release_history.md)
 
@@ -148,9 +133,11 @@ You can run `FMBench` on either a SageMaker notebook or on an EC2 VM. Both optio
 1. On the `fmbench-notebook` open a Terminal and run the following commands.
 
 ```{.bash}
-conda create --name fmbench_python311 -y python=3.11 ipykernel
-source activate fmbench_python311;
-pip install -U fmbench
+curl -LsSf https://astral.sh/uv/install.sh | sh
+export PATH="$HOME/.local/bin:$PATH"
+uv venv .fmbench_python311 --python 3.11
+source .fmbench_python311/bin/activate
+uv pip install -U fmbench
 ```
 
 1. Now you are ready to `fmbench` with the following command line. We will use a sample config file placed in the S3 bucket by the CloudFormation stack for a quick first run.
````
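
As a concrete template for that first `fmbench` run, the invocation in this commit's `debug.sh` can be reused. A minimal sketch, assuming the `uv` venv created above is active; the config file path is the one `debug.sh` points at, so substitute the sample config from your S3 bucket:

```{.bash}
# Config path and log redirection mirror debug.sh in this commit;
# replace the config file with the sample one from your S3 bucket.
fmbench --config-file src/fmbench/configs/bedrock/config-nova-all-models.yml \
  --local-mode yes --write-bucket placeholder > fmbench.log 2>&1
```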

debug.sh

Lines changed: 5 additions & 44 deletions
```diff
@@ -1,51 +1,12 @@
 # script for a debug/developer workflow
-# 1. Deletes the existing pfmbench package from the conda env
-# 2. Builds and installs a new one
-# 3. Runs fmbench as usual
+# 1. Builds and install a local wheel
+# 2. There is no step 2 :)
 
-CONDA_ENV_PATH=$CONDA_PREFIX/lib/python3.11/site-packages
-CONFIG_FILE_PATH=src/fmbench/configs/bedrock/config-bedrock-llama3-1.yml
 CONFIG_FILE_PATH=src/fmbench/configs/bedrock/config-nova-all-models.yml
-#src/fmbench/configs/generic/ec2/djl.yml
-#src/fmbench/configs/llama3.1/8b/config-llama3.1-8b-g5-ec2.yml
-#src/fmbench/configs/multimodal/bedrock/config-llama-3-2-11b-vision-instruct-scienceqa.yml
-#src/fmbench/configs/multimodal/bedrock/config-llama-3-2-11b-vision-instruct-image-dataset.yml
-#src/fmbench/configs/llama3.1/8b/config-llama3.1-8b-g6e.2xl-tp-1-mc-max-djl.yml
-#config-ec2-llama3-1-8b-g6e-2xlarge-byoe-ollama.yml
-#src/fmbench/configs/bedrock/config-bedrock-llama3-2.yml
-#src/fmbench/configs/llama3.1/8b/config-llama3.1-8b-trn1-32xl-deploy-tp-8-ec2.yml
-#src/fmbench/configs/llama3/8b/config-llama3-8b-trn1-32xlarge-triton-vllm.yml
-#src/fmbench/configs/llama3/8b/config-llama3-8b-trn1-32xlarge-triton-vllm.yml
-#src/fmbench/configs/llama3/8b/config-llama3-8b-trn1-32xlarge-triton-djl.yml
-#src/fmbench/configs/llama3/8b/config-llama3-8b-trn1-32xlarge-triton-djl.yml
-#src/fmbench/configs/llama3/8b/config-llama3-8b-trn1-32xlarge-triton-vllm.yml
-#src/fmbench/configs/llama3/8b/config-llama3-8b-g5.12xl-tp-4-mc-max-djl-ec2.yml
-#src/fmbench/configs/llama3.1/8b/config-llama3.1-8b-trn32xl-triton.yml
-#config-llama3-8b-g5.12xl-tp-4-mc-max-djl-ec2.yml
-#config-llama3-8b-g5.12xl-tp-4-mc-max-triton-ec2.yml
-#src/fmbench/configs/llama3/8b/config-llama3-8b-g5.12xl-tp-2-mc-max-triton-ec2.yml
-#src/fmbench/configs/llama3/8b/config-llama3-8b-g5.12xl-tp-2-mc-max-triton-ec2.yml
-#config-llama3.1-8b-g5.24xl-tp-4-mc-max-ec2.yml
-#config-llama3.1-8b-g5.12xl-tp-4-mc-max-ec2.yml
-#config-ec2-llama3-1-8b-p5-tp-2-mc-max.yml
-#config-llama3.1-8b-trn1-32xl-deploy-tp-8-ec2.yml
-#src/fmbench/configs/llama3.1/8b/config-ec2-llama3-1-8b-p5.yml
-#src/fmbench/configs/mistral/config-mistral-v3-inf2-48xl-deploy-ec2-tp24.yml
-#src/fmbench/configs/llama3.1/8b/config-llama3.1-8b-g5.yml
-#src/fmbench/configs/llama3/8b/config-ec2-llama3-8b-m7a-16xlarge.yml
-#src/fmbench/configs/mistral/config-mistral-v3-inf2-48xl-deploy-ec2-tp24.yml
-#bedrock/config-bedrock-llama3-1-no-streaming.yml
-#src/fmbench/configs/bedrock/config-bedrock.yml
-#src/fmbench/configs/llama3/8b/config-llama3-8b-g5-streaming.yml
-#config-bedrock-llama3-streaming.yml #config-llama3-8b-g5-stream.yml
 LOGFILE=fmbench.log
 
-# delete existing install
-rm -rf $CONDA_ENV_PATH/fmbench*
-
-# build a new version
-poetry build
-pip install -U dist/*.whl
+uv build
+uv pip install -U dist/*.whl
 
 # run the newly installed version
 echo "going to run fmbench now"
@@ -56,4 +17,4 @@ fmbench --config-file $CONFIG_FILE_PATH --local-mode yes --write-bucket placeho
 # a custom tmp directory by setting the '--tmp-dir' argument followed by the path to that custom tmp directory. If '--tmp-dir' is not
 # provided, the default 'tmp' directory will be used.
 #fmbench --config-file $CONFIG_FILE_PATH --local-mode yes --write-bucket placeholder --tmp-dir /path/to/your_tmp_directory > $LOGFILE 2>&1
-echo "all done"
+echo "all done"
```

docs/announcement.md

Lines changed: 9 additions & 0 deletions
```diff
@@ -1,3 +1,12 @@
+# Release 2.1 announcement
+
+We are excited to announce some major new enhancements for `FMBench`.
+
+**Deepseek-R1 support**: The distilled version of Deepseek-R1 models are now supported for both performance benchmarking and model evaluations 🎉. You can use built in support for 4 different datasets: [`LongBench`](https://huggingface.co/datasets/THUDM/LongBench), [`Dolly`](https://huggingface.co/datasets/databricks/databricks-dolly-15k), [`OpenOrca`](https://huggingface.co/datasets/Open-Orca/OpenOrca) and [`ConvFinQA`](https://huggingface.co/datasets/AdaptLLM/finance-tasks/tree/refs%2Fconvert%2Fparquet/ConvFinQA). You can deploy the Deepseek-R1 distilled models on Amazon EC2, Amazon Bedrock or Amazon SageMaker.
+
+**Faster installs with `uv`**: We now use `uv` instead of `conda` for creating a Python environment and installing dependencies for `FMBench`.
+
+
 # Release 2.0 announcement
 
 We are excited to share news about a major FMBench release, we now have release 2.0 for FMBench that supports model evaluations through a panel of LLM evaluators🎉. With the recent feature additions to FMBench we are already seeing increased interest from customers and hope to reach even more customers and have an even greater impact. Check out all the latest and greatest features from FMBench on the FMBench website.
```
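
The notes above say the Deepseek-R1 distilled checkpoints are benchmarked via `vllm`. As a rough illustration of that serving layer only (this is not FMBench's internal deployment code; the model id and port are assumptions), such a checkpoint can be stood up with vllm's OpenAI-compatible server:

```{.bash}
# Illustrative sketch, not FMBench's deployment path: serve a
# Deepseek-R1 distilled checkpoint behind vllm's OpenAI-compatible API.
uv pip install vllm
python -m vllm.entrypoints.openai.api_server \
    --model deepseek-ai/DeepSeek-R1-Distill-Llama-8B \
    --port 8000
```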
