[Feature] [Optimum] [Intel] [OpenVINO] Add OpenVINO backend support through Optimum-Intel #454
Conversation
Codecov Report
Attention: Patch coverage is

@@            Coverage Diff             @@
##             main     #454      +/-   ##
==========================================
- Coverage   79.18%   78.20%   -0.99%
==========================================
  Files           41       41
  Lines         3248     3308      +60
==========================================
+ Hits          2572     2587      +15
- Misses         676      721      +45
PR Summary
This PR adds OpenVINO backend support through Optimum-Intel integration, enabling optimized inference on Intel hardware with BF16 precision.
- Added OpenVINO execution provider in primitives.py and integrated model loading/optimization in utils_optimum.py with hardcoded BF16 precision
- Implemented OpenVINO model file handling and caching in utils_optimum.py with a get_openvino_files() function
- Added a CHECK_OPTIMUM_INTEL dependency check in _optional_imports.py for graceful fallback
- Modified Docker.template.yaml and added Dockerfile.intel_auto to support OpenVINO builds
- Updated the OptimumEmbedder class to handle OpenVINO model loading while maintaining compatibility with the existing ONNX runtime
7 file(s) reviewed, 13 comment(s)
| # "RUN poetry install --no-interaction --no-ansi --no-root --extras \"${EXTRAS}\" --without lint,test && poetry cache clear pypi --all" | ||
| COPY requirements_install_from_poetry.sh requirements_install_from_poetry.sh | ||
| RUN ./requirements_install_from_poetry.sh --no-root --without lint,test "https://download.pytorch.org/whl/cpu" | ||
| RUN poetry run python -m pip install --upgrade --upgrade-strategy eager "optimum[openvino]" |
style: Consider pinning the optimum[openvino] version to ensure reproducible builds
    RUN ./requirements_install_from_poetry.sh --no-root --without lint,test "https://download.pytorch.org/whl/cpu"
    RUN poetry run python -m pip install --upgrade --upgrade-strategy eager "optimum[openvino]"
logic: Installing optimum[openvino] after requirements_install_from_poetry.sh may override dependency versions. Consider integrating this into the requirements script
    RUN ./requirements_install_from_poetry.sh --no-root --without lint,test "https://download.pytorch.org/whl/cpu"
    RUN poetry run python -m pip install --upgrade --upgrade-strategy eager "optimum[openvino]"
logic: Installing optimum[openvino] after poetry install could override poetry-managed dependencies. Consider adding optimum[openvino] to pyproject.toml instead.
    RUN ./requirements_install_from_poetry.sh --without lint,test "https://download.pytorch.org/whl/cpu"
    RUN poetry run python -m pip install --upgrade --upgrade-strategy eager "optimum[openvino]"
logic: Redundant installation of optimum[openvino] - this package was already installed in the previous stage
    RUN ./requirements_install_from_poetry.sh --with lint,test "https://download.pytorch.org/whl/cpu"
    RUN poetry run python -m pip install --upgrade --upgrade-strategy eager "optimum[openvino]"
logic: Third redundant installation of optimum[openvino] - consider consolidating into a single installation in base image
    except Exception as e:  # show error then let the optimum intel compress on the fly
        print(str(e))
logic: printing error to stdout could mask critical failures. Consider proper error logging or propagating the exception
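A minimal sketch of the suggested fix, assuming a module-level logger (the loader below is a placeholder, not the PR's actual code):

    import logging

    logger = logging.getLogger(__name__)

    def load_precompressed_model(path: str):
        # Placeholder for the PR's actual loading step.
        raise FileNotFoundError(path)

    try:
        model = load_precompressed_model("./model_openvino.xml")
    except Exception:
        # logger.exception records the message plus the full traceback at ERROR
        # level, so the failure stays visible instead of vanishing on stdout.
        logger.exception(
            "Loading pre-compressed model failed; letting optimum-intel compress on the fly"
        )
        model = None  # continue into the on-the-fly compression path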
    if provider == "OpenVINOExecutionProvider":
        CHECK_OPTIMUM_INTEL.mark_required()
        filename = ""
logic: empty filename could cause issues if get_openvino_files fails. Consider setting a default model path or handling this case explicitly
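One hedged sketch of explicit handling; the helper below stands in for the PR's get_openvino_files(), whose exact signature is not shown here:

    from pathlib import Path

    def resolve_openvino_filename(model_dir: Path) -> str:
        # OpenVINO IR models ship as .xml (graph) plus .bin (weights);
        # matching on .xml is an assumption about the PR's file pattern.
        openvino_files = sorted(model_dir.glob("**/*.xml"))
        if not openvino_files:
            # Fail fast with a clear message instead of passing "" downstream.
            raise FileNotFoundError(
                f"No OpenVINO IR (.xml) files found under {model_dir}; "
                "export the model first or choose another provider."
            )
        return str(openvino_files[-1])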
            use_auth_token=True,
            prefer_quantized="cpu" in provider.lower(),
        )
    elif provider == "CPUExecutionProvider":
logic: missing else clause for unsupported providers could lead to undefined model state
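One way to close that gap, sketched against the provider names visible in the diff (the return values are stand-ins):

    def load_for_provider(provider: str) -> str:
        if provider == "OpenVINOExecutionProvider":
            return "openvino model"  # stand-in for the PR's OpenVINO path
        elif provider == "CPUExecutionProvider":
            return "onnx model"      # stand-in for the existing ONNX path
        else:
            # Reject anything else explicitly so the model can never be left undefined.
            raise ValueError(f"Unsupported execution provider: {provider}")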
    if files_optimized:
        file_optimized = files_optimized[-1]
    if file_name:
        file_optimized = file_name
logic: file_name overrides files_optimized[-1] without checking if file_name exists, could cause errors if file_name is invalid
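A sketch of the validation this comment asks for, reusing the variable names from the diff (the sample values are illustrative):

    from pathlib import Path

    files_optimized = [Path("model.onnx"), Path("model_optimized.onnx")]  # example values
    file_name = "model_optimized.onnx"  # example user override

    if file_name:
        # Honor an explicit file_name only if it matches a discovered file.
        matches = [p for p in files_optimized if p.name == file_name]
        if not matches:
            raise FileNotFoundError(
                f"Requested {file_name!r} not among optimized files: {files_optimized}"
            )
        file_optimized = matches[-1]
    elif files_optimized:
        file_optimized = files_optimized[-1]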
    openvino_files = [p for p in repo_files if p.match(pattern)]

    if len(openvino_files) > 1:
        logger.info(f"Found {len(openvino_files)} onnx files: {openvino_files}")
syntax: log message incorrectly refers to 'onnx files' when listing OpenVINO files
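The fix is a one-word change to the message:

    logger.info(f"Found {len(openvino_files)} OpenVINO files: {openvino_files}")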
    except Exception as e:  # show error then let the optimum intel compress on the fly
        print(str(e))

    self.model = optimize_model(
@michaelfeil you can just load your model with OVModelForFeatureExtraction (if the model wasn't already exported to the OV IR, optimum will load the pytorch model and export it on-the-fly; if already converted, it will just load the OV model) without calling optimize_model, which seems to call ORTOptimizer (which should be used for ONNX models only). OpenVINO conversion can also be done from an ONNX model (https://github.com/huggingface/optimum-intel/blob/6dbc59eb80ba7eee9d347d03f3b737fc54b46e5d/optimum/intel/openvino/modeling_base.py#L347), but this is usually not recommended and the feature will likely be removed from optimum-intel in the future. Let us know if we can help on this integration!
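For reference, a minimal sketch of the loading path the reviewer describes (the model id is an arbitrary example, not from the PR):

    from optimum.intel import OVModelForFeatureExtraction

    # export=True converts the PyTorch checkpoint to OpenVINO IR on the fly;
    # omit it when the repository already ships OV IR files.
    model = OVModelForFeatureExtraction.from_pretrained(
        "BAAI/bge-small-en-v1.5",  # example model id
        export=True,
    )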
Description
This PR integrates the OpenVINO backend into Infinity's OptimumEmbedder class via the optimum-intel library.
Related Issue
If applicable, link the issue this PR addresses.
Types of Change
Checklist
Additional Notes
There are multiple inference precisions that can be specified in libs/infinity_emb/infinity_emb/transformer/utils_optimum.py. The inference precision hint is hardcoded to bf16 because it offers the fastest inference speed. We have also performed an MTEB evaluation (bank classification dataset) on the INT4 weight-only quantized model with BF16 inference precision; the drop in accuracy is just 0.71%. Based on the speed/accuracy tradeoff as well as ease of use, we think that settling on a single effective configuration could enhance the user experience of infinity_emb.
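For context, the hint named above is the standard OpenVINO runtime property INFERENCE_PRECISION_HINT; a hedged sketch of how such a hint is typically passed through optimum-intel (the model id is an example, not the PR's default):

    from optimum.intel import OVModelForFeatureExtraction

    model = OVModelForFeatureExtraction.from_pretrained(
        "BAAI/bge-small-en-v1.5",  # example model id
        ov_config={"INFERENCE_PRECISION_HINT": "bf16"},  # the value this PR hardcodes
    )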
License
By submitting this PR, I confirm that my contribution is made under the terms of the MIT license.