Conversation

@aryasaatvik

Related Issue

Fixes import errors when using the --no-bettertransformer flag with transformers >= 4.49.

Summary

This PR implements lazy imports for BetterTransformer to prevent import errors when it's disabled or when using incompatible transformers versions. BetterTransformer is deprecated in optimum and requires transformers < 4.49.

Changes:

  • Replaced eager imports with lazy loading pattern
  • Added _import_bettertransformer() function for on-demand imports
  • Wrapped imports in try/except to handle version conflicts gracefully
  • Returns early when --no-bettertransformer is used, avoiding unnecessary imports
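The pattern described above can be sketched as follows. The function name _import_bettertransformer matches this PR, but the wrapper apply_acceleration, its signature, and the error message are illustrative, not the actual acceleration.py code:

```python
from functools import lru_cache
from typing import Any


@lru_cache(maxsize=1)
def _import_bettertransformer() -> Any:
    """Import BetterTransformer only on first use, so merely loading
    this module never triggers the optimum/transformers import."""
    try:
        from optimum.bettertransformer import BetterTransformer
    except ImportError as err:
        raise ImportError(
            "BetterTransformer requires optimum and transformers < 4.49; "
            "run with --no-bettertransformer or pin transformers."
        ) from err
    return BetterTransformer


def apply_acceleration(model: Any, use_bettertransformer: bool = True) -> Any:
    # Early return: with --no-bettertransformer nothing is imported at all,
    # so incompatible transformers versions never cause an ImportError here.
    if not use_bettertransformer:
        return model
    BetterTransformer = _import_bettertransformer()
    return BetterTransformer.transform(model)
```

With this structure, users on transformers >= 4.49 who pass --no-bettertransformer hit the early return and the deprecated dependency is never touched.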

Why this is needed:

  • BetterTransformer is deprecated and incompatible with transformers >= 4.49
  • Users with modern transformers versions get import errors even with --no-bettertransformer
  • This allows the flag to work as intended

Checklist

  • I have read the CONTRIBUTING guidelines.
  • I have added tests to cover my changes.
  • I have updated the documentation (docs folder) accordingly.

Additional Notes

This change ensures backward compatibility for users who still want BetterTransformer while fixing the issue for users with newer dependencies.

@aryasaatvik aryasaatvik changed the title Fix: Implement lazy imports for BetterTransformer to handle version conflicts fix: Implement lazy imports for BetterTransformer Jul 10, 2025
@aryasaatvik aryasaatvik changed the title fix: Implement lazy imports for BetterTransformer fix: implement lazy imports for BetterTransformer Jul 10, 2025

@greptile-apps greptile-apps bot left a comment


PR Summary

Implements lazy loading pattern for BetterTransformer in libs/infinity_emb/infinity_emb/transformer/acceleration.py to resolve import conflicts with newer transformers versions (>=4.49).

  • Introduced _import_bettertransformer() function for on-demand imports of BetterTransformer components
  • Added version compatibility checks for transformers library
  • Wrapped BetterTransformer imports in try/except blocks for graceful failure handling
  • Improved early return logic when BetterTransformer is disabled via --no-bettertransformer flag

1 file reviewed, 1 comment

)
        try:
-           model = BetterTransformer.transform(model)
+           model = BetterTransformer.transform(model)  # type: ignore


style: The type: ignore comment should explain why the type is being ignored, e.g. # type: ignore[attr-defined], since BetterTransformer could be None.
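For illustration only, a narrowed ignore in a lazy-import setting might look like this; the names below are hypothetical, and [attr-defined] is the reviewer's suggested error code (the exact mypy code depends on how BetterTransformer is typed):

```python
from typing import Any, Optional

# Hypothetical: a lazy import leaves this as None when unavailable.
BetterTransformer: Optional[Any] = None


def apply_bettertransformer(model: Any) -> Any:
    if BetterTransformer is None:
        return model  # --no-bettertransformer, or the import failed
    # Narrowed and documented ignore instead of a bare `# type: ignore`:
    return BetterTransformer.transform(model)  # type: ignore[attr-defined]
```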

@codecov-commenter

codecov-commenter commented Aug 27, 2025

⚠️ Please install the Codecov GitHub app to ensure uploads and comments are reliably processed by Codecov.

Codecov Report

❌ Patch coverage is 72.22222% with 5 lines in your changes missing coverage. Please review.
✅ Project coverage is 79.54%. Comparing base (f84d8e2) to head (947044d).

Files with missing lines Patch % Lines
...inity_emb/infinity_emb/transformer/acceleration.py 72.22% 5 Missing ⚠️
Additional details and impacted files
@@           Coverage Diff           @@
##             main     #619   +/-   ##
=======================================
  Coverage   79.54%   79.54%           
=======================================
  Files          43       43           
  Lines        3486     3501   +15     
=======================================
+ Hits         2773     2785   +12     
- Misses        713      716    +3     

☔ View full report in Codecov by Sentry.


@michaelfeil michaelfeil left a comment


@wirthual can you check this out? Maybe we should vendor / maintain bettertransformer. Bettertransformer is ~50% faster than regular torch.

@wirthual

Absolutely. According to this issue:

please use transformers' attention implementation: https://huggingface.co/docs/transformers/main/en/llm_optims#attention
and torch.compile (with static cache if decoder): https://huggingface.co/docs/transformers/main/en/llm_optims#static-kv-cache-and-torchcompile for the best possible performance (exceeding bettertransformer, which no one maintains! 💀).

What are your thoughts on this method vs. continuing to support the current bettertransformer?
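The replacement path quoted above can be sketched with torch's built-in fused attention kernel, which is the same speedup path BetterTransformer used to provide. The snippet below is illustrative only: it calls the kernel directly on toy tensors, while in transformers the equivalent would be selecting the backend at load time, e.g. AutoModel.from_pretrained(model_id, attn_implementation="sdpa"), optionally followed by torch.compile(model).

```python
import torch
import torch.nn.functional as F

# Toy tensors shaped (batch, heads, seq_len, head_dim).
q = torch.randn(2, 4, 8, 16)
k = torch.randn(2, 4, 8, 16)
v = torch.randn(2, 4, 8, 16)

# Fused scaled-dot-product attention; transformers dispatches to this
# kernel when attn_implementation="sdpa" is selected.
out = F.scaled_dot_product_attention(q, k, v)
print(tuple(out.shape))  # (2, 4, 8, 16)
```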

@wirthual

@michaelfeil One option to keep bettertransformer as it was is #641, which extracts the bettertransformer code into its own package. This should get rid of the version error while keeping the original bettertransformer code. Not sure if/how long this approach can keep working.

@matfax matfax mentioned this pull request Aug 29, 2025