[feature]: add spargeattn search in infer stage #36
Conversation
Pull Request Overview
This PR introduces support for sparse attention search in the infer stage via new configuration options and corresponding code updates.
- Added a new shell script to run the sparge tuning job.
- Updated transformer weight loading logic to support sparse tuning.
- Enhanced the infer stage to extract and save sparse attention parameters.
Reviewed Changes
Copilot reviewed 4 out of 4 changed files in this pull request and generated 1 comment.
| File | Description |
|---|---|
| scripts/run_wan_t2v_sparge_tune.sh | Added a new bash script to run inference with sparge tuning. |
| lightx2v/models/networks/wan/weights/transformer_weights.py | Introduced a new flag (sparge_tune) and modified weight-loading logic based on its value. |
| lightx2v/models/networks/wan/model.py | Implemented a final infer stage that extracts and saves sparse attention parameters when tuning is enabled. |
| configs/wan_t2v_sparge_tune.json | Provided configuration options for enabling sparge and sparse tuning (sketched below). |
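For reference, a minimal sketch of what such a configuration could contain. Only the "sparge" and "sparse_tune" keys are named in the reviews; the checkpoint key and its value are hypothetical placeholders:

```json
{
  "sparge": true,
  "sparse_tune": true,
  "sparge_ckpt": "/path/to/tuned_sparse_params.pt"
}
```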
Comments suppressed due to low confidence (1)
lightx2v/models/networks/wan/model.py:196
- Ensure that 'SparseAttentionMeansim' is properly imported in this module, as it is used for type checking but not visible in the diff.
if isinstance(v, SparseAttentionMeansim):
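A minimal sketch of the fix this comment suggests; the module path below assumes the class ships with the upstream SpargeAttn package and may differ from the project's actual layout:

```python
# Assumed import path: SparseAttentionMeansim comes from the SpargeAttn
# project (spas_sage_attn); this repo may re-export it elsewhere.
from spas_sage_attn.autotune import SparseAttentionMeansim
```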
Pull Request Overview
This PR adds support for sparge attention search during inference, including the ability to tune sparse attention weights. Key changes include:
- A new shell script (scripts/run_wan_t2v_sparge_tune.sh) for running inference with the new feature.
- Updates in transformer_weights.py and model.py to load and optionally tune sparge attention weights.
- A new configuration file (configs/wan_t2v_sparge_tune.json) to enable and control these features.
Reviewed Changes
Copilot reviewed 4 out of 4 changed files in this pull request and generated no comments.
| File | Description |
|---|---|
| scripts/run_wan_t2v_sparge_tune.sh | New script to set up the environment and run inference. |
| lightx2v/models/networks/wan/weights/transformer_weights.py | Added handling for sparse tuning mode and environment variable setup. |
| lightx2v/models/networks/wan/model.py | Integrated saving of tuned sparse attention parameters during inference. |
| configs/wan_t2v_sparge_tune.json | New configuration to enable sparge attention search with tuning. |
Comments suppressed due to low confidence (2)
lightx2v/models/networks/wan/weights/transformer_weights.py:30
- [nitpick] Consider aligning the naming for the tuning flag (e.g., using 'sparge_tune') to more clearly relate to the 'sparge' flag, as the current mix of 'sparge' and 'sparse_tune' may be confusing.
self.sparge_tune = config.get("sparse_tune", False)
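One way to act on this nitpick while keeping older configs working; the helper below is an illustrative sketch, not the PR's code:

```python
def read_tune_flag(config: dict) -> bool:
    # Prefer a "sparge_tune" key that matches the attribute name, but fall
    # back to the legacy "sparse_tune" key so existing configs keep working.
    return bool(config.get("sparge_tune", config.get("sparse_tune", False)))

assert read_tune_flag({"sparge_tune": True}) is True
assert read_tune_flag({"sparse_tune": True}) is True
assert read_tune_flag({}) is False
```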
lightx2v/models/networks/wan/model.py:196
- The class 'SparseAttentionMeansim' is referenced but not imported; please ensure that an appropriate import statement is added if it is defined in another module.
if isinstance(v, SparseAttentionMeansim):
Pull Request Overview
This PR introduces support for sparge attention search in the inference stage of the model.
- Added a new shell script for running the sparge attention tune.
- Updated transformer weights and model inference to support a tuning mode for sparsified attention.
- Added a new JSON configuration file with relevant parameters.
Reviewed Changes
Copilot reviewed 4 out of 4 changed files in this pull request and generated no comments.
| File | Description |
|---|---|
| scripts/run_wan_t2v_sparge_tune.sh | New shell script to run inference with sparge attention tuning, managing environment variables and input paths. |
| lightx2v/models/networks/wan/weights/transformer_weights.py | Extended to include a 'sparse_tune' flag to determine weight loading behavior for sparge attention tuning. |
| lightx2v/models/networks/wan/model.py | Enhanced the infer stage to save weights when tuning mode is enabled using the 'sparse_tune' flag from the configuration (see the sketch after this table). |
| configs/wan_t2v_sparge_tune.json | Added configuration entries for enabling sparge attention support and tuning, including checkpoint specification. |
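A minimal sketch of the save step described in the model.py row above, assuming the tuned modules sit as attributes on a weights object (the function name, traversal, and output format are assumptions, not the PR's code):

```python
import torch
from spas_sage_attn.autotune import SparseAttentionMeansim  # assumed path


def save_sparge_params(weights_obj, out_path: str) -> None:
    """Persist the tuned sparse-attention parameters found on weights_obj.

    Illustrative only; the PR's actual traversal and file format are not
    visible in this review.
    """
    tuned = {}
    for name, value in vars(weights_obj).items():
        if isinstance(value, SparseAttentionMeansim):
            # state_dict() captures the per-module tuned sparse parameters.
            tuned[name] = value.state_dict()
    torch.save(tuned, out_path)
```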
Comments suppressed due to low confidence (2)
lightx2v/models/networks/wan/weights/transformer_weights.py:30
- There is slight inconsistency in naming between 'sparge' and 'sparse_tune'; consider aligning the naming convention across the codebase for clarity.
self.sparge_tune = config.get("sparse_tune", False)
lightx2v/models/networks/wan/model.py:193
- The use of 'sparse_tune' here together with 'sparge' in other parts may cause confusion; consider unifying the naming to improve code clarity.
self.sparge_tune = self.config.get("sparse_tune", False)
This commit adds sparge attention (SpargeAttn) search support.