
[feature]: add spargeattn search in infer stage #36


Open · zhiwei-dong wants to merge 4 commits into base: main

Conversation

zhiwei-dong (Contributor):

This commit adds sparge attn search support.

zhiwei-dong requested a review from Copilot on May 11, 2025 at 17:09.

Copilot AI left a comment:

Pull Request Overview

This PR introduces support for sparse attention search in the infer stage via new configuration options and corresponding code updates.

  • Added a new shell script to run the sparge tuning job.
  • Updated transformer weight loading logic to support sparse tuning.
  • Enhanced the infer stage to extract and save sparse attention parameters.

Reviewed Changes

Copilot reviewed 4 out of 4 changed files in this pull request and generated 1 comment.

  • scripts/run_wan_t2v_sparge_tune.sh: Added a new bash script to run inference with sparge tuning.
  • lightx2v/models/networks/wan/weights/transformer_weights.py: Introduced a new flag (sparge_tune) and modified the weight-loading logic based on its value.
  • lightx2v/models/networks/wan/model.py: Implemented a final infer stage that extracts and saves sparse attention parameters when tuning is enabled.
  • configs/wan_t2v_sparge_tune.json: Provided configuration options for enabling sparge and sparse tuning (see the config sketch after this list).
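
The configuration file itself is not shown on this page, but the review comments quote two of its keys, "sparge" and "sparse_tune". A minimal sketch of how such a config might be loaded and read, assuming plain JSON and only the keys named in the reviews:

```python
import json

# Minimal sketch: only "sparge" and "sparse_tune" are confirmed by the
# review comments below; everything else about the file is assumed.
with open("configs/wan_t2v_sparge_tune.json") as f:
    config = json.load(f)

sparge = config.get("sparge", False)            # enable sparge attention
sparse_tune = config.get("sparse_tune", False)  # enable tuning during infer
```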
Comments suppressed due to low confidence (1)

lightx2v/models/networks/wan/model.py:196

  • Ensure that 'SparseAttentionMeansim' is properly imported in this module, as it is used for type checking but not visible in the diff.
if isinstance(v, SparseAttentionMeansim):
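
A minimal sketch of the fix this comment asks for. The import path is an assumption: the SpargeAttn project ships SparseAttentionMeansim in the spas_sage_attn package, but lightx2v may re-export it from another module:

```python
# Assumed import path; adjust to wherever SparseAttentionMeansim actually
# lives in this codebase.
from spas_sage_attn.autotune import SparseAttentionMeansim
```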

zhiwei-dong requested a review from Copilot on May 11, 2025 at 17:12.

Copilot AI left a comment:

Pull Request Overview

This PR adds support for sparge attention search during inference, including the ability to tune sparse attention weights. Key changes include:

  • A new shell script (scripts/run_wan_t2v_sparge_tune.sh) for running inference with the new feature.
  • Updates in transformer_weights.py and model.py to load and optionally tune sparge attention weights.
  • A new configuration file (configs/wan_t2v_sparge_tune.json) to enable and control these features.

Reviewed Changes

Copilot reviewed 4 out of 4 changed files in this pull request and generated no comments.

  • scripts/run_wan_t2v_sparge_tune.sh: New script to set up the environment and run inference.
  • lightx2v/models/networks/wan/weights/transformer_weights.py: Added handling for the sparse tuning mode and environment variable setup.
  • lightx2v/models/networks/wan/model.py: Integrated saving of tuned sparse attention parameters during inference.
  • configs/wan_t2v_sparge_tune.json: New configuration to enable sparge attention search with tuning.
Comments suppressed due to low confidence (2)

lightx2v/models/networks/wan/weights/transformer_weights.py:30

  • [nitpick] Consider aligning the naming for the tuning flag (e.g., using 'sparge_tune') to more clearly relate to the 'sparge' flag, as the current mix of 'sparge' and 'sparse_tune' may be confusing.
self.sparge_tune = config.get("sparse_tune", False)
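
One way to act on this nitpick, sketched as a drop-in replacement for the quoted line. The fallback chain is an illustration, not the PR's code; it assumes backward compatibility with the existing "sparse_tune" key is wanted:

```python
# Prefer a key aligned with the "sparge" naming, but fall back to the
# current "sparse_tune" key so existing configs keep working.
self.sparge_tune = config.get("sparge_tune", config.get("sparse_tune", False))
```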

lightx2v/models/networks/wan/model.py:196

  • The class 'SparseAttentionMeansim' is referenced but not imported; please ensure that an appropriate import statement is added if it is defined in another module.
if isinstance(v, SparseAttentionMeansim):
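
For context, a self-contained sketch of what "extract and save sparse attention parameters" could look like once the import above is in place. The weights_dict structure, the attribute filtering, and the output path are assumptions, not the PR's exact code:

```python
import torch

# Assumed import path, as discussed in the comment above.
from spas_sage_attn.autotune import SparseAttentionMeansim

def save_sparge_params(weights_dict, out_path="sparge_tuned_params.pt"):
    """Collect the tuned sparsity state from every sparge attention module."""
    saved = {}
    for k, v in weights_dict.items():
        if isinstance(v, SparseAttentionMeansim):
            # Keep only tensor-valued attributes, i.e. the tuned parameters.
            saved[k] = {name: t for name, t in vars(v).items()
                        if isinstance(t, torch.Tensor)}
    torch.save(saved, out_path)
```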

zhiwei-dong requested a review from Copilot on May 11, 2025 at 17:49.

Copilot AI left a comment:

Pull Request Overview

This PR introduces support for sparge attention search in the inference stage of the model.

  • Added a new shell script for running sparge attention tuning.
  • Updated transformer weights and model inference to support a tuning mode for sparsified attention.
  • Added a new JSON configuration file with relevant parameters.

Reviewed Changes

Copilot reviewed 4 out of 4 changed files in this pull request and generated no comments.

  • scripts/run_wan_t2v_sparge_tune.sh: New shell script to run inference with sparge attention tuning, managing environment variables and input paths.
  • lightx2v/models/networks/wan/weights/transformer_weights.py: Extended with a 'sparse_tune' flag that determines the weight-loading behavior for sparge attention tuning.
  • lightx2v/models/networks/wan/model.py: Enhanced the infer stage to save weights when tuning mode is enabled, using the 'sparse_tune' flag from the configuration.
  • configs/wan_t2v_sparge_tune.json: Added configuration entries for enabling sparge attention support and tuning, including checkpoint specification.
Comments suppressed due to low confidence (2)

lightx2v/models/networks/wan/weights/transformer_weights.py:30

  • There is slight inconsistency in naming between 'sparge' and 'sparse_tune'; consider aligning the naming convention across the codebase for clarity.
self.sparge_tune = config.get("sparse_tune", False)

lightx2v/models/networks/wan/model.py:193

  • The use of 'sparse_tune' here together with 'sparge' in other parts may cause confusion; consider unifying the naming to improve code clarity.
self.sparge_tune = self.config.get("sparse_tune", False)
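
Tying the flag to the save step, a rough sketch of how the final infer stage could be gated on it. Class, method, and attribute names here are illustrative, and save_sparge_params refers to the hypothetical helper sketched in the previous review:

```python
class WanModel:
    def __init__(self, config):
        self.config = config
        # Same config key as in the quoted line above.
        self.sparge_tune = self.config.get("sparse_tune", False)

    def infer(self, inputs):
        ...  # run the usual denoising steps
        if self.sparge_tune:
            # self.transformer_weights is an assumed attribute name.
            save_sparge_params(self.transformer_weights, "sparge_tuned_params.pt")
```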

zhiwei-dong marked this pull request as ready for review on May 16, 2025 at 04:08.