
[Pass] Implement Tile Loop Fusion #109

Open
JiaqiGuoSunlune wants to merge 13 commits into tilelang_mesh_main from u/jiaqiguo/loop_fusion


@JiaqiGuoSunlune JiaqiGuoSunlune commented Apr 20, 2026

  1. Model Loop Fusion as a Graph Partition problem with respect to a deterministic, non-negative cost function
  2. Solve the problem with dynamic programming
  3. Enable Google Test in Tilelang
  4. End-to-end testing in Python
  5. Fix a bug in tile resize inference of reduction
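
As a rough illustration of items 1 and 2, the sketch below partitions a linear chain of regions into contiguous fusion groups with dynamic programming. It is not the code in this PR: the chain-shaped problem, the `plan_fusion` function, the `toy_cost` function, and the `extents` data are all hypothetical, and the only property taken from the description is that the cost is deterministic and non-negative.

```python
from typing import Callable, List, Tuple

INF = float("inf")


def plan_fusion(
    n: int, group_cost: Callable[[int, int], float]
) -> Tuple[float, List[Tuple[int, int]]]:
    """Split regions 0..n-1 into contiguous groups [i, j) with minimal total cost.

    group_cost(i, j) must be deterministic and non-negative; INF marks a
    group that must not be fused.
    """
    best = [0.0] + [INF] * n  # best[j]: cheapest plan covering regions [0, j)
    cut = [0] * (n + 1)       # cut[j]: start index of the last group in that plan
    for j in range(1, n + 1):
        for i in range(j):
            candidate = best[i] + group_cost(i, j)
            if candidate < best[j]:
                best[j], cut[j] = candidate, i
    # Walk the recorded cuts backwards to recover the chosen groups.
    groups: List[Tuple[int, int]] = []
    j = n
    while j > 0:
        groups.append((cut[j], j))
        j = cut[j]
    groups.reverse()
    return best[n], groups


# Hypothetical data: loop extents of five adjacent regions.
extents = [4, 4, 8, 8, 8]


def toy_cost(i: int, j: int) -> float:
    """Assumed cost: one unit per fused group, infinite if extents differ."""
    return 1.0 if len(set(extents[i:j])) == 1 else INF


total, groups = plan_fusion(len(extents), toy_cost)
print(total, groups)  # 2.0 [(0, 2), (2, 5)]
```

Because every group cost is non-negative, the optimal plan for a prefix never gets cheaper by extending it, which is what lets the table `best` be filled left to right.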


@JiaqiGuoSunlune JiaqiGuoSunlune changed the title from "[WIP][Pass] Implement Tile Loop Fusion" to "[Pass] Implement Tile Loop Fusion" on Apr 21, 2026

@firefrogliu666 firefrogliu666 left a comment


I went through this PR carefully and at this point I understand the main idea and implementation structure well enough.

My understanding is that this pass discovers adjacent lowered tile.scope_entry regions, builds a window-local dependence/planning problem, and rewrites them into shared execution-shell trees. The implementation follows a discovery -> problem -> plan -> rewrite decomposition. The core method for finding the best rewriting plan is a DP algorithm that prioritizes lowering write_cut_cost, shared_read_cost, live_range_penalty, and reorder_penalty.

The important part for me is that the pass reasons in terms of logical execution prefixes, normalized use_in / def_out, dependence depth rho, and resident values, instead of trying to do a purely syntactic rewrite.
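
To make the idea of a dependence depth concrete, here is a minimal sketch of how a depth like rho could be derived from per-region use_in / def_out sets. This is my own reading, not the PR's definition: the `dependence_depth` function is hypothetical, buffers are modeled as plain strings, and only read-after-write dependences are considered.

```python
from typing import List, Set, Tuple

# Each region is modeled as (use_in, def_out): buffers it reads and buffers it writes.
Region = Tuple[Set[str], Set[str]]


def dependence_depth(regions: List[Region]) -> List[int]:
    """Return rho[k]: the longest producer chain that ends at region k.

    Assumption: region k depends on an earlier region j when def_out(j)
    overlaps use_in(k). Regions with no producer in the window get depth 0.
    """
    rho: List[int] = []
    for k, (use_in, _) in enumerate(regions):
        depth = 0
        for j in range(k):
            _, def_out_j = regions[j]
            if def_out_j & use_in:
                depth = max(depth, rho[j] + 1)
        rho.append(depth)
    return rho


# Hypothetical window: two independent producers, then a consumer chain.
window: List[Region] = [
    ({"A"}, {"B"}),
    ({"A"}, {"C"}),
    ({"B", "C"}, {"D"}),
    ({"D"}, {"E"}),
]
print(dependence_depth(window))  # [0, 0, 1, 2]
```

Regions that share a depth have no producer/consumer chain between them under this model, which is the kind of fact a syntactic rewrite alone would not expose.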

I also checked the rewrite side and the test intent. Preserving local wrappers like LetStmt / AttrStmt, supporting partial-prefix fusion, and keeping planning window boundaries conservative all make sense to me. Overall, the decomposition into discovery, planning, and rewrite is clear, and the implementation direction looks sound.

Approved.
