[Feature Request] [Pass] Greedy Pipeline Scheduling #63

@firefrogliu666

Description

Required prerequisites

  • I have searched the Issue Tracker to confirm this hasn't already been reported. (If it has, comment there instead.)

Motivation

The goal of pipeline planning is to run pipeline operations across heterogeneous compute devices (tensor cores, vector cores, and ODMA units) in parallel to minimize total execution time.

We can employ a critical-path-aware greedy scheduling algorithm that prioritizes commands on the longest dependency chain, ensuring bottleneck operations complete as early as possible. To maximize hardware utilization, the algorithm should support multi-iteration pipelining, where operations from different iterations execute concurrently because they have no inter-iteration dependencies. For example, iteration 1's memory transfers can run while iteration 0's compute operations are still in flight.
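A minimal sketch of such a scheduler, assuming a simple DAG of ops where each op runs on exactly one device. All op names, device names, and costs here are hypothetical; in practice they would come from the pipeline IR:

```python
from dataclasses import dataclass, field

@dataclass
class Op:
    name: str
    device: str                                # hypothetical device id, e.g. "tensor", "dma"
    cost: int                                  # execution time in arbitrary ticks
    deps: list = field(default_factory=list)   # names of predecessor ops

def critical_path_lengths(ops):
    """Longest total-cost path from each op to any sink, via memoized DFS."""
    by_name = {o.name: o for o in ops}
    succs = {o.name: [] for o in ops}
    for o in ops:
        for d in o.deps:
            succs[d].append(o.name)
    memo = {}
    def cpl(name):
        if name not in memo:
            memo[name] = by_name[name].cost + max(
                (cpl(s) for s in succs[name]), default=0)
        return memo[name]
    return {o.name: cpl(o.name) for o in ops}

def greedy_schedule(ops):
    """Greedy list scheduling: among ready ops, start the one heading the
    longest remaining dependency chain on its device."""
    by_name = {o.name: o for o in ops}
    prio = critical_path_lengths(ops)
    finish = {}        # op name -> finish time
    device_free = {}   # device -> time the device becomes free
    remaining = set(by_name)
    while remaining:
        ready = [n for n in remaining
                 if all(d in finish for d in by_name[n].deps)]
        # Pick the bottleneck chain first; break ties by name for determinism.
        n = max(ready, key=lambda n: (prio[n], n))
        op = by_name[n]
        start = max([device_free.get(op.device, 0)] +
                    [finish[d] for d in op.deps])
        finish[n] = start + op.cost
        device_free[op.device] = finish[n]
        remaining.remove(n)
    return finish
```

Because each op's start time is the maximum of its device's free time and its dependencies' finish times, ops from different iterations overlap across devices automatically, with no inter-iteration barriers.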

Additionally, we can perform buffer liveness analysis to reduce memory footprint, identifying when logical buffers from different iterations can safely reuse the same physical memory locations.
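One way to sketch this reuse is linear-scan-style interval allocation: each logical buffer gets a live range (first write to last read, in schedule time), and buffers with non-overlapping ranges share a physical slot. Buffer names and interval values below are illustrative, not from the real pipeline:

```python
def assign_physical_slots(live_intervals):
    """Greedy interval reuse: buffers whose live ranges do not overlap are
    mapped to the same physical slot.
    live_intervals: dict of buffer name -> (start, end), end exclusive."""
    events = sorted(live_intervals.items(), key=lambda kv: kv[1][0])
    free_slots = []   # slots whose previous occupant's range has ended
    slot_of = {}      # buffer name -> physical slot id
    slot_end = {}     # slot id -> end of its current occupant's range
    next_slot = 0
    for name, (start, end) in events:
        # Recycle every slot whose occupant has died before this buffer starts.
        for s in list(slot_end):
            if slot_end[s] <= start:
                free_slots.append(s)
                del slot_end[s]
        if free_slots:
            slot = free_slots.pop()
        else:
            slot = next_slot
            next_slot += 1
        slot_of[name] = slot
        slot_end[slot] = end
    return slot_of
```

With per-iteration buffers, this lets iteration 1's inputs land in the slot freed by iteration 0's inputs, so the footprint is bounded by the number of simultaneously live buffers rather than by the total iteration count.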

Solution

No response

Alternatives

No response

Additional context

No response

Metadata

Labels

enhancement (New feature or request)
