Skip to content

[AIG] Add LongestPathAnalysis implementation #8529

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Open
wants to merge 1 commit into
base: main
Choose a base branch
from

Conversation

uenoku
Copy link
Member

@uenoku uenoku commented Jun 3, 2025

This patch implements the LongestPathAnalysis for the AIG dialect, which calculates the maximum delay through combinational paths in a circuit where each AIG operation has a unit delay. The implementation:

  • Adds the core analysis infrastructure with three main components:
    • Context: Thread-safe interface to access analysis results across modules
    • LocalVisitor: Per-module visitor that computes paths within module boundaries
    • Impl: Top-level implementation that orchestrates analysis across hierarchies
  • Implements hierarchical analysis that works across module boundaries
  • Supports parallel execution for improved performance on large designs
  • Provides both closed paths (register-to-register) and open paths analysis
  • Includes debug point tracing for detailed path information
  • Adds a PrintLongestPathAnalysis pass for outputting results

The analysis is designed to be scalable for large designs through techniques like redundant path elimination, graph parallelism, and early pruning of paths. It's particularly useful for identifying critical paths, estimating circuit performance, and guiding optimization passes.

@uenoku uenoku force-pushed the dev/hidetou/longestpath-impl branch 5 times, most recently from a19c629 to 3e65f98 Compare June 3, 2025 00:42
This patch implements the LongestPathAnalysis for the AIG dialect, which calculates
the maximum delay through combinational paths in a circuit where each AIG operation
has a unit delay. The implementation:

- Adds the core analysis infrastructure with three main components:
  - Context: Thread-safe interface to access analysis results across modules
  - LocalVisitor: Per-module visitor that computes paths within module boundaries
  - Impl: Top-level implementation that orchestrates analysis across hierarchies
- Implements hierarchical analysis that works across module boundaries
- Supports parallel execution for improved performance on large designs
- Provides both closed paths (register-to-register) and open paths analysis
- Includes debug point tracing for detailed path information
- Adds a PrintLongestPathAnalysis pass for outputting results

The analysis is designed to be scalable for large designs through techniques like
redundant path elimination, graph parallelism, and early pruning of paths. It's
particularly useful for identifying critical paths, estimating circuit performance,
and guiding optimization passes.
@uenoku uenoku force-pushed the dev/hidetou/longestpath-impl branch from 3e65f98 to c92f0c0 Compare June 3, 2025 01:27
@uenoku uenoku marked this pull request as ready for review June 3, 2025 01:46
Copy link
Contributor

@fabianschuiki fabianschuiki left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is really really cool! I love the split into a thread-safe global and a multi-threaded local part of the analysis to make things scale. And using equivalence classes to deal with aliasing operations is very clever 🥳 💯 Can't wait to see what we can do with this in CIRCT 🚀

Comment on lines +14 to +17
// expected-remark-re @+4 {{fanOut=Object($root.x[0]), fanIn=Object($root.a[0], delay=1, history=[{{.+}}])}}
// expected-remark-re @+3 {{fanOut=Object($root.x[0]), fanIn=Object($root.b[0], delay=1, history=[{{.+}}])}}
// expected-remark-re @+2 {{fanOut=Object($root.x[1]), fanIn=Object($root.a[1], delay=1, history=[{{.+}}])}}
// expected-remark-re @+1 {{fanOut=Object($root.x[1]), fanIn=Object($root.b[1], delay=1, history=[{{.+}}])}}
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Could @below work for these, instead of counting the number of lines?

Comment on lines +65 to +66
// expected-remark @+2 {{fanOut=Object($root.x[0]), fanIn=Object($root.a[0], delay=2, history=[Object($root.c2.x[0], delay=2, comment="output port"), Object($root/c2:child.a[0], delay=1, comment="input port"), Object($root.c1.x[0], delay=1, comment="output port"), Object($root/c1:child.a[0], delay=0, comment="input port"), Object($root.a[0], delay=0, comment="input port")])}}
// expected-remark @+1 {{fanOut=Object($root.x[0]), fanIn=Object($root.b[0], delay=2, history=[Object($root.c2.x[0], delay=2, comment="output port"), Object($root/c2:child.a[0], delay=1, comment="input port"), Object($root.c1.x[0], delay=1, comment="output port"), Object($root/c1:child.b[0], delay=0, comment="input port"), Object($root.b[0], delay=0, comment="input port")])}}
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is really really cool 🥳

Comment on lines +29 to +42
const char *ir = R"MLIR(
hw.module private @basic(in %clock : !seq.clock, in %a : i2, in %b : i2, out x : i2, out y: i4) {
%p = seq.firreg %a clock %clock : i2
%q = seq.firreg %s clock %clock : i2
%r = hw.instance "inst" @child(a: %p: i2, b: %b: i2) -> (x: i2)
%s = aig.and_inv not %p, %q, %r : i2
%dummy = comb.concat %s, %a : i2, i2
hw.output %s, %dummy : i2, i4
}
hw.module private @child(in %a : i2, in %b : i2, out x : i2) {
%r = aig.and_inv not %a, %b {sv.namehint = "child.r"} : i2
hw.output %r : i2
}
)MLIR";
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Do you know if there is some precedence for having MLIR blobs encoded in the C++ source as strings? This feels sensible since you are trying to go through a lot of the details provided by the analysis. On the other hand, you could argue that maybe the printer should have a verbose option where it prints all the interesting info you'd want to check, which would then allow you to check this with FileCheck.

//===----------------------------------------------------------------------===//
// Context
//===----------------------------------------------------------------------===//
// This class provides a thread-safe interface to access the analysis results.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
// This class provides a thread-safe interface to access the analysis results.
/// This class provides a thread-safe interface to access the analysis results.

Comment on lines +383 to +386
// Track equivalence classes for data movement ops
// Treat output as equivalent to input for pure data movement operations
llvm::EquivalenceClasses<std::pair<Value, size_t>> ec;
DenseMap<std::pair<Value, size_t>, std::pair<Value, size_t>> ecMap;
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nice use of EquivalenceClasses! 🥳 I'm wondering if you still need to track that separate ecMap 🤔 It looks like you use it to map from an entry in the class to the leader, which I think the ec can do for you directly 🤔 Might be mistaken though.

Comment on lines +507 to +509
// Wait for the thread to finish.
while (!done.load())
std::this_thread::sleep_for(std::chrono::milliseconds(100));
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Might be worth using a condition variable here to avoid sleeping. Not sure how much of a performance implication this has in practice 🙂

Comment on lines +559 to +566
auto leader = ec.getOrInsertLeaderValue({to, toBitPos});
auto &slot = ecMap[leader];
if (!slot.first)
slot = {to, toBitPos};
// Merge classes, and visit the leader.
auto newLeader = ec.unionSets({to, toBitPos}, {from, fromBitPos});
assert(leader == *newLeader);
return visitValue(to, toBitPos, results);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Really nice idea to get rid of aliases in the IR and boil things down to a single computed value regardless of how many wires and bit slices you have of that value 💯

Comment on lines +651 to +655
auto *node = ctx->instanceGraph->lookup(moduleName);
if (!node || !node->getModule()) {
op.emitError() << "module " << moduleName << " not found";
return failure();
}
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Do you think we could turn this into an assert? I think the hw.instance verifier would already error if the module did not exist 🤔

Comment on lines +746 to +750
auto cost = std::max(1u, llvm::Log2_64_Ceil(size));
// Create edges each operand with cost ceil(log(size)).
for (auto operand : op->getOperands())
if (failed(addEdge(operand, bitPos, cost, results)))
return failure();
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Using the logarithm as an approximation for the ideal work depth of a logic op is a nice idea! If I remember correctly there was even some theoretical work by Torsten Hoefler showing that this holds in many cases.

Comment on lines -179 to +1324
LongestPathAnalysis::~LongestPathAnalysis() {}
LongestPathAnalysis::~LongestPathAnalysis() { delete impl; }
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Could impl be an std::unique_ptr to avoid having to deal with naked pointers and allocation/deallocation?

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants