Skip to content

[Datapath] Implement Add with Carry-In to Improve Subtraction Circuit Implementation#9949

Draft
sarthakmangla1 wants to merge 1 commit intollvm:mainfrom
sarthakmangla1:comb-sub-carryin-v2
Draft

[Datapath] Implement Add with Carry-In to Improve Subtraction Circuit Implementation#9949
sarthakmangla1 wants to merge 1 commit intollvm:mainfrom
sarthakmangla1:comb-sub-carryin-v2

Conversation

@sarthakmangla1
Copy link
Copy Markdown

@sarthakmangla1 sarthakmangla1 commented Mar 16, 2026

This is a draft PR. This is my first PR to CIRCT.

Summary

Subtraction a - b is lowered to a + ~b + 1. Previously, the constant 1 was passed as a full-width data input through the Wallace tree
compressor, and then a separate parallel prefix adder was used on top, which wastes gates.

This PR fixes the issue in two places:

  1. DatapathToComb: Detects compress(a, b, const_1) followed by a downstream comb.add, and replaces both with a single comb.add(a, b, const_1).

  2. CombToSynth: Detects a 3-input comb.add with a const_1 argument and bakes the carry-in directly into g[0] of the parallel prefix
    adder, eliminating the extra adder stage.

Results

For a 4-bit subtractor, gate count drops from 55 to 47.

TODO

  • Add FileCheck tests for the new patterns

Copy link
Copy Markdown
Member

@uenoku uenoku left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Cool, thank you for contributing to CIRCT! I launched benchmark https://uenoku.github.io/circt-synth-tracker/pr-benchmarks/pr-9949/report.html and there are improvements in sub results 👍 (also there was no translation validation failure, the logic is certainly sound).

The code looks clean and makes sense.
I'm also curios what lowering pattern would improve the benchmark like https://github.com/pulp-platform/ELAU/blob/master/src/AddSubC.sv

auto width = op.getType().getIntOrFloatBitWidth();
// Implement a naive Ripple-carry full adder.
Value carry;
Value carry = carryIn; // null Value → carry_in = 0
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Generally please use ASCII

Suggested change
Value carry = carryIn; // null Value carry_in = 0
Value carry = carryIn; // null Value -> carry_in = 0

Comment on lines +797 to +815
auto arch = determineAdderArch(op, width);
if (arch == AdderArchitecture::RippleCarry)
return lowerRippleCarryAdder(op, operands[0], operands[1], carryIn,
rewriter);
return lowerParallelPrefixAdder(op, operands[0], operands[1], carryIn,
rewriter);
}
return failure(); // 3-input add without const_1 — handled elsewhere
}

// Lower only when there are two inputs.
// Variadic operands must be lowered in a different pattern.
if (inputs.size() != 2)
return failure();

auto arch = determineAdderArch(op, width);
if (arch == AdderArchitecture::RippleCarry)
return lowerRippleCarryAdder(op, inputs, rewriter);
return lowerParallelPrefixAdder(op, inputs, rewriter);
return lowerRippleCarryAdder(op, inputs[0], inputs[1], Value(), rewriter);
return lowerParallelPrefixAdder(op, inputs[0], inputs[1], Value(), rewriter);
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Could you create a helper function for this part and share at two places?

    auto arch = determineAdderArch(op, width);
    if (arch == AdderArchitecture::RippleCarry)
      return lowerRippleCarryAdder(op, inputs[0], inputs[1], Value(), rewriter);
    return lowerParallelPrefixAdder(op, inputs[0], inputs[1], Value(), rewriter);

// This pattern comes from compress(a, b, const_1) in DatapathToComb, which
// arises from sub → add(a, ~b, 1). Detect const_1 in any operand position.
if (inputs.size() == 3) {
for (unsigned i = 0; i < 3; ++i) {
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Because comb::AddOp and datapath::CompressOp have commutative trait, usually constant is pushed to the last element. I think it's fine to check only the last element.
https://github.com/llvm/llvm-project/blob/3d421d59ad247afeadf5d4f886c9dea14a5eb229/mlir/docs/Canonicalization.md?plain=1#L117-L118

Copy link
Copy Markdown
Contributor

@cowardsa cowardsa left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for this PR - really very clean for a first PR so great job!

Obviously would be great to add FileCheck tests and circt-lec tests

A nice follow-up PR will be to handle the case when you've not got a constant but you've got a variable bit - this will be more difficult because it'll be zero-extended to match the operator's width

addends.push_back(
extractBits(rewriter, input)); // Extract bits from each input
// Special case: compress(a, b, const_1) with exactly 3 inputs where one
// input is the constant 1. This pattern arises from sub → add(a, ~b, 1).
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Use ASCII

Comment on lines +87 to +89
auto constOp = inputs[i].getDefiningOp<hw::ConstantOp>();
if (!constOp || !constOp.getValue().isOne())
continue;
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

As above constOp should be found as the final argument

// replace compress + add together with comb.add(a, b, const_1) so the
// downstream prefix adder can bake carry_in=1 directly into g[0].
if (inputs.size() == 3) {
for (unsigned i = 0; i < 3; ++i) {
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can remove this outer loop when we know where the final argument is

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants