[QIR] Insert array record call in sampling workflow #3553
Open
khalatepradnya wants to merge 8 commits into NVIDIA:main from khalatepradnya:insert-array-record
97cf83b Introduce a new module pass to be called before QIR conversion that will (khalatepradnya)
c329683 Merge branch 'main' into insert-array-record (khalatepradnya)
56d1e48 Merge branch 'main' into insert-array-record (khalatepradnya)
c774568 * Addressing review comments (khalatepradnya)
f056446 Merge branch 'main' into insert-array-record (khalatepradnya)
9f345b9 Merge branch 'main' into insert-array-record (khalatepradnya)
4b788fe * Enable the pass in QIR workflow (khalatepradnya)
99999e8 * Fix test setup - clear log in between tests (khalatepradnya)
| Original file line number | Diff line number | Diff line change |
|---|---|---|
| @@ -0,0 +1,213 @@ | ||
| /******************************************************************************* | ||
| * Copyright (c) 2025 NVIDIA Corporation & Affiliates. * | ||
| * All rights reserved. * | ||
| * * | ||
| * This source code and the accompanying materials are made available under * | ||
| * the terms of the Apache License 2.0 which accompanies this distribution. * | ||
| ******************************************************************************/ | ||
| | ||
| #include "PassDetails.h" | ||
| #include "cudaq/Optimizer/Builder/Intrinsics.h" | ||
| #include "cudaq/Optimizer/Builder/Runtime.h" | ||
| #include "cudaq/Optimizer/CodeGen/Passes.h" | ||
| #include "cudaq/Optimizer/CodeGen/QIRAttributeNames.h" | ||
| #include "cudaq/Optimizer/CodeGen/QIRFunctionNames.h" | ||
| #include "cudaq/Optimizer/Dialect/Quake/QuakeOps.h" | ||
| #include "llvm/ADT/SmallSet.h" | ||
| #include "mlir/Transforms/GreedyPatternRewriteDriver.h" | ||
| #include "mlir/Transforms/Passes.h" | ||
| | ||
| namespace cudaq::opt { | ||
| #define GEN_PASS_DEF_QIRINSERTARRAYRECORD | ||
| #include "cudaq/Optimizer/CodeGen/Passes.h.inc" | ||
| } // namespace cudaq::opt | ||
| | ||
| #define DEBUG_TYPE "qir-insert-array-record" | ||
| | ||
| using namespace mlir; | ||
| | ||
| namespace { | ||
| | ||
| // Trace a pointer back to its corresponding `AllocaOp` | ||
| static cudaq::cc::AllocaOp tracePointerToAlloca(Value ptr) { | ||
| llvm::DenseSet<Value> visited; | ||
| while (ptr) { | ||
| if (!visited.insert(ptr).second) | ||
| return {}; | ||
| Operation *defOp = ptr.getDefiningOp(); | ||
| if (!defOp) | ||
| return {}; | ||
| if (auto allocaOp = dyn_cast<cudaq::cc::AllocaOp>(defOp)) | ||
| return allocaOp; | ||
| if (auto castOp = dyn_cast<cudaq::cc::CastOp>(defOp)) { | ||
| ptr = castOp.getValue(); | ||
| continue; | ||
| } | ||
| if (auto computePtrOp = dyn_cast<cudaq::cc::ComputePtrOp>(defOp)) { | ||
| ptr = computePtrOp.getBase(); | ||
| continue; | ||
| } | ||
| return {}; | ||
| } | ||
| return {}; | ||
| } | ||
| | ||
| // Walk a function to identify all the measure-discriminate-store patterns and | ||
| // collect the associated `AllocaOp` when the measurement results are stored. | ||
| // Collect only unique `AllocaOp`s, since each may correspond to multiple | ||
| // measurement operations. When there are no explicit stores, track the first | ||
| // measurement operation and get the total number of measurements. | ||
| struct AllocaMeasureStoreAnalysis { | ||
| AllocaMeasureStoreAnalysis() = default; | ||
| | ||
| explicit AllocaMeasureStoreAnalysis(func::FuncOp funcOp) { | ||
| size_t totalMeasurementCount = 0; | ||
| Operation *firstMeasureOp = nullptr; | ||
| DenseMap<Value, Operation *> valueToMeasurement; | ||
| llvm::SetVector<cudaq::cc::AllocaOp> uniqueAllocaOps; | ||
| | ||
| // First pass: identify measurements and propagate through uses | ||
| funcOp.walk([&](Operation *op) { | ||
| if (op->hasTrait<cudaq::QuantumMeasure>()) { | ||
| if (op->hasAttr(cudaq::opt::ResultIndexAttrName)) { | ||
| totalMeasurementCount++; | ||
| if (!firstMeasureOp) | ||
| firstMeasureOp = op; | ||
| } | ||
| for (auto result : op->getResults()) | ||
| valueToMeasurement[result] = op; | ||
| return WalkResult::advance(); | ||
| } | ||
| | ||
| // TODO: Check if more operations need to be added here. | ||
| if (!isa<quake::DiscriminateOp, cudaq::cc::CastOp>(op)) { | ||
| return WalkResult::advance(); | ||
| } | ||
| | ||
| // Find the operands derived from measurements | ||
| for (auto operand : op->getOperands()) { | ||
| if (valueToMeasurement.count(operand)) { | ||
| for (auto result : op->getResults()) | ||
| valueToMeasurement[result] = valueToMeasurement[operand]; | ||
| } | ||
| break; // Checking one operand is enough | ||
| } | ||
| return WalkResult::advance(); | ||
| }); | ||
| | ||
| // Second pass: find stores of measurement values and trace to `alloca` ops | ||
| funcOp.walk([&](cudaq::cc::StoreOp storeOp) { | ||
| if (valueToMeasurement.count(storeOp.getValue())) { | ||
| Value ptr = storeOp.getPtrvalue(); | ||
| auto allocaOp = tracePointerToAlloca(ptr); | ||
| if (allocaOp) | ||
| uniqueAllocaOps.insert(allocaOp); | ||
| } | ||
| }); | ||
| | ||
| if (!uniqueAllocaOps.empty()) { | ||
| // Use array sizes when explicit storage exists | ||
| for (auto allocaOp : uniqueAllocaOps) { | ||
| if (auto arrType = | ||
| allocaOp.getElementType().dyn_cast<cudaq::cc::ArrayType>()) { | ||
| arraySize += arrType.getSize(); | ||
| } else { | ||
| arraySize += 1; | ||
| } | ||
| } | ||
| allocaOps.append(uniqueAllocaOps.begin(), uniqueAllocaOps.end()); | ||
| } else if (totalMeasurementCount > 0) { | ||
| // This could be individual qubit(s) | ||
| arraySize = totalMeasurementCount; | ||
| firstMeasurementOp = firstMeasureOp; | ||
| } | ||
| } | ||
| | ||
| SmallVector<cudaq::cc::AllocaOp> allocaOps; | ||
| size_t arraySize = 0; | ||
| Operation *firstMeasurementOp = nullptr; | ||
| }; | ||
| | ||
| // Inserts a QIR array record output call to declare measurement result storage. | ||
| // QIR requires `__quantum__rt__array_record_output()` be called before multiple | ||
| // measurements to declare the output array size and type label. This is | ||
| // required in the `sample` API since it always returns a vector of measurement | ||
| // results. The following logic is used to determine the insertion point: | ||
| // 1. After the first alloca (if explicit array storage exists) | ||
| // 2. Before the first measurement (if no explicit storage) | ||
| // The label string is created as "array<i1 x N>" where N is the total number of | ||
| // measurement results. The array record output call is created as: | ||
| // `__quantum__rt__array_record_output(N, label);` | ||
| LogicalResult | ||
| insertArrayRecordingCalls(func::FuncOp funcOp, size_t resultCount, | ||
| const SmallVector<cudaq::cc::AllocaOp> &allocaOps, | ||
| Operation *firstMeasureOp) { | ||
| if (resultCount == 0) | ||
| return success(); | ||
| | ||
| auto ctx = funcOp.getContext(); | ||
| OpBuilder builder(ctx); | ||
| mlir::Location loc = funcOp.getLoc(); | ||
| // We insert only one array record call | ||
| if (!allocaOps.empty()) | ||
| builder.setInsertionPointAfter(allocaOps[0]); | ||
| else if (firstMeasureOp) | ||
| builder.setInsertionPoint(firstMeasureOp); | ||
| else | ||
| return failure(); | ||
| | ||
| // Create the label string: "array<i1 x N>" | ||
| std::string labelStr = "array<i1 x " + std::to_string(resultCount) + ">"; | ||
| auto strLitTy = cudaq::cc::PointerType::get(cudaq::cc::ArrayType::get( | ||
| builder.getContext(), builder.getI8Type(), labelStr.size() + 1)); | ||
| Value lit = builder.create<cudaq::cc::CreateStringLiteralOp>( | ||
| loc, strLitTy, builder.getStringAttr(labelStr)); | ||
| auto i8PtrTy = cudaq::cc::PointerType::get(builder.getI8Type()); | ||
| Value label = builder.create<cudaq::cc::CastOp>(loc, i8PtrTy, lit); | ||
| Value size = builder.create<arith::ConstantIntOp>(loc, resultCount, 64); | ||
| builder.create<func::CallOp>(loc, TypeRange{}, | ||
| cudaq::opt::QIRArrayRecordOutput, | ||
| ArrayRef<Value>{size, label}); | ||
| | ||
| // Add the declaration to the module if it doesn't already exist | ||
| auto module = funcOp->getParentOfType<ModuleOp>(); | ||
| if (!module.lookupSymbol(cudaq::opt::QIRArrayRecordOutput)) { | ||
| auto irBuilder = cudaq::IRBuilder::atBlockEnd(module.getBody()); | ||
| if (failed(irBuilder.loadIntrinsic(module, | ||
| cudaq::opt::QIRArrayRecordOutput))) { | ||
| return failure(); | ||
| } | ||
| } | ||
| return success(); | ||
| } | ||
| | ||
| struct QirInsertArrayRecordPass | ||
| : public cudaq::opt::impl::QirInsertArrayRecordBase< | ||
| QirInsertArrayRecordPass> { | ||
| | ||
| using QirInsertArrayRecordBase::QirInsertArrayRecordBase; | ||
| | ||
| void runOnOperation() override { | ||
| ModuleOp module = getOperation(); | ||
| for (auto funcOp : module.getOps<func::FuncOp>()) { | ||
| if (!funcOp || funcOp.empty() || | ||
| !funcOp->hasAttr(cudaq::entryPointAttrName) || | ||
| funcOp->hasAttr(cudaq::runtime::enableCudaqRun)) | ||
| continue; | ||
| | ||
| AllocaMeasureStoreAnalysis analysis(funcOp); | ||
| if (analysis.arraySize == 0) | ||
| continue; | ||
| | ||
| LLVM_DEBUG(llvm::dbgs() << "Before adding array recording call:\n" | ||
| << *funcOp); | ||
| if (failed(insertArrayRecordingCalls(funcOp, analysis.arraySize, | ||
| analysis.allocaOps, | ||
| analysis.firstMeasurementOp))) | ||
| return signalPassFailure(); | ||
| LLVM_DEBUG(llvm::dbgs() << "After adding array recording call:\n" | ||
| << *funcOp); | ||
| } | ||
| } | ||
| }; | ||
| } // namespace | ||
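
For readers without an MLIR build at hand, the size and label rules used by the pass can be sketched in plain C++. This is only an illustration under stated assumptions: `ToyAlloca`, `recordedResultCount`, and `arrayRecordLabel` are hypothetical names that do not exist in CUDA-Q, and the sketch models an alloca as nothing more than "array of length N" or "scalar".

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical stand-in for a `cc.alloca`. For a scalar allocation, `isArray`
// is false and `length` is ignored.
struct ToyAlloca {
  bool isArray;
  std::size_t length;
};

// Mirrors the size rule in AllocaMeasureStoreAnalysis: when measurement
// results are stored, sum the sizes of the unique allocas (a scalar counts as
// one); otherwise fall back to the number of counted measurements.
std::size_t recordedResultCount(const std::vector<ToyAlloca> &allocas,
                                std::size_t measurementCount) {
  if (allocas.empty())
    return measurementCount;
  std::size_t total = 0;
  for (const ToyAlloca &a : allocas)
    total += a.isArray ? a.length : 1;
  return total;
}

// Builds the label in the same "array<i1 x N>" form the pass emits.
std::string arrayRecordLabel(std::size_t resultCount) {
  return "array<i1 x " + std::to_string(resultCount) + ">";
}
```

With two counted measurements and no explicit store, `arrayRecordLabel(recordedResultCount({}, 2))` gives `array<i1 x 2>`. Note that in the pass itself only measurements carrying the result-index attribute contribute to the fallback count.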
Should use the intrinsic loading instead of rolling your own. Aren't these already loaded by the QIR API Prep pass, though?
I wanted to make this pass self-contained, since we cannot guarantee that the prep pass has run before this one.
Should I move the `loadIntrinsic` call from line #176 here?
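
The "self-contained" argument above comes down to a look-up-then-declare step that is safe to run whether or not an earlier pass already added the declaration. A minimal sketch of that idea, assuming a toy symbol table rather than a real `ModuleOp` (`ToyModule` and `ensureDeclared` are hypothetical names, not part of CUDA-Q or MLIR):

```cpp
#include <cassert>
#include <set>
#include <string>

// Hypothetical stand-in for a module's symbol table.
struct ToyModule {
  std::set<std::string> symbols;
  int declarationsAdded = 0;
};

// Adds a declaration for `name` only if the module does not already have one.
// Returns true when a declaration was added, false when it already existed.
bool ensureDeclared(ToyModule &module, const std::string &name) {
  if (module.symbols.count(name))
    return false;
  module.symbols.insert(name);
  ++module.declarationsAdded;
  return true;
}
```

Calling `ensureDeclared` twice with `__quantum__rt__array_record_output` adds one declaration, so the result is the same whether or not a prep pass ran first.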