Fix optimising out returndatacopy call.#16436
Conversation
And a changelog entry. And a buglist entry. Generally an integral part of the task is to determine the actual impact of the bug on the compiler and do a deeper validation of the claims from the submission (which we have confirmed but only superficially investigated so far). |
|
Actually this one can be fixed also by improving |
14a99af to
8060165
Compare
| registerReturnDataSizeCall(_functionCall); | ||
| invalidateReturnDataSize(_functionCall); |
There was a problem hiding this comment.
Just shove this into one function and call it validateReturnDataSizeCall(...); it looks very odd at first glance to register and then immediately after invalidate.
There was a problem hiding this comment.
Right. Initially I thought they gonna be called in different places.
Changelog.md
Outdated
| Compiler Features: | ||
|
|
||
| Bugfixes: | ||
| * Yul Optimizer: Fix ``UnusedStoreEliminator`` incorrectly removing ``returndatacopy`` operations when the length comes from a stale ``returndatasize()`` call that was invalidated by subsequent call opcodes. |
There was a problem hiding this comment.
| * Yul Optimizer: Fix ``UnusedStoreEliminator`` incorrectly removing ``returndatacopy`` operations when the length comes from a stale ``returndatasize()`` call that was invalidated by subsequent call opcodes. | |
| * Yul Optimizer: Fix `UnusedStoreEliminator` incorrectly removing `returndatacopy` operations when the length comes from a stale `returndatasize()` call that was invalidated by subsequent call opcodes. |
There was a problem hiding this comment.
I don't commit suggestions from you any longer.
|
|
||
| // Contains valid return data size calls. Call is valid until any of invalidating return data size instruction is | ||
| // called. Currently, they are CALL, STATICCALL or DELEGATECALL. | ||
| std::set<FunctionCall const*> m_validReturnDataSizeCalls; |
There was a problem hiding this comment.
This could be a boolean.
There was a problem hiding this comment.
I think it cannot be a boolean. We need to track which returndatasize calls are still valid. Only these are valid which are called after last invalidating call.
There was a problem hiding this comment.
If they're visited in sequential order, then a boolean will be perfectly fine. @cameel what's the visitation order here?
There was a problem hiding this comment.
Someone can produce code like this or similar in inline assembly. Or an optimiser step can create it.
let s := returndatasize();
staticcall(...)
let s1 := returndatasize();
returndatacopy(..., s) // This cannot be removed
returndatacopy(..., s1) // This can be removed.
There was a problem hiding this comment.
Yeah, the fact that visitation order is linear does not help, because you can refer to older calls that are no longer valid.
But I'm not sure the current solution is even enough. How does it account for loops and branching?
let s := returndatasize()
for { let i := 0 } lt(i, 10) { i := add(i, 1) } {
returndatacopy(..., s)
staticcall(...)
}Here s is valid but only in the first iteration of the loop. Which means that we cannot remove the returndatacopy(), but a linear scan not accounting for loops will miss that.
There was a problem hiding this comment.
BTW. Someone wrote that this issue looks simpler....... :)
There was a problem hiding this comment.
Consider yourself lucky Daniel isn't here, he would have called it trivial.
There was a problem hiding this comment.
optimizing our returndatacopy in unused store eliminator is a bit weird anyway. It was likely the most convenient place to add such an optimization and then it stuck.
No, this is fully within the its scope:
Optimiser component that removes stores to memory and storage slots that are not used or overwritten later on.
Store does not specify whether this is about memory or storage. Storing the result in memory is inherently a part of the call operation.
There was a problem hiding this comment.
I agree with @rodiazet that a dedicated Yul builtin would be the most elegant solution to the general problem. We could basically fix the underlying design choice by having the builtin either always copy all returndata or cap the length to no more than returndatasize(). But I also think that this goes beyond the scope of a mere fix and I would not do that here. Feel free to report an issue for it though, it's worth considering as a feature.
@nikola-matic has a point that the current solution is brittle and it would better not to have to do this. We're risking introducing another bug if we don't properly account for all cases and that's tricky. My suggestion would be to try removing this optimization completely (in a new PR) and see how much effect that has. I suspect the optimization is actually largely ineffective and we don't need it.
I checked the code and found only a handful of situations where codegen is using returndatacopy():
- Explicit use in inline assembly.
- Copying the result of external calls that return dynamic types (via
IRGeneratorForStatements::appendExternalFunctionCall). - Bare external calls (via
YulUtilFunctions::extractReturndataFunction()) - Checking bits of returndata in
try/catch(selector, panic code, etc.)- These don't even match this pattern because they either hard-code the length or use an expression more complex than just
returndatasize(). SeereturnDataSelectorFunction(),tryDecodeErrorMessageFunction()andtryDecodePanicDataFunction()inYulUtilFunctions.
- These don't even match this pattern because they either hard-code the length or use an expression more complex than just
- Forwarding reverts.
- In this case returndata is always used so the optimization does not apply.
- The corresponding cases in evmasm codegen.
- These don't go through Yul optimizer.
- EOF precompile calls.
- These use hard-coded length.
Of the above, only 1-3 can really trigger the optimization. The problem is that in practice returndatacopy() for external calls is likely never removed anyway, because without memory objects we are unable to properly determine that the copied data is unused.
For example, if you compile this, you can still see returndatacopy() in the optimized IR output:
contract C {
function f() external returns (uint[] memory) {}
function testExternalCall() public {
this.f();
}
function testBareCall() public {
address(this).call("");
}
}Relevant part of testBareCall():
case 0x3ae2d4f4 {
// ...
data := memPtr
returndatacopy(add(memPtr, 0x20), 0, returndatasize())
}
return(0, 0)
}Relevant part of testExternalCall():
case 0x82168021 {
// ...
let _5 := returndatasize()
returndatacopy(_3, 0, _5)
finalize_allocation(_3, _5)
// ...
let _7 := add(_3, offset)
// ...
let src := add(_7, 32)
for { } lt(src, srcEnd) { src := add(src, 32) }
{
mstore(dst, mload(src))
dst := add(dst, 32)
}
}
return(0, 0)
}This means that this optimization does not really work for anything other than hand-crafted inline assembly. The comment in the implementation explicitly says that it was not aimed at that use case though.
Removing it will probably not change anything substantial. Our short-term plan for external calls is to optimize memory use at codegen level anyway. We will detect that the result is unused and not generate returndatacopy() in the first place.
4e4e511 to
477266c
Compare
| ExpressionJoiner::run(*m_context, block); | ||
| return block; | ||
| }}, | ||
| {"unusedStoreEliminatorNoSsaTransform", [&]() { |
There was a problem hiding this comment.
| {"unusedStoreEliminatorNoSsaTransform", [&]() { | |
| {"unusedStoreEliminatorNoSSA", [&]() { |
| ForLoopInitRewriter::run(*m_context, block); | ||
| ExpressionSplitter::run(*m_context, block); | ||
| UnusedStoreEliminator::run(*m_context, block); | ||
| SSAReverser::run(*m_context, block); |
There was a problem hiding this comment.
If you're not running SSATransform, you don't need to use SSAReverser to get it back to non-SSA either.
| if ( | ||
| *instruction == evmasm::Instruction::STATICCALL || | ||
| *instruction == evmasm::Instruction::CALL || | ||
| *instruction == evmasm::Instruction::DELEGATECALL || | ||
| *instruction == evmasm::Instruction::CALLCODE | ||
| ) |
There was a problem hiding this comment.
This is something that will need to be kept up to date as new EVM instructions are added (for example EOF would definitely have affected this). We should define a function for it in SemanticInformation.h. I'd call it invalidatesReturndata().
There was a problem hiding this comment.
I think it's also likely that we already have such a check hard-coded somewhere in the compiler. Please check. If we do, those other places should use the new helper too.
|
This pull request is stale because it has been open for 14 days with no activity. |
|
This pull request was closed due to a lack of activity for 7 days after it was stale. |
|
Yes. I wanted to close it after the removal PR is merged, but if it's closed already it does not make sense to reopen it. |
Fixes removing
returndatacopycalls whenreturndatasizeis stale.This problem is fixed by registering
returndatasizecalls and invalidating them when any of call opcode (potentially changing the size) is called.