Skip to content

Fix optimising out returndatacopy call.#16436

Closed
rodiazet wants to merge 5 commits intodevelopfrom
fix-opt-returndatacopy
Closed

Fix optimising out returndatacopy call.#16436
rodiazet wants to merge 5 commits intodevelopfrom
fix-opt-returndatacopy

Conversation

@rodiazet
Copy link
Copy Markdown
Contributor

@rodiazet rodiazet commented Feb 2, 2026

Fixes removing returndatacopy calls when returndatasize is stale.

This problem is fixed by registering returndatasize calls and invalidating them when any of call opcode (potentially changing the size) is called.

@cameel cameel added the codegen error Compiler generates invalid code. Critical. label Feb 3, 2026
@cameel
Copy link
Copy Markdown
Collaborator

cameel commented Feb 3, 2026

WiP. Need more tests.

And a changelog entry. And a buglist entry. Generally an integral part of the task is to determine the actual impact of the bug on the compiler and do a deeper validation of the claims from the submission (which we have confirmed but only superficially investigated so far).

@github-actions github-actions bot added the stale The issue/PR was marked as stale because it has been open for too long. label Feb 18, 2026
@cameel
Copy link
Copy Markdown
Collaborator

cameel commented Feb 20, 2026

@rodiazet Since this one seems simpler to fix than #16459 and was reported earlier I think it would make sense to focus on finishing it first. I'd like to get it into the next release if possible.

@github-actions github-actions bot removed the stale The issue/PR was marked as stale because it has been open for too long. label Feb 21, 2026
@rodiazet
Copy link
Copy Markdown
Contributor Author

rodiazet commented Feb 23, 2026

Actually this one can be fixed also by improving SSAValueTracker, to make it aware of the fact that calls invalidate returndatasize(), but it would require also registering all the returndatasize() calls with all variables which depends on them, directly and indirectly, which probably would be inefficient.

@rodiazet rodiazet force-pushed the fix-opt-returndatacopy branch 3 times, most recently from 14a99af to 8060165 Compare February 24, 2026 10:30
@rodiazet rodiazet marked this pull request as ready for review February 24, 2026 10:31
Comment on lines +109 to +110
registerReturnDataSizeCall(_functionCall);
invalidateReturnDataSize(_functionCall);
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Just shove this into one function and call it validateReturnDataSizeCall(...); it looks very odd at first glance to register and then immediately after invalidate.

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Right. Initially I thought they gonna be called in different places.

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

redone.

Changelog.md Outdated
Compiler Features:

Bugfixes:
* Yul Optimizer: Fix ``UnusedStoreEliminator`` incorrectly removing ``returndatacopy`` operations when the length comes from a stale ``returndatasize()`` call that was invalidated by subsequent call opcodes.
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
* Yul Optimizer: Fix ``UnusedStoreEliminator`` incorrectly removing ``returndatacopy`` operations when the length comes from a stale ``returndatasize()`` call that was invalidated by subsequent call opcodes.
* Yul Optimizer: Fix `UnusedStoreEliminator` incorrectly removing `returndatacopy` operations when the length comes from a stale `returndatasize()` call that was invalidated by subsequent call opcodes.

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't commit suggestions from you any longer.


// Contains valid return data size calls. Call is valid until any of invalidating return data size instruction is
// called. Currently, they are CALL, STATICCALL or DELEGATECALL.
std::set<FunctionCall const*> m_validReturnDataSizeCalls;
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This could be a boolean.

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think it cannot be a boolean. We need to track which returndatasize calls are still valid. Only these are valid which are called after last invalidating call.

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If they're visited in sequential order, then a boolean will be perfectly fine. @cameel what's the visitation order here?

Copy link
Copy Markdown
Contributor Author

@rodiazet rodiazet Feb 24, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Someone can produce code like this or similar in inline assembly. Or an optimiser step can create it.

let s := returndatasize();
staticcall(...)
let s1 := returndatasize();
returndatacopy(..., s) // This cannot be removed
returndatacopy(..., s1) // This can be removed.

Copy link
Copy Markdown
Collaborator

@cameel cameel Feb 24, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah, the fact that visitation order is linear does not help, because you can refer to older calls that are no longer valid.

But I'm not sure the current solution is even enough. How does it account for loops and branching?

    let s := returndatasize()
    for { let i := 0 } lt(i, 10) { i := add(i, 1) } {
        returndatacopy(..., s)
        staticcall(...)
    }

Here s is valid but only in the first iteration of the loop. Which means that we cannot remove the returndatacopy(), but a linear scan not accounting for loops will miss that.

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

BTW. Someone wrote that this issue looks simpler....... :)

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Consider yourself lucky Daniel isn't here, he would have called it trivial.

Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

optimizing our returndatacopy in unused store eliminator is a bit weird anyway. It was likely the most convenient place to add such an optimization and then it stuck.

No, this is fully within the its scope:

Optimiser component that removes stores to memory and storage slots that are not used or overwritten later on.

Store does not specify whether this is about memory or storage. Storing the result in memory is inherently a part of the call operation.

Copy link
Copy Markdown
Collaborator

@cameel cameel Mar 5, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I agree with @rodiazet that a dedicated Yul builtin would be the most elegant solution to the general problem. We could basically fix the underlying design choice by having the builtin either always copy all returndata or cap the length to no more than returndatasize(). But I also think that this goes beyond the scope of a mere fix and I would not do that here. Feel free to report an issue for it though, it's worth considering as a feature.

@nikola-matic has a point that the current solution is brittle and it would better not to have to do this. We're risking introducing another bug if we don't properly account for all cases and that's tricky. My suggestion would be to try removing this optimization completely (in a new PR) and see how much effect that has. I suspect the optimization is actually largely ineffective and we don't need it.

I checked the code and found only a handful of situations where codegen is using returndatacopy():

  1. Explicit use in inline assembly.
  2. Copying the result of external calls that return dynamic types (via IRGeneratorForStatements::appendExternalFunctionCall).
  3. Bare external calls (via YulUtilFunctions::extractReturndataFunction())
  4. Checking bits of returndata in try/catch (selector, panic code, etc.)
  5. Forwarding reverts.
    • In this case returndata is always used so the optimization does not apply.
  6. The corresponding cases in evmasm codegen.
    • These don't go through Yul optimizer.
  7. EOF precompile calls.
    • These use hard-coded length.

Of the above, only 1-3 can really trigger the optimization. The problem is that in practice returndatacopy() for external calls is likely never removed anyway, because without memory objects we are unable to properly determine that the copied data is unused.

For example, if you compile this, you can still see returndatacopy() in the optimized IR output:

contract C {
    function f() external returns (uint[] memory) {}

    function testExternalCall() public {
        this.f();
    }

    function testBareCall() public {
        address(this).call("");
    }
}

Relevant part of testBareCall():

                    case 0x3ae2d4f4 {
                            // ...
                            data := memPtr
                            returndatacopy(add(memPtr, 0x20), 0, returndatasize())
                        }
                        return(0, 0)
                    }

Relevant part of testExternalCall():

                    case 0x82168021 {
                            // ...
                            let _5 := returndatasize()
                            returndatacopy(_3, 0, _5)
                            finalize_allocation(_3, _5)
                            // ...
                            let _7 := add(_3, offset)
                            // ...
                            let src := add(_7, 32)
                            for { } lt(src, srcEnd) { src := add(src, 32) }
                            {
                                mstore(dst, mload(src))
                                dst := add(dst, 32)
                            }
                        }
                        return(0, 0)
                    }

This means that this optimization does not really work for anything other than hand-crafted inline assembly. The comment in the implementation explicitly says that it was not aimed at that use case though.

Removing it will probably not change anything substantial. Our short-term plan for external calls is to optimize memory use at codegen level anyway. We will detect that the result is unused and not generate returndatacopy() in the first place.

Copy link
Copy Markdown
Contributor Author

@rodiazet rodiazet Mar 9, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

OK. Agree on removing this optimisation piece. Pushed the PR and the issue is already created.

@rodiazet rodiazet force-pushed the fix-opt-returndatacopy branch from 4e4e511 to 477266c Compare February 25, 2026 13:01
@argotorg argotorg deleted a comment from github-actions bot Mar 5, 2026
ExpressionJoiner::run(*m_context, block);
return block;
}},
{"unusedStoreEliminatorNoSsaTransform", [&]() {
Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
{"unusedStoreEliminatorNoSsaTransform", [&]() {
{"unusedStoreEliminatorNoSSA", [&]() {

ForLoopInitRewriter::run(*m_context, block);
ExpressionSplitter::run(*m_context, block);
UnusedStoreEliminator::run(*m_context, block);
SSAReverser::run(*m_context, block);
Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If you're not running SSATransform, you don't need to use SSAReverser to get it back to non-SSA either.

Comment on lines +447 to +452
if (
*instruction == evmasm::Instruction::STATICCALL ||
*instruction == evmasm::Instruction::CALL ||
*instruction == evmasm::Instruction::DELEGATECALL ||
*instruction == evmasm::Instruction::CALLCODE
)
Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is something that will need to be kept up to date as new EVM instructions are added (for example EOF would definitely have affected this). We should define a function for it in SemanticInformation.h. I'd call it invalidatesReturndata().

Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think it's also likely that we already have such a check hard-coded somewhere in the compiler. Please check. If we do, those other places should use the new helper too.

@github-actions
Copy link
Copy Markdown

This pull request is stale because it has been open for 14 days with no activity.
It will be closed in 7 days unless the stale label is removed.

@github-actions github-actions bot added the stale The issue/PR was marked as stale because it has been open for too long. label Mar 24, 2026
@github-actions
Copy link
Copy Markdown

This pull request was closed due to a lack of activity for 7 days after it was stale.

@nikola-matic
Copy link
Copy Markdown
Contributor

@rodiazet this can stay closed right - we're getting rid of the returndatacopy optimization completely here, right?

@rodiazet
Copy link
Copy Markdown
Contributor Author

rodiazet commented Apr 1, 2026

Yes. I wanted to close it after the removal PR is merged, but if it's closed already it does not make sense to reopen it.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

closed-due-inactivity codegen error Compiler generates invalid code. Critical. stale The issue/PR was marked as stale because it has been open for too long.

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants