Fix negative calldata offset validations#16522
Fix negative calldata offset validations#16522cameel wants to merge 2 commits intoreorganize-calldata-semantic-testsfrom
Conversation
|
Note that this is still work in progress. For now it managed to add coverage that detects the inconsistencies and track one of the bugs (the one that caused evmasm and IR differences). The fix for that bug is still incomplete. The other inconsistencies, especially ones resulting from not using signed comparisons are not yet addressed. |
There was a problem hiding this comment.
The _abi_encode.sol test variants do not add much. I will probably remove them unless someone thinks they are are worth keeping.
Each one has a corresponding, slightly simpler test that just returns the value as calldata. I wrote the _abi_encode.sol ones first and only later realized that returning calldata will not trigger decoding either.
| } | ||
| m_context.callYulFunction( | ||
| m_context.utilFunctions().conversionFunction(typeOnStack, targetType), | ||
| typeOnStack.isDynamicallySized() ? 2 : 1, |
There was a problem hiding this comment.
This is the fix for the first bug. The problem here is that the evmasm codegen is sometimes still using code written for ABI coder v1 even when v2 is selected. This is fine in terms of correctness of decoding, but does not account for the validations, since most of them were only introduced in v2.
This will happen for example when you try to access a bytes element of bytes[] calldata. Note that this is not even a v1 type, but the routine kicks in, because we're only accessing bytes inside it and that one is.
I'm not yet sure this change (i.e. replacing everything with a call to v2) is the right solution here. It will likely significantly increase the cost and still needs adjustment for types with a different number of slots (assertions are failing in tests). It at least fix the validations though.
| // accessTop(bytes[][][]): 0x20, 1, 0x20, 1, 0x20, 1, 0x20, 1, "\xff" -> 0x20, 1, 0x20, 1, 0x20, 1, 0x20, 1, "\xff" | ||
| // accessTop(bytes[][][]): 0x20, 1, 0x20, 1, 0x20, 1, -32 -> FAILURE, hex"08c379a0", 0x20, 43, "ABI decoding: invalid calldata a", "rray offset" | ||
| // accessTop(bytes[][][]): 0x20, 1, 0x20, 1, -32 -> FAILURE, hex"08c379a0", 0x20, 43, "ABI decoding: invalid calldata a", "rray offset" | ||
| // accessTop(bytes[][][]): 0x20, 1, -32 -> FAILURE, hex"08c379a0", 0x20, 43, "ABI decoding: invalid calldata a", "rray offset" | ||
|
|
||
| // accessMiddle(bytes[][][]): 0x20, 1, 0x20, 1, 0x20, 1, 0x20, 1, "\xff" -> 0x20, 1, 0x20, 1, 0x20, 1, "\xff" | ||
| // accessMiddle(bytes[][][]): 0x20, 1, 0x20, 1, 0x20, 1, -32 -> FAILURE, hex"08c379a0", 0x20, 43, "ABI decoding: invalid calldata a", "rray offset" | ||
| // accessMiddle(bytes[][][]): 0x20, 1, 0x20, 1, -32 -> FAILURE, hex"08c379a0", 0x20, 43, "ABI decoding: invalid calldata a", "rray offset" | ||
| // accessMiddle(bytes[][][]): 0x20, 1, -32 -> FAILURE, hex"08c379a0", 0x20, 43, "ABI decoding: invalid calldata a", "rray offset" | ||
|
|
||
| // accessBottom(bytes[][][]): 0x20, 1, 0x20, 1, 0x20, 1, 0x20, 1, "\xff" -> 0x20, 1, 0x20, 1, "\xff" | ||
| // accessBottom(bytes[][][]): 0x20, 1, 0x20, 1, 0x20, 1, -32 -> FAILURE, hex"08c379a0", 0x20, 43, "ABI decoding: invalid calldata a", "rray offset" | ||
| // accessBottom(bytes[][][]): 0x20, 1, 0x20, 1, -32 -> FAILURE, hex"08c379a0", 0x20, 43, "ABI decoding: invalid calldata a", "rray offset" | ||
| // accessBottom(bytes[][][]): 0x20, 1, -32 -> FAILURE, hex"08c379a0", 0x20, 43, "ABI decoding: invalid calldata a", "rray offset" | ||
|
|
||
| // accessContent(bytes[][][]): 0x20, 1, 0x20, 1, 0x20, 1, 0x20, 1, "\xff" -> 0x20, 1, "\xff" | ||
| // accessContent(bytes[][][]): 0x20, 1, 0x20, 1, 0x20, 1, -32 -> 0x20, 1, "\xff" | ||
| // accessContent(bytes[][][]): 0x20, 1, 0x20, 1, -32 -> 0x20, 1, "\xff" | ||
| // accessContent(bytes[][][]): 0x20, 1, -32 -> 0x20, 1, "\xff" |
There was a problem hiding this comment.
This is the second bug. The input has negative offsets, but still pointing within calldata. None of these should result in a revert.
| // accessTop(bytes[][][]): 0x20, 1, 0x20, 1, 0x20, 1, 0x20, 0 -> 0x20, 1, 0x20, 1, 0x20, 1, 0x20, 0 | ||
| // accessTop(bytes[][][]): 0x20, 1, 0x20, 1, 0x20, 1, -999 -> 0x20, 1, 0x20, 1, 0x20, 1, 0x20, 0 | ||
| // accessTop(bytes[][][]): 0x20, 1, 0x20, 1, -999 -> 0x20, 1, 0x20, 1, 0x20, 0 | ||
| // accessTop(bytes[][][]): 0x20, 1, -999 -> 0x20, 1, 0x20, 0 | ||
| // accessTop(bytes[][][]): -999 -> FAILURE, hex"08c379a0", 0x20, 34, "ABI decoding: invalid tuple offs", "et" | ||
|
|
||
| // accessMiddle(bytes[][][]): 0x20, 1, 0x20, 1, 0x20, 1, 0x20, 0 -> 0x20, 1, 0x20, 1, 0x20, 0 | ||
| // accessMiddle(bytes[][][]): 0x20, 1, 0x20, 1, 0x20, 1, -999 -> 0x20, 1, 0x20, 1, 0x20, 0 | ||
| // accessMiddle(bytes[][][]): 0x20, 1, 0x20, 1, -999 -> 0x20, 1, 0x20, 0 | ||
| // accessMiddle(bytes[][][]): 0x20, 1, -999 -> 0x20, 0 | ||
| // accessMiddle(bytes[][][]): -999 -> FAILURE, hex"08c379a0", 0x20, 34, "ABI decoding: invalid tuple offs", "et" | ||
|
|
||
| // accessBottom(bytes[][][]): 0x20, 1, 0x20, 1, 0x20, 1, 0x20, 0 -> 0x20, 1, 0x20, 0 | ||
| // accessBottom(bytes[][][]): 0x20, 1, 0x20, 1, 0x20, 1, -999 -> 0x20, 1, 0x20, 0 | ||
| // accessBottom(bytes[][][]): 0x20, 1, 0x20, 1, -999 -> 0x20, 0 | ||
| // accessBottom(bytes[][][]): 0x20, 1, -999 -> FAILURE, hex"4e487b71", 0x32 | ||
| // accessBottom(bytes[][][]): -999 -> FAILURE, hex"08c379a0", 0x20, 34, "ABI decoding: invalid tuple offs", "et" | ||
|
|
||
| // accessContent(bytes[][][]): 0x20, 1, 0x20, 1, 0x20, 1, 0x20, 0 -> 0x20, 0 | ||
| // accessContent(bytes[][][]): 0x20, 1, 0x20, 1, 0x20, 1, -999 -> 0x20, 0 | ||
| // accessContent(bytes[][][]): 0x20, 1, 0x20, 1, -999 -> FAILURE, hex"4e487b71", 0x32 | ||
| // accessContent(bytes[][][]): 0x20, 1, -999 -> FAILURE, hex"4e487b71", 0x32 | ||
| // accessContent(bytes[][][]): -999 -> FAILURE, hex"08c379a0", 0x20, 34, "ABI decoding: invalid tuple offs", "et" |
There was a problem hiding this comment.
This is essentially the bug from the report. The output here should match negative_offset_negative_tail_position_dynamic_array.sol, i.e. every case except the first one in each group (which is there only to show what the output looks like in the equivalent non-negative-offset case) should revert.
| // stack: <mem start> <source ref> (variably sized) <length> <mem data pos> | ||
| if (targetType.baseType()->isValueType()) | ||
| { | ||
| copyToStackTop(2 + stackSize, stackSize); | ||
| ArrayUtils(m_context).copyArrayToMemory(typeOnStack); | ||
| } | ||
| else | ||
| { | ||
| m_context << u256(0) << Instruction::SWAP1; | ||
| // stack: <mem start> <source ref> (variably sized) <length> <counter> <mem data pos> |
There was a problem hiding this comment.
The else here is actually yet another case that I only discovered while doing the fix - I don't have test coverage for it yet. It's the case with a dynamic array with statically-encoded elements that are not value types. E.g. S[] calldata, where S is a static struct.
| // ==== | ||
| // revertStrings: debug | ||
| // ---- | ||
| // accessAndReturn(((uint256,bytes,uint256))): 32, -100 -> FAILURE, hex"08c379a0", 0x20, 39, "ABI decoding: invalid byte array", " length" |
There was a problem hiding this comment.
There is a fourth bug. This one may result in accepting input that will result in reading beyond calldatasize(). This one is not covered by tests yet, but the problem is the faulty condition that produces a revert with this message.
The condition is:
solidity/libsolidity/codegen/ABIFunctions.cpp
Line 1291 in a99b6d8
Firstly, the error message is inaccurate. Note that it's not only invalid length that may cause this. It can also be the offset. If the offset is negative enough, the absolute start position ends up past the end due to underflow and the (unsigned) comparison fails.
However, when src is negative, but src + length still positive, the condition may be satisfied. This is wrong, because the initial part of the string is likely past calldatasize() in that case.
682dc66 to
13dfe30
Compare
d45a45e to
0df51f7
Compare
…ys with statically-encoded base type
0df51f7 to
8b17183
Compare
13dfe30 to
20ef674
Compare
|
This pull request is stale because it has been open for 14 days with no activity. |
Depends on #16483.
Fixes inconsistent offset validations in ABI coder v2. These sometimes miss cases that result in reading past
calldatasize()and sometimes reject ones where negative offsets do not cause that and are therefore valid.This is not a single bug, but rather a combination of multiple things. In some cases behavior even differs between evmasm and IR pipeline.