Decompiler: Add builtin_memset optimizations#9232
Conversation
|
Interesting PR! I tried this on a debug xml generated from the sample provided in issue #9230: issue_9230_inlined_memcpy.xml undefined8 main(void)
{
char local_68 [12];
undefined4 uStack_5c;
undefined8 local_58;
undefined8 uStack_50;
undefined8 local_48;
undefined8 uStack_40;
undefined8 local_38;
undefined8 uStack_30;
undefined8 local_28;
undefined8 uStack_20;
undefined8 local_18;
undefined8 uStack_10;
undefined4 local_8;
local_18 = 0;
uStack_10 = 0;
local_28 = 0;
uStack_20 = 0;
local_38 = 0;
uStack_30 = 0;
local_48 = 0;
uStack_40 = 0;
local_58 = 0;
uStack_50 = 0;
local_68[0] = '\0';
local_68[1] = '\0';
local_68[2] = '\0';
local_68[3] = '\0';
local_68[4] = '\0';
local_68[5] = '\0';
local_68[6] = '\0';
local_68[7] = '\0';
local_68[8] = '\0';
local_68[9] = '\0';
local_68[10] = '\0';
local_68[0xb] = '\0';
uStack_5c = 0;
local_8 = 0;
printf(local_68);
local_18 = 0xaaaaaaaaaaaaaaaa;
uStack_10 = 0xaaaaaaaaaaaaaaaa;
local_28 = 0xaaaaaaaaaaaaaaaa;
uStack_20 = 0xaaaaaaaaaaaaaaaa;
local_38 = 0xaaaaaaaaaaaaaaaa;
uStack_30 = 0xaaaaaaaaaaaaaaaa;
local_48 = 0xaaaaaaaaaaaaaaaa;
uStack_40 = 0xaaaaaaaaaaaaaaaa;
local_58 = 0xaaaaaaaaaaaaaaaa;
uStack_50 = 0xaaaaaaaaaaaaaaaa;
builtin_memset(local_68,0xaa,0xc);
uStack_5c = 0xaaaaaaaa;
local_8 = 0xaaaaaaaa;
printf(local_68);
builtin_strncpy(local_68,"Test123Test1",0xc);
uStack_5c = CONCAT13(uStack_5c._3_1_,0x3332);
printf(local_68);
return 0;
}Even if I change the datatype of the Regardless of this issue, I think it's better to only introduce |
|
@LukeSerne, Thanks for testing this and for the detailed repro. I agree with your point. I’ve made a commit to drop builtin_bzero and emit builtin_memset(x, 0, len) for zero fills instead. That keeps the userop surface smaller and makes the output clearer. I also removed the builtin_bzero registration path, so both zero-filled and constant-filled regions now go through builtin_memset. If we later find a strong need for a separate builtin_bzero for some architecture-specific reason, we can revisit it, but the simpler model seems preferable here. |
|
As for why the zero-case was not detected, I think I found the root cause:
With this issue fixed, the example does decompile correctly: undefined8 main(void)
{
undefined4 uVar1;
char local_68 [100];
builtin_memset(local_68,0,100);
printf(local_68);
builtin_memset(local_68,0xaa,100);
printf(local_68);
uVar1 = local_68._12_4_;
builtin_strncpy(local_68,"Test123Test123",0xf);
local_68[0xf] = SUB41(uVar1,3);
printf(local_68);
return 0;
}The change I made to @@ -132,11 +132,18 @@ int4 ArraySequence::formByteArray(int4 sz,int4 slot,uint8 rootOff,bool bigEndian
int4 bigElSize = charType->getAlignSize();
int4 maxEl = used.size() / bigElSize;
int4 count;
+ bool is_all_zero = false;
for(count=0;count<maxEl;count += 1) {
uint1 val = used[ count * bigElSize ];
if (val != 1) { // Count number of characters not including null terminator
- if (val == 2)
+ if (val == 2) {
+ if (is_all_zero) continue;
+ if (count == 0) { // First byte is a null byte -> check for all-zeroes
+ is_all_zero = true;
+ continue;
+ }
count += 1; // Allow a single null terminator
+ }
break;
}
} |
|
@LukeSerne Thanks for reminding me. I'll remove the commits and update the PR title (I made a mistake when switching branches). |
61f9180 to
94da96d
Compare
|
Yeah, no problem 🙂. You should also change Also, it might be better to call |
Add decompiler built-ins for builtin_bzero and builtin_memset, which extends the constant sequence/string copy optimization pass, detecting repeated byte writes and collapse zero-filled or constant-filled memory regions (including patterns like zeroed buffers and non-printable fill bytes such as 0xAA) into the appropriate calls for improved output.
Closes #9230.