[aaelf64] Clarify relocation optimization [issue #328] #352
Make it clearer that a GOT sequence can only be optimized if all GOT relocations to the symbol are part of a sequence.
|
Can I check what you mean by I think you mean that in each of these three pairs of GOT relocations to |
|
No - as described, all must be part of a sequence in order to be optimizable, since that is proof they are independent. However, not all need to be optimized (e.g. if out of range). So in your example only the 3rd ADRP is not a valid sequence, and thus it blocks the optimization for the other 2. An alternative would be to optimize all or nothing without considering pairs. But then a single ADRP that ends up out of range would block all optimizations, making it unsuitable for the medium/large model. Also, keeping the ADRP/LDR as pairs results in better code quality. |
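The rule in this comment can be sketched in Python. This is a minimal illustration under stated assumptions, not linker code: the `Reloc` record, the `in_range` callback and the function name `relaxable_pairs` are hypothetical, pairing is judged only by relocations on adjacent 4-byte instructions, and the register checks on the instructions themselves are omitted.

```python
from dataclasses import dataclass
from typing import Callable, List, Tuple

GOT_PAGE = "R_<CLS>_ADR_GOT_PAGE"
GOT_LO12 = "R_<CLS>_LD64_GOT_LO12_NC"


@dataclass(frozen=True)
class Reloc:
    # Hypothetical, simplified relocation record (not a real linker type).
    section: str
    offset: int      # byte offset of the relocated instruction
    rtype: str       # GOT_PAGE or GOT_LO12
    addend: int = 0


def relaxable_pairs(
    relocs: List[Reloc], in_range: Callable[[Reloc], bool]
) -> List[Tuple[Reloc, Reloc]]:
    """Return the (ADRP, LDR) relocation pairs that may be relaxed.

    Assumption: `relocs` holds every GOT_PAGE/GOT_LO12 relocation that
    targets one symbol. If any of them is not part of an adjacent pair,
    nothing is relaxable for that symbol.
    """
    ordered = sorted(relocs, key=lambda r: (r.section, r.offset))
    pairs = []
    i = 0
    while i < len(ordered):
        first = ordered[i]
        second = ordered[i + 1] if i + 1 < len(ordered) else None
        is_pair = (
            second is not None
            and first.rtype == GOT_PAGE
            and second.rtype == GOT_LO12
            and second.section == first.section
            and second.offset == first.offset + 4
            and first.addend == 0
            and second.addend == 0
        )
        if not is_pair:
            return []  # a single stray relocation blocks the whole symbol
        pairs.append((first, second))
        i += 2
    # Range is decided per pair: an out-of-range pair is simply left alone.
    return [pair for pair in pairs if in_range(pair[0])]
```

The point of the two-step structure is that the "all relocations are in a sequence" test applies to the symbol as a whole, while the range test only filters individual pairs.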
|
I think the problem observed in llvm/llvm-project#138254 could still happen with valid sequences, with at least one out of range. I agree it would be vanishingly unlikely and would be more difficult to check for. I'd like to think about the wording and will make some suggestions, hopefully tomorrow. |
|
No, it can't happen with only valid sequences: the value of the ADRP can never leak, because the LDR in the next instruction overwrites it. Thus it is impossible to branch into the middle of a sequence. |
|
OK, I see that in the problem example the destination registers in the ADRP and LDR are different so this would be an invalid sequence even if the relocations were consecutive. This does mean that it is insufficient to just look at the relocations. For example if I hand edit the LLVM example then all the relocations to I guess it means we'll need to be clear about what we mean by |
|
The basic conditions explained above obviously still apply:
I thought the specification was clear enough... Perhaps we need to explain the optimization algorithm in pseudo code step-by-step? |
smithp35
left a comment
Reading through again, I think we've got a set of "base" or "common" conditions on line 1425 for all sequences. Then there are "specific" conditions for each individual sequence; for the Large GOT indirection, these start at line 1431.
From your comment above about one of the sequences being out of range being OK, I think we'll need to make sure that the definition of a "valid" sequence doesn't include the condition that ``symbol`` is within range of the ``R_<CLS>_ADR_PREL_PG_HI21`` relocation.
I think some examples will help.
| - The relocations do not appear separately or in a different order. | ||
| In this case each set of relocations is independent and may be optimized. The following sequences are defined: | ||
| The following sequences are defined: |
Line 1425 above (can't put the comment on the line). I suggest we insert a "base" before conditions
"if all the following base conditions are true."
| - ``symbol`` does not have a ``st_shndx`` of ``SHN_ABS`` or the output is not required to be position independent. | ||
| - ``symbol`` is within range of the ``R_<CLS>_ADR_PREL_PG_HI21`` relocation. | ||
| - The addend of both relocations is zero. | ||
| - All ``R_<CLS>_ADR_GOT_PAGE`` and ``R_<CLS>_LD64_GOT_LO12_NC`` relocations to ``symbol`` are part of a sequence. |
I think this can be combined with the line below and removed from the specific conditions. For example
If the R_<CLS>_ADR_GOT_PAGE and R_<CLS>_LD64_GOT_LO12_NC relocations to symbol are used outside a sequence that satisfies the base conditions then the Large GOT indirection optimization is not legal for symbol.
Examples that prevent the Large GOT indirection optimization for symbol:
::

    // Different destination register
    ADRP x0, :got: symbol             // R_<CLS>_ADR_GOT_PAGE
    LDR  x1, [x0 :got_lo12: symbol]   // R_<CLS>_LD64_GOT_LO12_NC

    // Instructions not in sequence.
    ADRP x0, :got: symbol             // R_<CLS>_ADR_GOT_PAGE
    NOP
    LDR  x0, [x0 :got_lo12: symbol]   // R_<CLS>_LD64_GOT_LO12_NC
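
Both failing examples can be caught by looking at the instruction words as well as the relocations. The following Python sketch is only an illustration of that check: the helper names are hypothetical, it assumes the two relocated instructions are passed in as 32-bit words with their byte offsets, and it reads "valid sequence" as "ADRP immediately followed by a 64-bit LDR (unsigned immediate) whose base and destination are the ADRP destination register".

```python
from typing import Optional, Tuple


def adrp_rd(insn: int) -> Optional[int]:
    """Return Rd if `insn` encodes ADRP, else None."""
    # ADRP: bit 31 set, bits 28:24 == 0b10000, Rd in bits 4:0.
    if (insn & 0x9F000000) == 0x90000000:
        return insn & 0x1F
    return None


def ldr64_regs(insn: int) -> Optional[Tuple[int, int]]:
    """Return (Rt, Rn) if `insn` is LDR Xt, [Xn, #imm12], else None."""
    # 64-bit LDR (immediate, unsigned offset): top ten bits 0b1111100101.
    if (insn & 0xFFC00000) == 0xF9400000:
        return insn & 0x1F, (insn >> 5) & 0x1F
    return None


def is_valid_got_sequence(
    adrp_insn: int, adrp_offset: int, ldr_insn: int, ldr_offset: int
) -> bool:
    """Hypothetical check: adjacent ADRP/LDR that both use one register."""
    rd = adrp_rd(adrp_insn)
    regs = ldr64_regs(ldr_insn)
    if rd is None or regs is None:
        return False
    rt, rn = regs
    # The LDR must directly follow the ADRP and overwrite its result,
    # so the page address cannot be observed by any other instruction.
    return ldr_offset == adrp_offset + 4 and rd == rn == rt
```

With this reading, the first example fails because Rt differs from Rd, and the second fails because the LDR is 8 bytes after the ADRP rather than 4.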
A linker may avoid creating a GOT entry if no other GOT relocations exist for the symbol.