fix: Hide linker-defined symbols in shared libraries to match GNU ld #1686

Open
Blazearth wants to merge 1 commit into wild-linker:main from Blazearth:fix-hide-linker-symbols-in-shared-libs

@Blazearth
Contributor

Addresses #1508 (comment)

GNU ld suppresses linker-defined layout symbols (etext, edata, end, etc.) when producing shared objects. Wild previously emitted them as GLOBAL DEFAULT, causing them to appear as exported ABI symbols in .dynsym. This change hides them when linking shared libraries so they remain local in .symtab but are not exported.

Member

@davidlattimore left a comment

Thanks for looking at this.

Could you add an integration test? I'd suggest a test that produces a shared object (not an executable). You can do that with something like:

//#LinkArgs:-shared -z now
//#RunEnabled:false

You could then add some //#ExpectSym or //#ExpectDynSym assertions. You might need to extend the symbol properties that can be asserted to support asserting the symbol visibility.
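
For context on what a visibility assertion would have to inspect: in an ELF symbol table entry, the binding (LOCAL, GLOBAL, WEAK) lives in the high four bits of `st_info`, and the visibility (DEFAULT, INTERNAL, HIDDEN, PROTECTED) lives in the low two bits of `st_other`. The following is a minimal sketch of decoding both. The type and function names are hypothetical and are not Wild's test-harness API.

```rust
/// ELF symbol binding, stored in the high 4 bits of `st_info`.
#[derive(Debug, PartialEq, Eq, Clone, Copy)]
pub enum Binding {
    Local,
    Global,
    Weak,
    /// OS- or processor-specific values that this sketch does not name.
    Other(u8),
}

/// ELF symbol visibility, stored in the low 2 bits of `st_other`.
#[derive(Debug, PartialEq, Eq, Clone, Copy)]
pub enum Visibility {
    Default,
    Internal,
    Hidden,
    Protected,
}

/// Decodes the binding from a raw `st_info` byte (hypothetical helper).
pub fn binding(st_info: u8) -> Binding {
    match st_info >> 4 {
        0 => Binding::Local,
        1 => Binding::Global,
        2 => Binding::Weak,
        other => Binding::Other(other),
    }
}

/// Decodes the visibility from a raw `st_other` byte (hypothetical helper).
pub fn visibility(st_other: u8) -> Visibility {
    match st_other & 0x3 {
        0 => Visibility::Default,
        1 => Visibility::Internal,
        2 => Visibility::Hidden,
        _ => Visibility::Protected,
    }
}
```

Note that the `readelf` columns map onto these two fields separately: a symbol can be `LOCAL DEFAULT`, which is a local binding with default visibility, so an assertion on visibility alone would not distinguish it from `GLOBAL DEFAULT`.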

symbols.section_end(output_section_id::TEXT, "__etext");
// GNU ld doesn't emit these symbols in shared libraries, so we hide them
if output_kind.is_shared_object() {
    symbols.section_end(output_section_id::TEXT, "etext").hide();

Member

I think it might read slightly nicer if we had a set_hidden function that took a bool, then we could do something like the following and avoid repeating all the symbols:

symbols.section_end(output_section_id::TEXT, "etext").set_hidden(hidden);
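
One way the suggested `set_hidden` could look, as a self-contained sketch. `Symbols`, `SymbolDef`, and `define_layout_symbols` here are hypothetical stand-ins, not Wild's real types, and `section_end` is simplified to take only a name (the real call in the diff also takes an output section ID).

```rust
/// Hypothetical stand-in for a linker-defined symbol definition.
#[derive(Debug, Default)]
pub struct SymbolDef {
    pub name: &'static str,
    pub hidden: bool,
}

impl SymbolDef {
    /// Sets whether the symbol is hidden. Returns `&mut Self` so that
    /// further configuration calls can be chained.
    pub fn set_hidden(&mut self, hidden: bool) -> &mut Self {
        self.hidden = hidden;
        self
    }
}

/// Hypothetical stand-in for the symbol builder used in the diff.
#[derive(Debug, Default)]
pub struct Symbols {
    pub defs: Vec<SymbolDef>,
}

impl Symbols {
    /// Registers a symbol marking the end of a section and returns it
    /// for further configuration. Simplified: no section ID argument.
    pub fn section_end(&mut self, name: &'static str) -> &mut SymbolDef {
        self.defs.push(SymbolDef { name, hidden: false });
        self.defs.last_mut().expect("just pushed")
    }
}

/// Each symbol is listed once; the output kind only decides the flag.
pub fn define_layout_symbols(symbols: &mut Symbols, is_shared_object: bool) {
    let hidden = is_shared_object;
    symbols.section_end("etext").set_hidden(hidden);
    symbols.section_end("edata").set_hidden(hidden);
    symbols.section_end("end").set_hidden(hidden);
}
```

Compared with an `if output_kind.is_shared_object() { ... } else { ... }` pair, this keeps a single list of symbol names, so adding or renaming a symbol cannot leave the two branches out of sync.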

@mati865
Member

mati865 commented Mar 12, 2026

Those were symtab entries:

4d89610e445f ~/wild-proxy # readelf -Ws /tmp/x86-64-v3-gggl-lies.so.wild | grep -E 'edata|end|etext|:$'
Symbol table '.dynsym' contains 12 entries:
Symbol table '.symtab' contains 65 entries:
     3: 00000000000080f8     0 NOTYPE  LOCAL  DEFAULT   16 __init_array_end
     5: 0000000000008100     0 NOTYPE  LOCAL  DEFAULT   17 __fini_array_end
     6: 00000000000093c9     0 NOTYPE  LOCAL  DEFAULT   23 __end
    51: 00000000000070c0     0 NOTYPE  GLOBAL DEFAULT   13 etext
    52: 00000000000070c0     0 NOTYPE  GLOBAL DEFAULT   13 _etext
    53: 00000000000070c0     0 NOTYPE  GLOBAL DEFAULT   13 __etext
    54: 00000000000093c9     0 NOTYPE  GLOBAL DEFAULT   23 end
    55: 00000000000093c9     0 NOTYPE  GLOBAL DEFAULT   23 _end
    57: 00000000000093c8     0 NOTYPE  GLOBAL DEFAULT   21 edata
    58: 00000000000093c8     0 NOTYPE  GLOBAL DEFAULT   21 _edata

After some testing, I see that GNU ld always emits _edata and _end into .symtab for executables but GCs them for shared objects. LLD always GCs them.
GC in this case means removing symbols that are not referenced by any exported symbol.

Secondly, Wild never puts them into .dynsym, and with this change they will become local even when referenced.
Consider this example:

❯ bat -p a.c
extern int _edata;

int foo() {
  return _edata;
}

❯ gcc a.c -fpic -shared -o a.ld

❯ gcc a.c -fpic -shared -o a.lld -fuse-ld=lld

❯ gcc a.c -fpic -shared -o a.wild -B ~/Projects/wild/fakes-debug/

❯ readelf -Ws ./a.ld | rg 'edata$|end$|etext$|:$'
Symbol table '.dynsym' contains 7 entries:
     5: 0000000000004008     0 NOTYPE  GLOBAL DEFAULT   19 _edata
Symbol table '.symtab' contains 26 entries:
    21: 0000000000004008     0 NOTYPE  GLOBAL DEFAULT   19 _edata

❯ readelf -Ws ./a.lld | rg 'edata$|end$|etext$|:$'
Symbol table '.dynsym' contains 7 entries:
     6: 00000000000037f0     0 NOTYPE  GLOBAL DEFAULT   20 _edata
Symbol table '.symtab' contains 23 entries:
    22: 00000000000037f0     0 NOTYPE  GLOBAL DEFAULT   20 _edata

❯ readelf -Ws ./a.wild | rg 'edata$|end$|etext$|:$'
Symbol table '.dynsym' contains 6 entries:
Symbol table '.symtab' contains 33 entries:
     3: 0000000000002d18     0 NOTYPE  LOCAL  DEFAULT   14 __init_array_end
     5: 0000000000002d20     0 NOTYPE  LOCAL  DEFAULT   15 __fini_array_end
     6: 0000000000001cdf     0 NOTYPE  LOCAL  DEFAULT   11 etext
     7: 0000000000001cdf     0 NOTYPE  LOCAL  DEFAULT   11 _etext
     8: 0000000000001cdf     0 NOTYPE  LOCAL  DEFAULT   11 __etext
     9: 0000000000003f99     0 NOTYPE  LOCAL  DEFAULT   21 end
    10: 0000000000003f99     0 NOTYPE  LOCAL  DEFAULT   21 _end
    11: 0000000000003f99     0 NOTYPE  LOCAL  DEFAULT   21 __end
    12: 0000000000003f98     0 NOTYPE  LOCAL  DEFAULT   19 edata
    13: 0000000000003f98     0 NOTYPE  LOCAL  DEFAULT   19 _edata

So, we might want to settle on the least bad workaround for now, and rework the concept later.
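
To make the GC rule described above concrete, here is a rough sketch of the filtering step. It is not how GNU ld, LLD, or Wild implement it. The function name is hypothetical, and "referenced" is simplified to a flat set of names rather than reachability from exported symbols.

```rust
use std::collections::HashSet;

/// Returns the linker-defined symbols that would survive GC in a shared
/// object: only those that something in the input actually references.
/// Hypothetical helper; a real linker would track references during
/// symbol resolution rather than filtering by name afterwards.
pub fn retained_layout_symbols<'a>(
    linker_defined: &[&'a str],
    referenced: &HashSet<&str>,
) -> Vec<&'a str> {
    linker_defined
        .iter()
        .copied()
        .filter(|name| referenced.contains(*name))
        .collect()
}
```

Applied to the `a.c` example, where only `_edata` is referenced, this would retain `_edata` and drop the rest, which matches what the `a.ld` and `a.lld` outputs show. In those outputs the retained symbol stays GLOBAL DEFAULT and also appears in .dynsym.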

@Blazearth
Contributor Author

Fair enough. I will add an integration test that links a shared object and checks that linker-defined symbols don't appear in .dynsym unless they are referenced. I will also look into extending the symbol assertions to support checking visibility if needed.
@mati865
Thanks for the detailed explanation and example.
You're right: with the current change these symbols become LOCAL in shared objects even if referenced, which differs from GNU ld / LLD. For now this mainly avoids leaking unused layout symbols into the exported ABI (which caused the babl failure). A more complete fix would likely involve proper GC of these linker-defined symbols.
