2525.. _SYSVABI: https://github.com/ARM-software/abi-aa/releases
2626.. _ELFTLS: https://www.uclibc.org/docs/tls.pdf
2727.. _TLSDESC: http://www.fsfla.org/~lxoliva/writeups/TLS/RFC-TLSDESC-ARM.txt
28+ .. _TLSDESCRES: https://github.com/ARM-software/abi-aa/tree/main/design-documents/tlsdesc-resolvers.txt
2829
2930.. role:: c(code)
3031 :language: c
@@ -265,6 +266,8 @@ This document refers to, or is referred to by, the following documents.
265266 +-----------------+--------------------------------------------------------------+-----------------------------------------------------------------------------+
266267 | SYM-VER _ | http://people.redhat.com/drepper/symbol-versioning | GNU Symbol Versioning |
267268 +-----------------+--------------------------------------------------------------+-----------------------------------------------------------------------------+
269+ | TLSDESCRES _ | design-documents/tlsdesc-resolvers | TLSDESC resolver function examples |
270+ +-----------------+--------------------------------------------------------------+-----------------------------------------------------------------------------+
268271
269272Terms and Abbreviations
270273-----------------------
@@ -2268,8 +2271,12 @@ thread pointer and places it in a GOT entry. The GOT entry is
22682271relocated by dynamic relocation ``R_AARCH64_TLS_TPREL64``.
22692272
22702273A shared library that contains Initial Exec TLS must have the
2271- ``DF_STATIC_TLS`` dynamic tag set. An attempt to load a shared library
2272- with ``DF_STATIC_TLS`` via ``dlopen`` will be rejected.
2274+ ``DF_STATIC_TLS`` dynamic tag set. In the general case, an attempt to
2275+ load a shared library with ``DF_STATIC_TLS`` via ``dlopen`` will be
2276+ rejected. Some dynamic loaders implement a surplus of DTV slots that
2277+ permit a fixed number of ``DF_STATIC_TLS`` modules to be dynamically
2278+ loaded. Whether a DTV surplus is available, and how many slots it
2279+ provides, is implementation defined.
22732280
22742281Small Code model;
22752282
@@ -2430,7 +2437,7 @@ GOT entry, and the argument for the chosen resolver function in the
24302437second GOT entry.
24312438
24322439The AArch64 C and assembler examples are adapted from the AArch32
2433- TLSDESC_ paper. The C code below represents the TLS Descriptor.
2440+ `TLSDESC`_ paper. The C code below represents the TLS Descriptor.
24342441
24352442.. code-block:: c
24362443
@@ -2452,6 +2459,9 @@ The TLS resolver functions are not standardized by this ABI as they
24522459are internal to the dynamic linker. Programs must not directly refer
24532460to TLS resolver functions.
24542461
2462+ The `TLSDESCRES`_ document contains information on how a platform
2463+ might implement the resolver functions.
2464+
24552465Calling Convention
24562466^^^^^^^^^^^^^^^^^^
24572467
@@ -2467,117 +2477,6 @@ TLS resolver functions are not required to save any register added by
24672477an extension, such as the scalable vector registers or the SVE
24682478predicate registers. See `GCCML`_ for details.
24692479
2470- Example Resolver Functions
2471- ^^^^^^^^^^^^^^^^^^^^^^^^^^
2472-
2473- These examples are for illustrative purposes only. There is no
2474- requirement for any of the following resolver functions to be
2475- implemented. Due to the restrictions on calling convention, the
2476- resolver routines must be written in assembly language.
2477-
2478- Static TLS Specialization:
2479-
2480- When the TLS variable is in the static TLS block, the offset from the
2481- thread pointer is fixed at runtime. The dynamic loader can calculate
2482- the offset and place it in the TLS descriptor. All the static TLS
2483- resolver function needs to do is extract the offset and return it.
2484-
2485- .. code-block:: asm
2486-
2487- _dl_tlsdesc_return:
2488- // x0 contains pointer to struct tlsdesc.
2489- // tlsdesc.argument.value contains offset of variable from TP
2490- ldr x0, [x0, #8]
2491- ret
2492-
2493- Dynamic TLS Specialization:
2494-
2495- When the TLS variable is defined in dynamic TLS, the address of the TLS
2496- variable must be calculated by the resolver function using
2497- ``__tls_get_addr``. The resolver function returns the offset from the
2498- thread pointer by subtracting the address of the thread pointer from
2499- the address of the TLS variable. In practice, an implementation of the
2500- dynamic TLS resolver contains many platform-specific details outside
2501- of the scope of the ABI. An example of how a dynamic resolver might be
2502- implemented can be found in the Dynamic Specialization section of
2503- TLSDESC_.
2504-
2505- Undefined Weak Symbols:
2506-
2507- An undefined weak symbol has the value 0. The resolver function
2508- returns an offset from the thread pointer, so to produce a value of
2509- 0 it returns the negated value of the thread pointer, which
2510- cancels to 0 when the caller adds the offset to the thread
2511- pointer.
2512-
2513- .. code-block:: asm
2514-
2515- __dl_tlsdesc_undefweak:
2516- mrs x0, tpidr_el0
2517- neg x0, x0
2518- ret
2519-
2520- Lazy resolution of R_AARCH64_TLSDESC
2521- ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
2522-
2523- The TLSDESC_ paper describes an optional mechanism to resolve TLSDESC
2524- calls lazily. Lazy resolution for TLSDESC resolver functions is not
2525- recommended on AArch64. Additional synchronization is required for
2526- each TLSDESC call, which has a significant effect on performance. The
2527- text below describes the additional synchronization that is
2528- needed.
2529-
2530- Instead of fully resolving the ``R_AARCH64_TLSDESC`` relocation at
2531- module load time, a lazy resolver function runs on the first TLSDESC
2532- call. The lazy resolver updates the TLS Descriptor with the actual
2533- resolver function and the parameter to the actual resolver
2534- function. In a multi-threaded program when lazy TLS is in use, the
2535- resolver functions must ensure that the write to the parameter in the
2536- TLS descriptor has completed before reading it.
2537-
2538- .. code-block:: asm
2539-
2540- // Code to obtain the offset of var from thread pointer.
2541- // Loads the address of the resolver function into x1.
2542- // Places the address of the TLS Descriptor into x0.
2543- adrp x0, :tlsdesc:var
2544- ldr x1, [x0, #:tlsdesc_lo12:var]
2545- add x0, x0, #:tlsdesc_lo12:var
2546- .tlsdesccall var
2547- blr x1 // _dl_tlsdesc_return
2548-
2549- // Resolver function
2550- _dl_tlsdesc_return:
2551- // Load the parameter from the TLS descriptor. Without
2552- // synchronization this load can read an old value prior
2553- // to the lazy resolver's update to the descriptor completing.
2554- ldr x0, [x0, #8]
2555- ret
2556-
2557- The recommended way to ensure synchronization between the lazy
2558- resolver update of the TLS Descriptor and the actual resolver function
2559- accessing the TLS Descriptor is:
2560-
2561- * The TLS lazy resolver function uses a store release when updating
2562- the address of the resolver function in the TLS Descriptor.
2563-
2564- * The actual entry function uses a load acquire on the address of the
2565- resolver function, with a destination register of xzr.
2566-
2567- Referring to the example above, the code for the resolver function
2568- becomes:
2569-
2570- .. code-block:: asm
2571-
2572- // Resolver function
2573- _dl_tlsdesc_return:
2574- // Guaranteed to complete after the lazy resolver's store release
2575- // of the address in [x0].
2576- ldar xzr, [x0]
2577- // Access the parameter.
2578- ldr x0, [x0, #8]
2579- ret
2580-
25812480Libraries
25822481=========
25832482