69 changes: 69 additions & 0 deletions main/acle.md
@@ -464,6 +464,7 @@ Armv8.4-A [[ARMARMv84]](#ARMARMv84). Support is added for the Dot Product intrin

* Added feature test macro for FEAT_SSVE_FEXPA.
* Added feature test macro for FEAT_CSSC.
* Added support for producer-consumer data placement hints.

### References

@@ -1826,6 +1827,13 @@ The `__ARM_FEATURE_SYSREG128` macro can only be implemented in the AArch64
execution state. Intrinsics for the use of these instructions are specified in
[Special register intrinsics](#special-register-intrinsics).

### Producer-consumer data placement hints

`__ARM_FEATURE_PCDPHINT` is defined to `1` if there is hardware
support for the producer-consumer data placement hints (FEAT_PCDPHINT)
instructions and if their associated intrinsics are
available.

Isn't `__ARM_FEATURE_PCDPHINT` defined (to 1) if there is compiler support? Hardware support is something else that can only be detected at runtime.
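
As a minimal sketch of how code might test for the proposed macro at compile time, the helper below reports whether the compiler defines `__ARM_FEATURE_PCDPHINT` for the current target. The helper name `have_pcdphint_intrinsics` is hypothetical and not part of the diff.

``` c
#include <stdbool.h>

/* Hypothetical helper (not part of the proposed ACLE text): returns true
 * when the compiler defines __ARM_FEATURE_PCDPHINT for this target. */
static bool have_pcdphint_intrinsics(void)
{
#if defined(__ARM_FEATURE_PCDPHINT)
    return true;
#else
    return false;
#endif
}
```

This is only a compile-time check. It does not probe the hardware the program eventually runs on.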

## Floating-point and vector hardware

### Hardware floating point
@@ -2604,6 +2612,7 @@ be found in [[BA]](#BA).
| [`__ARM_FEATURE_PAC_DEFAULT`](#pointer-authentication) | Pointer authentication protection | 0x5 |
| [`__ARM_FEATURE_PAUTH`](#pointer-authentication) | Pointer Authentication Extension (FEAT_PAuth) | 1 |
| [`__ARM_FEATURE_PAUTH_LR`](#pointer-authentication) | Armv9.5-A Enhancements to Pointer Authentication Extension (FEAT_PAuth_LR) | 1 |
| [`__ARM_FEATURE_PCDPHINT`](#producer-consumer-data-placement-hints) | Producer-consumer data placement hint instructions (FEAT_PCDPHINT) | 1 |
| [`__ARM_FEATURE_QBIT`](#q-saturation-flag) | Q (saturation) flag (32-bit-only) | 1 |
| [`__ARM_FEATURE_QRDMX`](#rounding-doubling-multiplies) | SQRDMLxH instructions and associated intrinsics availability | 1 |
| [`__ARM_FEATURE_RCPC`](#rcpc) | Release Consistent processor consistent Model (64-bit-only) | 1 |
@@ -3574,6 +3583,16 @@ as in `__pldx`.
`__pldx` and `__plix` arguments cache level and retention policy
are ignored on unsupported targets.

### Intent to read prefetch

``` c
void __ir(void const volatile *addr);
```

How about naming it `__pldir`, since it's a variant of data prefetch?

Author

I like it. My only concern would be the lack of consistency with the spec for the instruction (https://developer.arm.com/documentation/ddi0602/2025-06/Base-Instructions/PRFM--immediate---Prefetch-memory--immediate--?lang=en).
I don't know how much that matters if we prefer the `__pldir` name for ease of understanding in the compiler.

Generates an intent to read on update prefetch instruction. The argument should
be any expression that may designate a data address. This intrinsic does
not require specification of cache level or retention policy. Support for this
intrinsic is indicated by `__ARM_FEATURE_PCDPHINT`.
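
A possible use of the intrinsic as proposed in this diff, sketched under assumptions: the wrapper `hint_intent_to_read` and the function `consume_flag` are hypothetical names, and `__ir` is assumed to be declared by `<arm_acle.h>` like the other prefetch intrinsics. When `__ARM_FEATURE_PCDPHINT` is not defined, the wrapper does nothing.

``` c
#include <stdint.h>

#if defined(__ARM_FEATURE_PCDPHINT)
#include <arm_acle.h> /* Assumed to declare __ir, as it does __pld. */
#endif

/* Hypothetical wrapper: issue the intent-to-read hint when available. */
static inline void hint_intent_to_read(const volatile void *addr)
{
#if defined(__ARM_FEATURE_PCDPHINT)
    __ir(addr);
#else
    (void)addr; /* No-op when the intrinsic is unavailable. */
#endif
}

/* Hypothetical consumer: hint first, then read a value that a producer
 * on another PE is expected to update. */
static uint32_t consume_flag(const volatile uint32_t *flag)
{
    hint_intent_to_read(flag);
    return *flag;
}
```

The hint is advisory, so the value read is the same with or without it.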

## NOP

``` c
@@ -4746,6 +4765,56 @@ stored to memory is modified by replacing the low 32 bits of
`value.val[0]` with the contents of the `ACCDATA_EL1` system register.
The returned value is the same as for `__arm_st64bv`.

## Atomic store with PCDPHINT intrinsics

These intrinsics provide an atomic store that uses the `STSHH` hint
instruction immediately followed by the associated store instruction.
They are available when `__ARM_FEATURE_PCDPHINT` is defined.

To access these intrinsics, `<arm_acle.h>` should be included.

``` c
void __arm_stshh_u8(void const volatile *addr,
uint8_t data,
unsigned int mem, /* Memory order */
unsigned int ret); /* Retention Policy */
```

The target of a store shouldn't be const qualified? And why is it volatile qualified?
And why not use the proper data type for the target pointer?
GCC atomic builtins use plain int for the memorder parameter. See https://gcc.gnu.org/onlinedocs/gcc/_005f_005fatomic-Builtins.html
`void __arm_stshh_u8(uint8_t *addr, uint8_t data, int memorder, int retpol);`

``` c
void __arm_stshh_u16(void const volatile *addr,
uint16_t data,
unsigned int mem, /* Memory order */
unsigned int ret); /* Retention Policy */
```

``` c
void __arm_stshh_u32(void const volatile *addr,
uint32_t data,
unsigned int mem, /* Memory order */
unsigned int ret); /* Retention Policy */
```

``` c
void __arm_stshh_u64(void const volatile *addr,
uint64_t data,
unsigned int mem, /* Memory order */
unsigned int ret); /* Retention Policy */
```
The first argument in these intrinsics is a pointer `addr` containing an address.
The second argument `data` is the data which is to be stored.
The third and fourth arguments, `mem` and `ret`, can contain the following values.

| **Memory Order** | **Value** | **Summary** |
| ------------------| --------- | -------------------------------------------------------------------------------------------------- |
| Relaxed | 0 | No constraints imposed on other reads or writes, only this operation's atomicity is guaranteed. |
| Release | 1 | No reads or writes in the current thread can be reordered after this store. |

I don't know if the ACLE has some precedent here but otherwise, I would expect us to use GCC's `__ATOMIC_SEQ_CST`, `__ATOMIC_RELEASE` and `__ATOMIC_RELAXED` names and values. But perhaps the actual values are not stable and thus not suitable for ACLE?

| **Retention Policy** | **Value** | **Summary** |
| -------------------- | --------- | -------------------------------------------------------------------------------- |
| KEEP | 0 | Signals to retain the updated location in the local cache of the updating PE. |
| STRM | 1 | Signals to not retain the updated location in the local cache of the updating PE. |
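
The following sketch shows how a producer might call one of these intrinsics with the signature proposed in this diff. The enumerator names and the `publish_u32` helper are hypothetical; the diff specifies only the numeric values in the tables above. The fallback branch uses the GCC/Clang `__atomic_store_n` builtin as a plain release store without any placement hint, which is an assumption made so the example compiles on targets without `__ARM_FEATURE_PCDPHINT`.

``` c
#include <stdint.h>

#if defined(__ARM_FEATURE_PCDPHINT)
#include <arm_acle.h>
#endif

/* Hypothetical names for the numeric values in the tables above. */
enum { STSHH_MEM_RELAXED = 0, STSHH_MEM_RELEASE = 1 };
enum { STSHH_RET_KEEP = 0, STSHH_RET_STRM = 1 };

/* Hypothetical producer helper: store `value` with release ordering and
 * signal that the location need not be retained in the local cache. */
static void publish_u32(uint32_t *slot, uint32_t value)
{
#if defined(__ARM_FEATURE_PCDPHINT)
    __arm_stshh_u32(slot, value, STSHH_MEM_RELEASE, STSHH_RET_STRM);
#else
    /* Fallback: release store only, no data placement hint. */
    __atomic_store_n(slot, value, __ATOMIC_RELEASE);
#endif
}
```

The `STRM` policy is chosen here on the assumption that the producer will not read the value back soon; a producer that does would pass `STSHH_RET_KEEP` instead.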

# Custom Datapath Extension

The intrinsics in this section provide access to instructions in the