Description
Currently, `shuffle1_dyn` is a safe function, but its behavior is undefined if an index is out-of-bounds: it will `panic!` on some architectures and return an unspecified lane value on others. This is a bug, because it allows safe Rust to have UB.
- We should add an `unsafe` `shuffle1_dyn_unchecked` API that has undefined behavior if any index is out-of-bounds and not `0x80`.
- We should change `shuffle1_dyn` to `panic!` if any index is out-of-bounds and not `0x80` (that is, `shuffle1_dyn` checks the indices and calls `shuffle1_dyn_unchecked`).
- Both APIs should set the resulting lane to `0` if the index is `0x80`; we can provide a `const SHUFFLE1_DYN_ZERO: usize = 0x80;` constant for this.
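The proposal above can be sketched with a scalar model. This is only an illustration under stated assumptions: it uses plain `[u8; 16]` arrays instead of the real vector types, the `LANES` constant is hypothetical, and the signatures are guesses at what the two functions could look like, not the crate's actual API.

```rust
/// Index value that zeroes the output lane (name and value taken from the proposal).
pub const SHUFFLE1_DYN_ZERO: usize = 0x80;

/// Hypothetical lane count for this sketch (a 16 x u8 vector).
const LANES: usize = 16;

/// Scalar stand-in for the proposed `shuffle1_dyn_unchecked`.
///
/// # Safety
/// Every index must be either `< LANES` or equal to `SHUFFLE1_DYN_ZERO`;
/// anything else is undefined behavior.
pub unsafe fn shuffle1_dyn_unchecked(v: [u8; LANES], idx: [u8; LANES]) -> [u8; LANES] {
    let mut out = [0u8; LANES];
    for (o, &i) in out.iter_mut().zip(idx.iter()) {
        let i = i as usize;
        if i != SHUFFLE1_DYN_ZERO {
            // SAFETY: the caller guarantees `i < LANES` in this branch.
            *o = unsafe { *v.get_unchecked(i) };
        }
        // For `SHUFFLE1_DYN_ZERO` the lane keeps its initial value of 0.
    }
    out
}

/// Scalar stand-in for the checked `shuffle1_dyn`: validates the indices,
/// panics on an invalid one, then forwards to the unchecked version.
pub fn shuffle1_dyn(v: [u8; LANES], idx: [u8; LANES]) -> [u8; LANES] {
    for &i in idx.iter() {
        let i = i as usize;
        assert!(
            i < LANES || i == SHUFFLE1_DYN_ZERO,
            "shuffle1_dyn: index {} is out of bounds",
            i
        );
    }
    // SAFETY: every index was validated above.
    unsafe { shuffle1_dyn_unchecked(v, idx) }
}
```

With this split, safe callers pay for one validation pass, while callers that can prove their indices are valid can skip it through the `unsafe` function.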
See WebAssembly/simd#68, WebAssembly/simd#24, WebAssembly/simd#71
x86_64: If bit 7 is 1, set to 0, otherwise use lower 4 bits for lookup
ARM: if index is out of range (0-15), set to 0
PPC64: use lower 4 bits for lookup (no out of range handling)
MIPS: if bit 6 or bit 7 is 1, set to 0; otherwise use lower 4 bits for lookup (or rather use lower 5 bits for lookup into a table that has 32 elements constructed from 2 input vectors, but if both vectors are the same then it effectively means bits 4,5 are ignored)
RISCV: if index is out of range (0-15), set to 0.
[...] if the specification says that all out-of-range indices return 0, then ARM and RISCV will use the native instruction, x86_64/PPC64/MIPS will have to generate additional checks using a 3-instruction sequence like "compare with 15; shuffle; andnot".
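To make the per-architecture differences concrete, here is a scalar, single-lane model of each behavior exactly as described in the list above, plus the "compare with 15; shuffle; andnot" fix-up. The function names are hypothetical, and the models follow the wording quoted here rather than the ISA manuals, so treat this as a sketch.

```rust
const LANES: usize = 16;

/// x86_64 as described: bit 7 set gives 0, otherwise look up with the low 4 bits.
fn lane_x86_64(v: &[u8; LANES], i: u8) -> u8 {
    if i & 0x80 != 0 { 0 } else { v[(i & 0x0F) as usize] }
}

/// ARM and RISCV as described: any index outside 0-15 gives 0.
fn lane_arm_riscv(v: &[u8; LANES], i: u8) -> u8 {
    if (i as usize) < LANES { v[i as usize] } else { 0 }
}

/// PPC64 as described: always look up with the low 4 bits.
fn lane_ppc64(v: &[u8; LANES], i: u8) -> u8 {
    v[(i & 0x0F) as usize]
}

/// MIPS as described, with the same vector passed for both inputs:
/// bit 6 or bit 7 set gives 0, otherwise bits 4 and 5 are effectively ignored.
fn lane_mips(v: &[u8; LANES], i: u8) -> u8 {
    if i & 0xC0 != 0 { 0 } else { v[(i & 0x0F) as usize] }
}

/// The three-step fix-up that forces every out-of-range index to 0
/// on top of any of the native behaviors above.
fn lane_zero_out_of_range(native: fn(&[u8; LANES], u8) -> u8, v: &[u8; LANES], i: u8) -> u8 {
    let oob_mask: u8 = if i > 15 { 0xFF } else { 0x00 }; // compare with 15
    let shuffled = native(v, i);                         // shuffle
    shuffled & !oob_mask                                 // andnot
}
```

All four models agree for indices 0-15 and diverge only outside that range, which is why the fix-up is needed on x86_64, PPC64, and MIPS but not on ARM or RISCV.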
We could say that if the index is 0x80 we return 0, if the index is 0-15 we return the corresponding lane, and otherwise the resulting byte is unspecified. This seems like the cleanest way to unify x86_64 behavior with the rest, resulting in a zero-cost implementation for x86_64, ARM, MIPS, RISCV, and a 3-instruction sequence (compare to 0x80, shuffle, andnot) on PPC64.
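The relaxed contract in this last suggestion can be written down as a small checker. The sketch below is a scalar model with hypothetical names (`relaxed_spec`, `ppc64_native`, `ppc64_fixed`, `satisfies_relaxed_spec`); the PPC64 behavior is modeled from the description above, not from the ISA manual.

```rust
const LANES: usize = 16;
/// Same value as the proposed `SHUFFLE1_DYN_ZERO` constant.
const ZERO_INDEX: u8 = 0x80;

/// Relaxed contract: `Some(expected)` where the result is specified,
/// `None` where the resulting byte is left unspecified.
fn relaxed_spec(v: &[u8; LANES], i: u8) -> Option<u8> {
    if i == ZERO_INDEX {
        Some(0)
    } else if (i as usize) < LANES {
        Some(v[i as usize])
    } else {
        None
    }
}

/// PPC64 as described: always look up with the low 4 bits.
fn ppc64_native(v: &[u8; LANES], i: u8) -> u8 {
    v[(i & 0x0F) as usize]
}

/// PPC64 with the fix-up: compare to 0x80, shuffle, andnot.
fn ppc64_fixed(v: &[u8; LANES], i: u8) -> u8 {
    let zero_mask: u8 = if i == ZERO_INDEX { 0xFF } else { 0x00 }; // compare to 0x80
    let shuffled = ppc64_native(v, i);                             // shuffle
    shuffled & !zero_mask                                          // andnot
}

/// Checks a single-lane implementation against the relaxed contract
/// for every possible index byte.
fn satisfies_relaxed_spec(lane: fn(&[u8; LANES], u8) -> u8, v: &[u8; LANES]) -> bool {
    (0..=255u8).all(|i| match relaxed_spec(v, i) {
        Some(expected) => lane(v, i) == expected,
        None => true, // unspecified: any byte is acceptable
    })
}
```

In this model the unmodified PPC64 lookup fails the check only at index 0x80 (it returns lane 0 instead of 0), which is exactly the case the extra compare and andnot handle.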