This assembly code implements optimized versions of the `memcmp`, `memset`, and `memcpy` functions for the ARMv7-M architecture with VFPv2 floating-point support. It targets Cortex-M4 and Cortex-M7 MCUs. The functions compare, fill, and copy memory, respectively. Using VFP loads and stores on the 64-bit `d` registers moves more data per instruction than the core registers alone, which is where the performance gain on larger blocks comes from.
- `memcmp_vfp`:
  - Purpose: Compares two blocks of memory.
  - Registers:
    - `r0`: Pointer to the first memory block.
    - `r1`: Pointer to the second memory block.
    - `r2`: Number of bytes to compare.
  - Operation:
    - If `r2` (size) is zero, returns 0, indicating the blocks are equal.
    - Compares memory in 4-byte chunks using `ldr` and `subs` instructions.
    - Falls back to comparing 1-byte chunks if fewer than 4 bytes remain.
    - If any mismatch is found, returns -1; otherwise, returns 0.
- `memset_vfp`:
  - Purpose: Fills a block of memory with a specified value.
  - Registers:
    - `r0`: Pointer to the memory block.
    - `r1`: Value to be set.
    - `r2`: Number of bytes to set.
  - Operation:
    - If `r2` (size) is zero, terminates the function.
    - Prepares the `d0` register with the value to be set using `bfi` and `vmov` instructions.
    - Sets memory in 32-byte chunks using `vstr` instructions.
    - Falls back to setting 8-byte, 4-byte, and 1-byte chunks as the size decreases.
    - Returns to the caller using `bx lr`.
- `memcpy_vfp`:
  - Purpose: Copies a block of memory from a source to a destination.
  - Registers:
    - `r0`: Pointer to the destination memory block.
    - `r1`: Pointer to the source memory block.
    - `r2`: Number of bytes to copy.
  - Operation:
    - If `r2` (size) is zero, terminates the function.
    - Copies memory in 128-byte chunks using `vldr` and `vstr` instructions.
    - Falls back to copying 32-byte, 8-byte, 4-byte, and 1-byte chunks as the size decreases.
    - Returns the address of the destination buffer using `mov` and `bx lr`.
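
The `memcmp_vfp` behavior summarized above can be modeled in portable C to make its return contract concrete. This is only a sketch based on the description, not a translation of the assembly: the name `memcmp_vfp_model` is hypothetical, and the word loads go through `memcpy` because plain C cannot assume the unaligned access that `ldr` may perform on the target.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical C model of memcmp_vfp as described in the text:
 * returns 0 when the blocks are equal and -1 on any mismatch. */
int memcmp_vfp_model(const void *a, const void *b, size_t n)
{
    const unsigned char *p = a;
    const unsigned char *q = b;

    /* Word loop: compare 4 bytes at a time while at least 4 remain. */
    while (n >= 4) {
        uint32_t x, y;
        memcpy(&x, p, sizeof x);   /* alignment-safe 32-bit load */
        memcpy(&y, q, sizeof y);
        if (x != y)
            return -1;
        p += 4;
        q += 4;
        n -= 4;
    }

    /* Byte loop: compare the remaining 0..3 bytes. */
    while (n > 0) {
        if (*p != *q)
            return -1;
        p++;
        q++;
        n--;
    }

    return 0;   /* also reached immediately when n == 0 */
}
```

Note that, as described, the routine reports only equal or not equal. Standard C `memcmp` returns a negative or positive value according to the first differing byte, so code that relies on ordering (for example, a sort comparator) would presumably not be able to use `memcmp_vfp` as a drop-in replacement.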
- `memcmp_vfp`:
  - Start: Checks if `r2` is zero and jumps to `_cmp_good` if true. Otherwise, compares memory in 4-byte chunks.
  - 4-Byte Comparison Loop (`_cmp4`): Loads and compares 4 bytes from each memory block. If they are not equal, jumps to `_cmp_err`. Otherwise, decrements `r2` and repeats until fewer than 4 bytes remain.
  - 1-Byte Comparison Loop (`_cmp1`): Loads and compares 1 byte from each memory block. If they are not equal, jumps to `_cmp_err`. Otherwise, decrements `r2` and repeats until no bytes remain.
  - End: If all comparisons succeed, moves 0 into `r0` and returns. If any comparison fails, moves -1 into `r0` and returns.
- `memset_vfp`:
  - Start: Checks if `r2` is zero and jumps to `stop` if true. Prepares the `d0` register with the value to be set.
  - 32-Byte Setting Loop (`set32`): Sets 32 bytes of memory at a time using `vstr` instructions. Adjusts `r0` and `r2` accordingly and repeats until fewer than 32 bytes remain.
  - 8-Byte Setting Loop (`set8`): Sets 8 bytes of memory at a time using `vstr` instructions. Adjusts `r0` and `r2` accordingly and repeats until fewer than 8 bytes remain.
  - 4-Byte Setting Loop (`set4`): Sets 4 bytes of memory at a time using `str` instructions. Adjusts `r0` and `r2` accordingly and repeats until fewer than 4 bytes remain.
  - 1-Byte Setting Loop (`set1`): Sets 1 byte of memory at a time using `strb` instructions. Adjusts `r0` and `r2` accordingly and repeats until no bytes remain.
  - End: Restores the original values of `d0` and `r1` from the stack and returns to the caller using `bx lr`.
- `memcpy_vfp`:
  - Start: Checks if `r2` is zero and jumps to `stop` if true. Pushes the source pointer onto the stack and prepares the `d0` register.
  - 128-Byte Copy Loop (`copy128`): Copies 128 bytes of memory at a time using `vldr` and `vstr` instructions. Adjusts `r0`, `r1`, and `r2` accordingly and repeats until fewer than 128 bytes remain.
  - 32-Byte Copy Loop (`copy32`): Copies 32 bytes of memory at a time using `ldr` and `str` instructions. Adjusts `r0`, `r1`, and `r2` accordingly and repeats until fewer than 32 bytes remain.
  - 8-Byte Copy Loop (`copy8`): Copies 8 bytes of memory at a time using `ldr` and `str` instructions. Adjusts `r0`, `r1`, and `r2` accordingly and repeats until fewer than 8 bytes remain.
  - 4-Byte Copy Loop (`copy4`): Copies 4 bytes of memory at a time using `ldr` and `str` instructions. Adjusts `r0`, `r1`, and `r2` accordingly and repeats until fewer than 4 bytes remain.
  - 1-Byte Copy Loop (`copybytes`): Copies 1 byte of memory at a time using `ldrb` and `strb` instructions. Adjusts `r0`, `r1`, and `r2` accordingly and repeats until no bytes remain.
  - End: Restores the original values of `d0` and `r1` from the stack, moves the address of the destination buffer into `r0`, and returns to the caller using `bx lr`.
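
The step-down structure described above for `memset_vfp` and `memcpy_vfp` (largest chunk first, then progressively smaller chunks for the tail) can be sketched in C as follows. This is a behavioral model under stated assumptions, not the author's assembly: the names `fill_pattern`, `memset_vfp_model`, `copy_chunks`, and `memcpy_vfp_model` are hypothetical, the multiply in `fill_pattern` merely stands in for the `bfi`/`vmov` byte replication, and `memcpy` calls stand in for the individual load and store instructions.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical helper: replicate the low byte of `value` across 64 bits,
 * modelling the pattern the text says is prepared in d0. */
uint64_t fill_pattern(int value)
{
    return (uint64_t)(unsigned char)value * 0x0101010101010101ULL;
}

/* Model of memset_vfp: 32-byte, 8-byte, 4-byte, then 1-byte stages.
 * Returns nothing, since the text only says the routine returns via bx lr. */
void memset_vfp_model(void *dst, int value, size_t n)
{
    unsigned char *d = dst;
    uint64_t pat = fill_pattern(value);

    while (n >= 32) {                       /* set32: four 8-byte stores */
        for (int i = 0; i < 4; i++)
            memcpy(d + 8 * i, &pat, 8);
        d += 32;
        n -= 32;
    }
    while (n >= 8) { memcpy(d, &pat, 8); d += 8; n -= 8; }   /* set8 */
    while (n >= 4) { memcpy(d, &pat, 4); d += 4; n -= 4; }   /* set4 */
    while (n > 0)  { *d++ = (unsigned char)value; n--; }     /* set1 */
}

/* Copy blocks of `chunk` bytes while at least `chunk` bytes remain,
 * advancing both pointers and shrinking the remaining count. */
void copy_chunks(unsigned char **d, const unsigned char **s,
                 size_t *n, size_t chunk)
{
    while (*n >= chunk) {
        memcpy(*d, *s, chunk);
        *d += chunk;
        *s += chunk;
        *n -= chunk;
    }
}

/* Model of memcpy_vfp: 128, 32, 8, 4, then 1-byte stages.
 * Returns the original destination pointer, as the text describes. */
void *memcpy_vfp_model(void *dst, const void *src, size_t n)
{
    static const size_t stages[] = {128, 32, 8, 4, 1};
    unsigned char *d = dst;
    const unsigned char *s = src;

    for (size_t i = 0; i < sizeof stages / sizeof stages[0]; i++)
        copy_chunks(&d, &s, &n, stages[i]);

    return dst;
}
```

Because each stage runs only while at least one full chunk remains, a 299-byte copy in this model takes two 128-byte passes, one 32-byte pass, one 8-byte pass, and three single-byte passes; a zero-length call falls through every stage without touching memory.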
