Skip to content

feat: complete VFP support — f32 pseudo-ops and f64 double-precision #47

Merged: avrabe merged 1 commit into main from feat/vfp-complete on Mar 17, 2026

@avrabe avrabe commented Mar 17, 2026

Summary

Completes VFP floating-point support (closes #42). PR #41 added basic f32 arithmetic; this adds the remaining operations.

f32 pseudo-ops (previously rejected with an error, now lowered to multi-instruction sequences)

  • f32.ceil/floor/trunc/nearest → VCVT round-trip sequences
  • f32.min/max → VCMP + VMRS + conditional VMOV
  • f32.copysign → VMOV to GP registers, bit manipulation, VMOV back
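
The copysign lowering above moves both operands into general-purpose registers and works on their raw bit patterns. A minimal sketch of that bit manipulation in plain Rust follows; the function names are hypothetical and are not taken from arm_encoder.rs, and the sketch assumes the usual IEEE 754 binary32 layout with the sign in bit 31.

```rust
/// Hypothetical model of the GP-register step: operate on raw f32 bits.
fn copysign_bits(magnitude: u32, sign: u32) -> u32 {
    const SIGN_MASK: u32 = 0x8000_0000;
    let mag = magnitude & !SIGN_MASK; // BIC: clear the sign bit of the first operand
    let sgn = sign & SIGN_MASK; // AND: keep only the sign bit of the second operand
    mag | sgn // ORR: merge magnitude and sign
}

/// Wrapper mirroring the VMOV in / VMOV out around the integer operations.
fn f32_copysign(a: f32, b: f32) -> f32 {
    f32::from_bits(copysign_bits(a.to_bits(), b.to_bits()))
}
```

Because only bit 31 is touched, the same pattern preserves NaN payloads and signed zeros, which is why the operation can be done entirely with integer instructions.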

f64 double-precision (new)

  • D-register allocation (D0-D15) with cp11 coprocessor encoding
  • Full f64 arithmetic: VADD.F64, VSUB.F64, VMUL.F64, VDIV.F64
  • Unary: VABS.F64, VNEG.F64, VSQRT.F64
  • Pseudo-ops: ceil, floor, trunc, nearest, min, max, copysign
  • Load/store: VLDR.64, VSTR.64
  • Conversions: f64↔i32, f64↔f32 (promote/demote)
  • Constants and reinterpret casts
  • ISA gating: requires has_double_precision_fpu() (M7DP only)
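
To illustrate what "cp11 coprocessor encoding" means for D registers, here is a hedged sketch that encodes A32 `VADD.F64 Dd, Dn, Dm` with condition AL. It assumes the standard A32 VFP data-processing layout (bits [11:8] are 0b1011 for double precision, versus 0b1010 for single precision) and only handles D0-D15, so the D, N and M extension bits stay zero. The function name is hypothetical and is not the helper from arm_encoder.rs.

```rust
/// Encode A32 `VADD.F64 Dd, Dn, Dm` (condition AL) for D0-D15.
/// Returns None for register numbers outside the range this PR supports.
fn encode_vadd_f64_a32(dd: u8, dn: u8, dm: u8) -> Option<u32> {
    if dd > 15 || dn > 15 || dm > 15 {
        return None;
    }
    // cond=1110, opcode bits for VADD, coprocessor field [11:8] = 0b1011 (cp11).
    const BASE: u32 = 0xEE30_0B00;
    Some(BASE | ((dn as u32) << 16) | ((dd as u32) << 12) | (dm as u32))
}
```

For example, `encode_vadd_f64_a32(0, 1, 2)` yields `0xEE310B02`; replacing the `B` nibble with `A` would give the single-precision (cp10) form of the same instruction class.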

Changes

| File | Delta |
| --- | --- |
| arm_encoder.rs | +1530/-146 (D-reg helpers, f32 pseudo-ops, all f64 encoding) |
| instruction_selector.rs | +272/-15 (D-reg alloc, f32 pseudo-ops, f64 selection) |
| f32_operations_test.rs | Updated 4 tests |
| f32_vfp_encoding_test.rs | Updated 2 tests |

Test plan

  • 663 tests pass (+31 new, up from 632)
  • cargo clippy clean
  • cargo fmt --check clean
  • CI passes

🤖 Generated with Claude Code

Implement remaining f32 pseudo-operations (ceil, floor, trunc, nearest,
min, max, copysign) and full f64 double-precision VFP encoding for both
ARM32 and Thumb-2 modes.

f32 pseudo-ops:
- F32Ceil/Floor/Trunc/Nearest via VCVT.S32.F32 + VCVT.F32.S32 sequence
- F32Min/Max via VMOV + VCMP + VMRS + conditional VMOV
- F32Copysign via VMOV to GP regs, AND/BIC/ORR, VMOV back

f64 double-precision (D0-D15):
- Arithmetic: VADD.F64, VSUB.F64, VMUL.F64, VDIV.F64
- Unary: VABS.F64, VNEG.F64, VSQRT.F64
- Pseudo-ops: ceil, floor, trunc, nearest, min, max, copysign
- Comparisons: VCMP.F64 + VMRS + conditional MOV sequence
- Load/Store: VLDR.64, VSTR.64 (cp11 encoding)
- Constants: MOVW+MOVT (lo32) + MOVW+MOVT (hi32) + VMOV Dd, Rlo, Rhi
- Conversions: VCVT.F64.S32/U32, VCVT.F64.F32, VCVT.S32/U32.F64
- Reinterpret: VMOV Dd, Rlo, Rhi / VMOV Rlo, Rhi, Dm
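
The constant-materialisation sequence above needs four 16-bit immediates: one MOVW/MOVT pair for the low word and one for the high word, before `VMOV Dd, Rlo, Rhi` combines them. A minimal sketch of that split is shown below; the struct and function names are hypothetical, chosen only for illustration.

```rust
/// The four 16-bit immediates used to build an f64 in two GP registers.
#[derive(Debug, PartialEq, Eq)]
struct F64ConstParts {
    lo_movw: u16, // bits [15:0]  of the value, MOVW into Rlo
    lo_movt: u16, // bits [31:16] of the value, MOVT into Rlo
    hi_movw: u16, // bits [47:32] of the value, MOVW into Rhi
    hi_movt: u16, // bits [63:48] of the value, MOVT into Rhi
}

fn split_f64_const(value: f64) -> F64ConstParts {
    let bits = value.to_bits();
    let lo = bits as u32; // low 32 bits (truncating cast)
    let hi = (bits >> 32) as u32; // high 32 bits
    F64ConstParts {
        lo_movw: lo as u16,
        lo_movt: (lo >> 16) as u16,
        hi_movw: hi as u16,
        hi_movt: (hi >> 16) as u16,
    }
}
```

For `1.0` (bit pattern `0x3FF0_0000_0000_0000`) only `hi_movt` is non-zero, so an encoder could in principle skip redundant moves; whether this PR does so is not stated.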

Instruction selector:
- D-register allocator (D0-D15, wrapping)
- has_double_precision() gate for f64 instruction selection
- f32 pseudo-ops now generate VFP sequences instead of errors
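
As a rough illustration of the allocator and gate described above, the following sketch hands out D0-D15 in order, wraps back to D0, and refuses to allocate when the target lacks a double-precision FPU. All type and method names here are assumptions for the example, not the ones in instruction_selector.rs.

```rust
#[derive(Debug, PartialEq, Eq)]
enum SelectError {
    /// Target has no double-precision FPU, so f64 ops cannot be selected.
    NoDoublePrecisionFpu,
}

struct DRegAllocator {
    next: u8,
    has_double_precision: bool,
}

impl DRegAllocator {
    fn new(has_double_precision: bool) -> Self {
        Self { next: 0, has_double_precision }
    }

    /// Return the next D-register index (0..=15), wrapping after D15.
    fn alloc(&mut self) -> Result<u8, SelectError> {
        if !self.has_double_precision {
            return Err(SelectError::NoDoublePrecisionFpu);
        }
        let reg = self.next;
        self.next = (self.next + 1) % 16;
        Ok(reg)
    }
}
```

A wrapping allocator reuses D0 after D15 without checking liveness, so it is only safe when no more than 16 double-precision values are live at once.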

663 tests pass (31 new), clippy clean, fmt clean.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
@avrabe avrabe merged commit 5bd170d into main Mar 17, 2026
5 checks passed
@avrabe avrabe deleted the feat/vfp-complete branch March 17, 2026 18:28
avrabe added a commit that referenced this pull request Mar 17, 2026
PR #47 implemented f32 rounding pseudo-ops but had two bugs:
1. The rounding mode values for ceil (0b01) and floor (0b10) were swapped.
2. All rounding modes used the round-toward-zero form of the conversion
   (bit[7]=1, which ARM's assembler syntax names VCVT.S32.F32), so the
   mode parameter was ignored entirely.

Fix by setting the FPSCR rounding mode bits [23:22] for
ceil/floor/nearest, converting with the bit[7]=0 form (VCVTR.S32.F32
in ARM's assembler syntax), which honours the FPSCR rounding mode,
then restoring FPSCR afterward. Trunc continues to use the
round-toward-zero form (bit[7]=1).
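
The FPSCR update described in this fix amounts to replacing a two-bit field at bits [23:22] while leaving every other bit alone. A small sketch of that computation follows. The ceil (0b01) and floor (0b10) values are the ones given in the commit message; nearest (0b00) and toward-zero (0b11) are the remaining architectural RMode encodings, added here for completeness. The names are hypothetical.

```rust
#[derive(Clone, Copy, Debug, PartialEq, Eq)]
enum RoundingMode {
    Nearest = 0b00,
    Ceil = 0b01, // round toward +infinity
    Floor = 0b10, // round toward -infinity
    TowardZero = 0b11,
}

const RMODE_SHIFT: u32 = 22;
const RMODE_MASK: u32 = 0b11 << RMODE_SHIFT;

/// Return `fpscr` with only the rounding-mode field [23:22] replaced.
fn with_rounding_mode(fpscr: u32, mode: RoundingMode) -> u32 {
    (fpscr & !RMODE_MASK) | ((mode as u32) << RMODE_SHIFT)
}
```

In an emitted sequence this would sit between a read of FPSCR and a write of the modified value, with the saved original written back after the conversion.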

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Development

Successfully merging this pull request may close these issues.

VFP floating-point: remaining f32 pseudo-ops and f64 support
