Commit 08927ba

Cranelift: add a "patchable call" ABI. (#12061)
This ABI is intended for use in scenarios where we want a very lightweight callsite that can be turned on and off by patching in one instruction. (The actual patchable call instruction is not in this PR; that will be a separate PR.)

The idea is that we define a call to clobber *no* registers -- not even the arguments! And we restrict signatures such that on all of our supported architectures, all arguments go into registers only. Those two requirements together mean that all callsites for this ABI should have only a raw call instruction, with no loads/stores to stackslots; and have the minimum possible impact on regalloc, by only imposing constraints on args to ensure they are in certain registers but not altering those registers.

Given this, we could implement, e.g., breakpoints with patchable callsites (off by default) at every sequence point in compiled code. In a typical use-case with Wasmtime-compiled Wasm, that would put a bunch of uses of vmctx constrained to the first argument register in every code path, but vmctx likely already sits there most of the time anyway (for any call to other Wasm functions or for libcalls). Thus, the impact is just the one instruction and nothing else.

This PR adds the calling convention itself and tests that show that *two* consecutive callsites can be compiled with no register setup re-occurring from one call to the next (thus demonstrating no clobbers).
1 parent bd31846 commit 08927ba
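
The "no register setup re-occurring" claim in the commit message can be illustrated with a toy model. The sketch below is not Cranelift code: `RegSet` and `setup_moves` are hypothetical names, registers are plain indices below 64, and "setup" is modelled as one move per argument register that does not already hold its value.

```rust
/// Hypothetical stand-in for a physical-register set: one bit per register
/// index (indices must be below 64 in this toy model).
#[derive(Clone, Copy, Debug, PartialEq, Eq)]
pub struct RegSet(u64);

impl RegSet {
    pub fn empty() -> Self {
        RegSet(0)
    }
    pub fn with(self, reg: u8) -> Self {
        RegSet(self.0 | (1u64 << u32::from(reg)))
    }
    pub fn contains(self, reg: u8) -> bool {
        self.0 & (1u64 << u32::from(reg)) != 0
    }
}

/// Counts the argument-setup moves needed for `num_calls` consecutive calls
/// that all pass the same values in `arg_regs`, given the callee's clobbers.
pub fn setup_moves(arg_regs: &[u8], clobbers: RegSet, num_calls: usize) -> usize {
    // Registers that currently hold the right argument value.
    let mut live = RegSet::empty();
    let mut moves = 0;
    for _ in 0..num_calls {
        for &r in arg_regs {
            if !live.contains(r) {
                moves += 1;
                live = live.with(r);
            }
        }
        // After the call, clobbered registers no longer hold their values.
        live = RegSet(live.0 & !clobbers.0);
    }
    moves
}
```

With an empty clobber set the second call reuses the registers set up for the first, so the move count does not grow with the number of callsites; with a convention that clobbers the argument registers, every call pays for setup again.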

15 files changed, +1881 -33 lines changed
cranelift/codegen/src/isa/aarch64/abi.rs

Lines changed: 7 additions & 1 deletion
@@ -1102,6 +1102,7 @@ impl ABIMachineSpec for AArch64MachineDeps {
             (isa::CallConv::Winch, true) => ALL_CLOBBERS,
             (isa::CallConv::Winch, false) => WINCH_CLOBBERS,
             (isa::CallConv::SystemV, _) => DEFAULT_AAPCS_CLOBBERS,
+            (isa::CallConv::Patchable, _) => NO_CLOBBERS,
             (_, false) => DEFAULT_AAPCS_CLOBBERS,
             (_, true) => panic!("unimplemented clobbers for exn abi of {call_conv:?}"),
         }
@@ -1265,11 +1266,15 @@ impl AArch64MachineDeps {
     /// Is the given register saved in the prologue if clobbered, i.e., is it a
     /// callee-save?
     fn is_reg_saved_in_prologue(
-        _call_conv: isa::CallConv,
+        call_conv: isa::CallConv,
         enable_pinned_reg: bool,
         sig: &Signature,
         r: RealReg,
     ) -> bool {
+        if call_conv == isa::CallConv::Patchable {
+            return true;
+        }
+
         // FIXME: We need to inspect whether a function is returning Z or P regs too.
         let save_z_regs = sig
             .params
@@ -1518,6 +1523,7 @@ const fn all_clobbers() -> PRegSet {
 const DEFAULT_AAPCS_CLOBBERS: PRegSet = default_aapcs_clobbers();
 const WINCH_CLOBBERS: PRegSet = winch_clobbers();
 const ALL_CLOBBERS: PRegSet = all_clobbers();
+const NO_CLOBBERS: PRegSet = PRegSet::empty();

 fn create_reg_env(enable_pinned_reg: bool) -> MachineEnv {
     fn preg(r: Reg) -> PReg {

cranelift/codegen/src/isa/call_conv.rs

Lines changed: 22 additions & 0 deletions
@@ -45,6 +45,26 @@ pub enum CallConv {
     /// defines no callee-save registers, and restricts the number of return
     /// registers to one integer, and one floating point.
     Winch,
+    /// Calling convention for patchable-call instructions.
+    ///
+    /// This is designed for a very specific need: we want a *single*
+    /// call instruction at our callsite, with no other setup, and we
+    /// don't want any registers clobbered. This allows patchable
+    /// callsites to be as unobtrusive as possible.
+    ///
+    /// The ABI is based on the native register-argument ABI on each
+    /// respective platform, but puts severe restrictions on allowable
+    /// signatures: only up to four arguments of integer type, and no
+    /// return values. It does not support tail-calls, and disallows
+    /// any extension modes on arguments.
+    ///
+    /// The ABI specifies that *no* registers, not even argument
+    /// registers, are clobbered. This is pretty unique: it means that
+    /// the call instruction will constrain regalloc to have any args
+    /// in the right registers, but those registers will be preserved,
+    /// so multiple patchable callsites can reuse those values. This
+    /// further reduces the cost of the callsites.
+    Patchable,
 }

 impl CallConv {
@@ -123,6 +143,7 @@ impl fmt::Display for CallConv {
             Self::AppleAarch64 => "apple_aarch64",
             Self::Probestack => "probestack",
             Self::Winch => "winch",
+            Self::Patchable => "patchable",
         })
     }
 }
@@ -139,6 +160,7 @@ impl str::FromStr for CallConv {
             "apple_aarch64" => Ok(Self::AppleAarch64),
             "probestack" => Ok(Self::Probestack),
             "winch" => Ok(Self::Winch),
+            "patchable" => Ok(Self::Patchable),
             _ => Err(()),
         }
     }
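
The signature restrictions in the `Patchable` doc comment (at most four integer arguments, no return values, no extension modes) can be written as a small check. This is a minimal sketch under stated assumptions, not the validation the PR performs: `ArgType`, `Extension`, `Param`, `Sig`, `SigError`, and `check_patchable_sig` are hypothetical names, and treating `I32` and `I64` as the allowed integer types is an assumption.

```rust
/// Hypothetical argument types; which integer widths the real ABI accepts
/// is an assumption here.
#[derive(Clone, Copy, Debug, PartialEq, Eq)]
pub enum ArgType {
    I32,
    I64,
    F32,
    F64,
}

/// Hypothetical model of an argument extension mode.
#[derive(Clone, Copy, Debug, PartialEq, Eq)]
pub enum Extension {
    None,
    Uext,
    Sext,
}

pub struct Param {
    pub ty: ArgType,
    pub ext: Extension,
}

pub struct Sig {
    pub params: Vec<Param>,
    pub num_returns: usize,
}

#[derive(Debug, PartialEq, Eq)]
pub enum SigError {
    TooManyArgs,
    NonIntegerArg,
    HasExtension,
    HasReturns,
}

/// Checks a signature against the restrictions described in the doc comment.
pub fn check_patchable_sig(sig: &Sig) -> Result<(), SigError> {
    if sig.num_returns != 0 {
        return Err(SigError::HasReturns);
    }
    if sig.params.len() > 4 {
        return Err(SigError::TooManyArgs);
    }
    for p in &sig.params {
        if !matches!(p.ty, ArgType::I32 | ArgType::I64) {
            return Err(SigError::NonIntegerArg);
        }
        if p.ext != Extension::None {
            return Err(SigError::HasExtension);
        }
    }
    Ok(())
}
```

Keeping every argument in an integer register is what lets a callsite consist of the call instruction alone, with no stack stores.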

cranelift/codegen/src/isa/pulley_shared/abi.rs

Lines changed: 19 additions & 10 deletions
@@ -480,18 +480,20 @@ where
     }

     fn get_regs_clobbered_by_call(
-        _call_conv_of_callee: isa::CallConv,
+        call_conv_of_callee: isa::CallConv,
         is_exception: bool,
     ) -> PRegSet {
-        if is_exception {
+        if call_conv_of_callee == isa::CallConv::Patchable {
+            NO_CLOBBERS
+        } else if is_exception {
             ALL_CLOBBERS
         } else {
             DEFAULT_CLOBBERS
         }
     }

     fn compute_frame_layout(
-        _call_conv: isa::CallConv,
+        call_conv: isa::CallConv,
         flags: &settings::Flags,
         _sig: &Signature,
         regs: &[Writable<RealReg>],
@@ -502,11 +504,12 @@ where
         fixed_frame_storage_size: u32,
         outgoing_args_size: u32,
     ) -> FrameLayout {
-        let mut regs: Vec<Writable<RealReg>> = regs
-            .iter()
-            .cloned()
-            .filter(|r| DEFAULT_CALLEE_SAVES.contains(r.to_reg().into()))
-            .collect();
+        let is_callee_save = |reg: &Writable<RealReg>| match call_conv {
+            isa::CallConv::Patchable => true,
+            _ => DEFAULT_CALLEE_SAVES.contains(reg.to_reg().into()),
+        };
+        let mut regs: Vec<Writable<RealReg>> =
+            regs.iter().cloned().filter(is_callee_save).collect();

         regs.sort_unstable();

@@ -720,7 +723,7 @@ impl FrameLayout {
                     I64
                 }
                 RegClass::Float => F64,
-                RegClass::Vector => unreachable!("no vector registers are callee-save"),
+                RegClass::Vector => I8X16,
             };
             let offset = i32::try_from(offset).unwrap();
             Some((offset, ty, Reg::from(reg.to_reg())))
@@ -759,7 +762,11 @@ fn compute_clobber_size(clobbers: &[Writable<RealReg>]) -> u32 {
             RegClass::Float => {
                 clobbered_size += 8;
             }
-            RegClass::Vector => unimplemented!("Vector Size Clobbered"),
+            RegClass::Vector => {
+                // No alignment concerns: the Pulley virtual CPU
+                // supports unaligned vector load/stores.
+                clobbered_size += 16;
+            }
         }
     }
     align_to(clobbered_size, 16)
@@ -948,6 +955,8 @@ const ALL_CLOBBERS: PRegSet = PRegSet::empty()
     .with(pv_reg(30))
     .with(pv_reg(31));

+const NO_CLOBBERS: PRegSet = PRegSet::empty();
+
 fn create_reg_environment() -> MachineEnv {
     // Prefer caller-saved registers over callee-saved registers, because that
     // way we don't need to emit code to save and restore them if we don't
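
The callee-side consequence of "no clobbers" is visible in `compute_frame_layout`: a function compiled with the patchable convention must save every register it writes, not just the default callee-saves. A minimal sketch of that filter, with registers modelled as plain `u8` indices and hypothetical names `Conv` and `regs_to_save` (the real code operates on `Writable<RealReg>` and `PRegSet`):

```rust
/// Hypothetical two-variant stand-in for `isa::CallConv`.
#[derive(Clone, Copy, Debug, PartialEq, Eq)]
pub enum Conv {
    Default,
    Patchable,
}

/// Returns the sorted list of registers the prologue must save, given the
/// registers the function body writes (`clobbered`).
pub fn regs_to_save(conv: Conv, clobbered: &[u8], default_callee_saves: &[u8]) -> Vec<u8> {
    let is_callee_save = |r: &u8| match conv {
        // Every register is callee-saved under the patchable convention.
        Conv::Patchable => true,
        Conv::Default => default_callee_saves.contains(r),
    };
    let mut regs: Vec<u8> = clobbered.iter().copied().filter(is_callee_save).collect();
    regs.sort_unstable();
    regs
}
```

Because the save set can now include vector registers, the hunks that previously hit `unreachable!` or `unimplemented!` for `RegClass::Vector` have to assign those registers a type and a size.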

cranelift/codegen/src/isa/riscv64/abi.rs

Lines changed: 26 additions & 16 deletions
@@ -489,16 +489,17 @@ impl ABIMachineSpec for Riscv64MachineDeps {
         if stack_size > 0 {
             insts.extend(Self::gen_sp_reg_adjust(-(stack_size as i32)));

-            let mut cur_offset = 8;
+            let mut cur_offset = 0;
             for reg in &frame_layout.clobbered_callee_saves {
                 let r_reg = reg.to_reg();
                 let ty = match r_reg.class() {
                     RegClass::Int => I64,
                     RegClass::Float => F64,
-                    RegClass::Vector => unimplemented!("Vector Clobber Saves"),
+                    RegClass::Vector => I8X16,
                 };
+                cur_offset = align_to(cur_offset, ty.bytes());
                 insts.push(Inst::gen_store(
-                    AMode::SPOffset((stack_size - cur_offset) as i64),
+                    AMode::SPOffset(i64::from(stack_size - cur_offset - ty.bytes())),
                     Reg::from(reg.to_reg()),
                     ty,
                     MemFlags::trusted(),
@@ -507,13 +508,14 @@ impl ABIMachineSpec for Riscv64MachineDeps {
                 if flags.unwind_info() {
                     insts.push(Inst::Unwind {
                         inst: UnwindInst::SaveReg {
-                            clobber_offset: frame_layout.clobber_size - cur_offset,
+                            clobber_offset: frame_layout.clobber_size - cur_offset - ty.bytes(),
                             reg: r_reg,
                         },
                     });
                 }

-                cur_offset += 8
+                cur_offset += ty.bytes();
+                assert!(cur_offset <= stack_size);
             }
         }
         insts
@@ -529,22 +531,23 @@ impl ABIMachineSpec for Riscv64MachineDeps {
         let stack_size = frame_layout.clobber_size
             + frame_layout.fixed_frame_storage_size
             + frame_layout.outgoing_args_size;
+        let mut cur_offset = 0;

-        let mut cur_offset = 8;
         for reg in &frame_layout.clobbered_callee_saves {
             let rreg = reg.to_reg();
             let ty = match rreg.class() {
                 RegClass::Int => I64,
                 RegClass::Float => F64,
-                RegClass::Vector => unimplemented!("Vector Clobber Restores"),
+                RegClass::Vector => I8X16,
             };
+            cur_offset = align_to(cur_offset, ty.bytes());
             insts.push(Inst::gen_load(
                 reg.map(Reg::from),
-                AMode::SPOffset(i64::from(stack_size - cur_offset)),
+                AMode::SPOffset(i64::from(stack_size - cur_offset - ty.bytes())),
                 ty,
                 MemFlags::trusted(),
             ));
-            cur_offset += 8
+            cur_offset += ty.bytes();
         }

         if stack_size > 0 {
@@ -619,12 +622,13 @@ impl ABIMachineSpec for Riscv64MachineDeps {
     ) -> PRegSet {
         match call_conv_of_callee {
             isa::CallConv::Tail if is_exception => ALL_CLOBBERS,
+            isa::CallConv::Patchable => NO_CLOBBERS,
             _ => DEFAULT_CLOBBERS,
         }
     }

     fn compute_frame_layout(
-        _call_conv: isa::CallConv,
+        call_conv: isa::CallConv,
         flags: &settings::Flags,
         _sig: &Signature,
         regs: &[Writable<RealReg>],
@@ -635,11 +639,12 @@ impl ABIMachineSpec for Riscv64MachineDeps {
         fixed_frame_storage_size: u32,
         outgoing_args_size: u32,
     ) -> FrameLayout {
-        let mut regs: Vec<Writable<RealReg>> = regs
-            .iter()
-            .cloned()
-            .filter(|r| DEFAULT_CALLEE_SAVES.contains(r.to_reg().into()))
-            .collect();
+        let is_callee_saved = |reg: &Writable<RealReg>| match call_conv {
+            isa::CallConv::Patchable => true,
+            _ => DEFAULT_CALLEE_SAVES.contains(reg.to_reg().into()),
+        };
+        let mut regs: Vec<Writable<RealReg>> =
+            regs.iter().cloned().filter(is_callee_saved).collect();

         regs.sort_unstable();

@@ -760,7 +765,10 @@ fn compute_clobber_size(clobbers: &[Writable<RealReg>]) -> u32 {
             RegClass::Float => {
                 clobbered_size += 8;
             }
-            RegClass::Vector => unimplemented!("Vector Size Clobbered"),
+            RegClass::Vector => {
+                clobbered_size = align_to(clobbered_size, 16);
+                clobbered_size += 16;
+            }
         }
     }
     align_to(clobbered_size, 16)
@@ -937,6 +945,8 @@ const ALL_CLOBBERS: PRegSet = PRegSet::empty()
     .with(pv_reg(30))
     .with(pv_reg(31));

+const NO_CLOBBERS: PRegSet = PRegSet::empty();
+
 fn create_reg_environment() -> MachineEnv {
     // Some C Extension instructions can only use a subset of the registers.
     // x8 - x15, f8 - f15, v8 - v15 so we should prefer to use those since
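
The riscv64 save/restore loops now walk the clobber area with a running `cur_offset` that starts at 0, is aligned to each register's size, and places the slot at `stack_size - cur_offset - size`. The arithmetic can be checked in isolation with the sketch below; `Class`, `bytes`, `clobber_size`, and `save_offsets` are hypothetical names, `align_to` is assumed to round up to a power-of-two alignment, and `stack_size` is taken to equal the clobber size (no fixed frame storage or outgoing args).

```rust
/// Rounds `x` up to a multiple of `align` (assumed to be a power of two).
pub fn align_to(x: u32, align: u32) -> u32 {
    (x + align - 1) & !(align - 1)
}

/// Hypothetical stand-in for `RegClass`.
#[derive(Clone, Copy, Debug, PartialEq, Eq)]
pub enum Class {
    Int,
    Float,
    Vector,
}

/// Save-slot size in bytes: 8 for I64/F64, 16 for I8X16.
pub fn bytes(c: Class) -> u32 {
    match c {
        Class::Int | Class::Float => 8,
        Class::Vector => 16,
    }
}

/// Mirrors the shape of `compute_clobber_size` in the riscv64 hunk.
pub fn clobber_size(classes: &[Class]) -> u32 {
    let mut size = 0;
    for &c in classes {
        if c == Class::Vector {
            size = align_to(size, 16);
        }
        size += bytes(c);
    }
    align_to(size, 16)
}

/// SP-relative offset of each register's save slot, in iteration order.
pub fn save_offsets(classes: &[Class], stack_size: u32) -> Vec<u32> {
    let mut cur_offset = 0;
    let mut offsets = Vec::new();
    for &c in classes {
        let size = bytes(c);
        cur_offset = align_to(cur_offset, size);
        assert!(cur_offset + size <= stack_size);
        offsets.push(stack_size - cur_offset - size);
        cur_offset += size;
    }
    offsets
}
```

For 8-byte registers only, this yields the same slots as the old scheme that started `cur_offset` at 8 and stepped by 8; the rewrite matters once a 16-byte vector register has to be aligned within the area.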

cranelift/codegen/src/isa/s390x/abi.rs

Lines changed: 5 additions & 0 deletions
@@ -880,6 +880,7 @@ impl ABIMachineSpec for S390xMachineDeps {
         match call_conv_of_callee {
             isa::CallConv::Tail if is_exception => ALL_CLOBBERS,
             isa::CallConv::Tail => TAIL_CLOBBERS,
+            isa::CallConv::Patchable => NO_CLOBBERS,
             _ => SYSV_CLOBBERS,
         }
     }
@@ -1103,6 +1104,7 @@ fn is_reg_saved_in_prologue(call_conv: isa::CallConv, r: RealReg) -> bool {
             // r8 - r14 inclusive are callee-saves.
             r.hw_enc() >= 8 && r.hw_enc() <= 14
         }
+        (isa::CallConv::Patchable, _) => true,
         (_, RegClass::Int) => {
             // r6 - r15 inclusive are callee-saves.
             r.hw_enc() >= 6 && r.hw_enc() <= 15
@@ -1380,8 +1382,11 @@ const fn all_clobbers() -> PRegSet {
         .with(vr_preg(30))
         .with(vr_preg(31))
 }
+
 const ALL_CLOBBERS: PRegSet = all_clobbers();

+const NO_CLOBBERS: PRegSet = PRegSet::empty();
+
 fn sysv_create_machine_env() -> MachineEnv {
     MachineEnv {
         preferred_regs_by_class: [

cranelift/codegen/src/isa/x64/abi.rs

Lines changed: 8 additions & 0 deletions
@@ -891,6 +891,7 @@ impl ABIMachineSpec for X64ABIMachineSpec {
             (isa::CallConv::Winch, _) => ALL_CLOBBERS,
             (isa::CallConv::SystemV, _) => SYSV_CLOBBERS,
             (isa::CallConv::WindowsFastcall, false) => WINDOWS_CLOBBERS,
+            (isa::CallConv::Patchable, _) => NO_CLOBBERS,
             (_, false) => SYSV_CLOBBERS,
             (call_conv, true) => panic!("unimplemented clobbers for exn abi of {call_conv:?}"),
         }
@@ -931,6 +932,8 @@ impl ABIMachineSpec for X64ABIMachineSpec {
                 .cloned()
                 .filter(|r| is_callee_save_fastcall(r.to_reg(), flags.enable_pinned_reg()))
                 .collect(),
+            // The `patchable` calling convention makes every reg a callee-save reg.
+            CallConv::Patchable => regs.iter().cloned().collect(),
             CallConv::Probestack => todo!("probestack?"),
             CallConv::AppleAarch64 => unreachable!(),
         };
@@ -1085,6 +1088,8 @@ fn get_intreg_for_retval(
         },

         CallConv::Winch => is_last.then(|| regs::rax()),
+        // The patchable ABI does not support any return values.
+        CallConv::Patchable => None,
         CallConv::Probestack => todo!(),
         CallConv::AppleAarch64 => unreachable!(),
     }
@@ -1113,6 +1118,8 @@ fn get_fltreg_for_retval(call_conv: CallConv, fltreg_idx: usize, is_last: bool)
             _ => None,
         },
         CallConv::Winch => is_last.then(|| regs::xmm0()),
+        // The patchable ABI does not support any return values.
+        CallConv::Patchable => None,
         CallConv::Probestack => todo!(),
         CallConv::AppleAarch64 => unreachable!(),
     }
@@ -1174,6 +1181,7 @@ fn compute_clobber_size(clobbers: &[Writable<RealReg>]) -> u32 {
 const WINDOWS_CLOBBERS: PRegSet = windows_clobbers();
 const SYSV_CLOBBERS: PRegSet = sysv_clobbers();
 pub(crate) const ALL_CLOBBERS: PRegSet = all_clobbers();
+const NO_CLOBBERS: PRegSet = PRegSet::empty();

 const fn windows_clobbers() -> PRegSet {
     use asm::gpr::enc::*;

cranelift/codegen/src/machinst/abi.rs

Lines changed: 9 additions & 4 deletions
@@ -1222,7 +1222,8 @@ impl<M: ABIMachineSpec> Callee<M> {
                 || call_conv == isa::CallConv::Cold
                 || call_conv == isa::CallConv::WindowsFastcall
                 || call_conv == isa::CallConv::AppleAarch64
-                || call_conv == isa::CallConv::Winch,
+                || call_conv == isa::CallConv::Winch
+                || call_conv == isa::CallConv::Patchable,
             "Unsupported calling convention: {call_conv:?}"
         );

@@ -2490,10 +2491,14 @@ impl<T> CallInfo<T> {
         }

         let temp = M::retval_temp_reg(self.callee_conv);
-        // The temporary must be noted as clobbered.
+        // The temporary must be noted as clobbered unless there are
+        // no returns (hence it isn't needed). The latter can only be
+        // the case statically for an ABI when the ABI doesn't allow
+        // any returns at all (e.g., patchable-call ABI).
         debug_assert!(
-            M::get_regs_clobbered_by_call(self.callee_conv, self.try_call_info.is_some())
-                .contains(PReg::from(temp.to_reg().to_real_reg().unwrap()))
+            self.defs.is_empty()
+                || M::get_regs_clobbered_by_call(self.callee_conv, self.try_call_info.is_some())
+                    .contains(PReg::from(temp.to_reg().to_real_reg().unwrap()))
         );

         for CallRetPair { vreg, location } in &self.defs {
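
The relaxed `debug_assert!` encodes a simple invariant: the return-value temporary only needs to appear in the clobber set when there is at least one return value to move. A minimal sketch of that predicate, with a hypothetical function name and registers modelled as `u8` indices in a slice rather than a `PRegSet`:

```rust
/// Returns true when the invariant checked by the relaxed assertion holds:
/// either the call defines no return values, or the temporary register is
/// listed among the callee's clobbers.
pub fn retval_temp_invariant_holds(num_defs: usize, clobbers: &[u8], temp: u8) -> bool {
    num_defs == 0 || clobbers.contains(&temp)
}
```

Under the patchable convention the clobber set is empty and returns are disallowed, so the first disjunct is the one that keeps the assertion from firing.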
