JIT/AOT to interpreter calls support #116353

janvorli · 2025-06-05T22:33:22Z

We only had support for calling interpreted void returning method with no arguments by the JIT/AOT. This change enables support for passing all possible kinds of arguments and returning all types.

It reuses the CallStubGenerator that was implemented for the interpreter to native code usage and adds support for the other direction to it in mostly trivial manner.
In addition to that, assembler routines for storing argument values to the interpreter stack were added.
The CallStubGenerator generates a list of routines to copy the arguments from CPU registers and stack to the interpreter stack. The last one makes call to the ExecuteInterpretedMethod and then puts the result into appropriate registers.
For functions that return result via a return buffer, the buffer is passed to the ExecuteInterpretedMethod so that the IR opcode to return valuetype stores it directly to the return buffer.

The ARM64 for Apple OSes is the most optimized version, as it is the primary target where the performance matters the most. It eliminates argument registers saving to stack on the fast path when we already have the call stub.

Copilot

Pull Request Overview

This PR introduces comprehensive support for interpreter-to-JIT and JIT-to-interpreter calls, extending argument passing and return value handling to all kinds of method signatures. Key changes include new test methods in the interpreter tests, adjustments in call stub generation and assembly routines, and updates to thread-local and ABI-related constants.

Reviewed Changes

Copilot reviewed 17 out of 17 changed files in this pull request and generated no comments.

Show a summary per file

File	Description
src/tests/JIT/interpreter/Interpreter.cs	Added several new test methods for various calling conventions to exercise interpreter/JIT interop.
src/coreclr/vm/threads.*	Modified thread-local definitions and moved the interpreter thread context member from private to public.
src/coreclr/vm/prestub.cpp	Updated the signature and return value handling of ExecuteInterpretedMethod.
src/coreclr/vm/interpexec.cpp	Added CreateNativeToInterpreterCallStub with adjustments to generate call stubs and ensure thread context.
src/coreclr/vm/callstubgenerator.h	Extended the call stub generator with new routines and a parameter to distinguish call directions.
asmconstants.h, AsmHelpers.asm, and PAL assembly macros	Updated ABI constants and assembly routines to support the new calling conventions.
src/coreclr/interpreter/interpretershared.h	Updated the InterpMethod structure to include the call stub pointer.

Comments suppressed due to low confidence (1)

src/coreclr/vm/interpexec.cpp:66

[nitpick] The AllocMemTracker variable is redeclared here, shadowing the one declared on line 56. To improve clarity and avoid potential confusion, consider reusing the outer 'amTracker' instead of redeclaring it.

        AllocMemTracker amTracker;

dotnet-policy-service · 2025-06-05T22:34:08Z

Tagging subscribers to this area: @BrzVlad, @janvorli, @kg
See info in area-owners.md if you want to be subscribed.

Until now, we only had support for calling interpreted void returning method with no arguments by the JIT/AOT. This change enables support for passing all possible kinds of arguments and returning all types. It reuses the `CallStubGenerator` that was implemented for the interpreter to native code usage and adds support for the other direction to it in mostly trivial manner. In addition to that, assembler routines for storing argument values to the interpreter stack were added. The `CallStubGenerator` generates a list of routines to copy the arguments from CPU registers and stack to the interpreter stack. The last one makes call to the `ExecuteInterpretedMethod` and then puts the result into appropriate registers. For functions that return result via a return buffer, the buffer is passed to the `ExecuteInterpretedMethod` so that the IR opcode to return valuetype stores it directly to the return buffer. The ARM64 for Apple OSes is the most optimized version, as it is the primary target where the performance matters the most. It eliminates argument registers saving to stack on the fast path when we already have the call stub.

src/coreclr/pal/inc/unixasmmacrosamd64.inc

jkotas · 2025-06-06T14:03:28Z

src/coreclr/vm/amd64/AsmHelpers.asm

+        ; Interpreter-TODO: consider not storing the argument registers in the PROLOG_WITH_TRANSITION_BLOCK and
+        ; store / restore them only around this call. We would still need to save RCX and RDX as they can contain
+        ; the return buffer address.
+        call            CreateNativeToInterpreterCallStub


I do not see how the arguments get reported to the GC while CreateNativeToInterpreterCallStub is executing. Is it a GC hole?

Would it be better to compile the transition helper in PreStubWorker? PreStubWorker gets called whenever we are about the execute a method from JIT/AOT code that is not compiled/prepared yet, and it has all the right infrastructure to protect arguments. It would be best to avoid duplicating it.

That's a good point and compiling it in the PreStubWorker would definitely be better.

kg · 2025-06-06T18:00:46Z

src/coreclr/vm/amd64/AsmHelpers.asm

+
+LEAF_ENTRY Store_XMM2_XMM3, _TEXT
+        movsd real8 ptr [r10], xmm2
+        movsd real8 ptr [r10 + 8], xmm3


aren't xmm registers 16 bytes?

I'm guessing what's going on here is there's an array of individual pointers at r10 for the location of the data for each xmm reg but wanted to make sure

In this case we are handling the XMM registers for the case where the 16 byte XMM register is just being used for an 8 byte floating point value. I'm a bit curious if this works well for floats as well as doubles, but the concept is sound. (I could see that perhaps our interpreter stack models all floating point values as doubles, which will introduce floating point differences from what the JIT does in certain cases, but it is pretty close.

More interestingly though I don't see handling for Vector64/Vector128 on Arm64 which should be treated as HFA values. I don't think this all needs to be handled before the PR is merged, but we should have it on our worklist.

The xmm registers are 16 bytes long, but for argument passing purposes, they can only carry either float or double arguments, so we use max 8 bytes out of it. See https://learn.microsoft.com/en-us/cpp/build/x64-calling-convention?view=msvc-170.
For Unix, it is the same case. The calling convention would allow larger data types passed in all 16 bytes, but that's not used in .NET (In C, it can pass arguments of types __float128, _Decimal128 and __m128)

I would love to see one comment in here somewhere to the effect of 'we only use up to the first 8 bytes of the xmm registers due to the x64 calling convention' just so anyone looking at this can go 'oh, that makes sense' and look up the calling convention for more info (i.e. that vectors are passed by-reference)

Otherwise, looks good, thanks for the explanation.

davidwrighton

Overall, this looks good. I suspect we haven't fleshed out all the calling convention gremlins, but this should get most of it.

davidwrighton · 2025-06-06T18:53:04Z

src/coreclr/vm/amd64/AsmHelpers.asm

+
+LEAF_ENTRY Store_XMM2_XMM3, _TEXT
+        movsd real8 ptr [r10], xmm2
+        movsd real8 ptr [r10 + 8], xmm3


In this case we are handling the XMM registers for the case where the 16 byte XMM register is just being used for an 8 byte floating point value. I'm a bit curious if this works well for floats as well as doubles, but the concept is sound. (I could see that perhaps our interpreter stack models all floating point values as doubles, which will introduce floating point differences from what the JIT does in certain cases, but it is pretty close.

More interestingly though I don't see handling for Vector64/Vector128 on Arm64 which should be treated as HFA values. I don't think this all needs to be handled before the PR is merged, but we should have it on our worklist.

davidwrighton · 2025-06-06T18:54:42Z

src/coreclr/interpreter/interpretershared.h

@@ -28,6 +30,7 @@ struct InterpMethod
    CORINFO_METHOD_HANDLE methodHnd;
    int32_t allocaSize;
    void** pDataItems;
+    CallStubHeader *pCallStub;


This should have a comment indicating its for use for calling from the jitted calling convention into the interpreter

I'll add one.

janvorli · 2025-06-06T19:05:34Z

I'm a bit curious if this works well for floats as well as doubles
@davidwrighton the interpreter distinguishes between floats and doubles, it knows what each slot contains. So in case the argument is float, we just write in 8 bytes, but the interpreter only ever uses the low 4 bytes.

janvorli · 2025-06-06T19:17:07Z

More interestingly though I don't see handling for Vector64/Vector128 on Arm64 which should be treated as HFA values.

Good point. I think that the only one that would not work is the Vector128. For that one, we will need to add routines for both storing and loading q0..q7. The Vector64 should be handled by the regular d0..d7 storing / loading routines.

jkotas · 2025-06-07T02:01:20Z

src/coreclr/vm/prestub.cpp

@@ -432,6 +432,8 @@ PCODE MethodDesc::PrepareILBasedCode(PrepareCodeConfig* pConfig)
 #ifdef FEATURE_INTERPRETER
        if (pConfig->IsInterpreterCode())
        {
+            CreateNativeToInterpreterCallStub(pCode);


Is this going to create the stub only for the methods that are actually called from native code, or for all interpreted methods? If it is the later, it sounds wasteful.

jkotas · 2025-06-07T02:18:16Z

src/coreclr/pal/inc/unixasmmacrosamd64.inc

@@ -405,3 +411,25 @@ C_FUNC(\Name\()_End):
        free_stack      56
        POP_CALLEE_SAVED_REGISTERS
 .endm
+
+.macro INLINE_GET_TLS_VAR Var


This is duplicated a few lines above in INLINE_GET_ALLOC_CONTEXT_BASE. Move the INLINE_GET_TLS_VAR macro up and switch INLINE_GET_ALLOC_CONTEXT_BASE to use it?

jkotas · 2025-06-07T02:18:43Z

src/coreclr/pal/inc/unixasmmacrosarm64.inc

+// Loads the address of a thread-local variable into the target register,
+// which cannot be x0.
+// Preserves registers except for xip0 and xip1 on Apple
+.macro INLINE_GET_TLS_VAR target, var


janvorli added this to the 10.0.0 milestone Jun 5, 2025

janvorli self-assigned this Jun 5, 2025

Copilot AI review requested due to automatic review settings June 5, 2025 22:33

janvorli added the area-CodeGen-Interpreter-coreclr label Jun 5, 2025

janvorli requested review from BrzVlad and kg as code owners June 5, 2025 22:33

Copilot AI reviewed Jun 5, 2025

View reviewed changes

janvorli force-pushed the jit-to-interpreter-calls branch from 8ed6861 to df8d547 Compare June 6, 2025 00:24

kg reviewed Jun 6, 2025

View reviewed changes

src/coreclr/pal/inc/unixasmmacrosamd64.inc Show resolved Hide resolved

build-analysis bot mentioned this pull request Jun 6, 2025

[android] Android.Device_Emulator.JIT.Test failing on emulators with CoreCLR #112633

Open

janvorli added 2 commits June 6, 2025 11:05

Fix m_pInterpThreadContext offset after rebase to main

0cb2df6

Fix m_pInterpThreadContext for more targets

dcb0f4b

jkotas reviewed Jun 6, 2025

View reviewed changes

kg reviewed Jun 6, 2025

View reviewed changes

Move the CreateNativeToInterpreterCallStub to prestub

aa8ad4b

davidwrighton approved these changes Jun 6, 2025

View reviewed changes

Fix build break and add a comment

401bf92

build-analysis bot mentioned this pull request Jun 7, 2025

System.Diagnostics.Process.Tests failure on net10.0-linux-Debug-arm64-Mono_Minijit_Debug #110643

Open

jkotas reviewed Jun 7, 2025

View reviewed changes

JIT/AOT to interpreter calls support #116353

Are you sure you want to change the base?

JIT/AOT to interpreter calls support #116353

Conversation

janvorli commented Jun 5, 2025

Uh oh!

Copilot AI left a comment

Choose a reason for hiding this comment

Pull Request Overview

Reviewed Changes

Uh oh!

dotnet-policy-service bot commented Jun 5, 2025

Uh oh!

Uh oh!

jkotas Jun 6, 2025 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

davidwrighton left a comment

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

janvorli commented Jun 6, 2025

Uh oh!

janvorli commented Jun 6, 2025

Uh oh!

jkotas Jun 7, 2025 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Uh oh!

jkotas Jun 6, 2025 •

edited

Loading

jkotas Jun 7, 2025 •

edited

Loading