Skip to content

Interpreter shared generics support #116357

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Open
wants to merge 8 commits into
base: main
Choose a base branch
from

Conversation

davidwrighton
Copy link
Member

@davidwrighton davidwrighton commented Jun 6, 2025

Add support for shared generics to the interpreter

  • Handle shared generics calling convention details for the newobj, and call opcodes
  • Swap over to using getCallInfo on the jit interface to get information about how a call should be handled
  • Add new instructions to resolve a generic dictionary lookup (generic.method, generic.class, generic.this)
  • Add new instructions ldtoken.var and newobj.var which can use the result of a generic dictionary lookup. These are variants on ldtoken and newobj instructions.
    • we will eventually need variants for newarr, box, unbox.any, unbox, call.helper.pp and call.helper.pp2
  • Rework how call opcodes work so that it uses GetMultiCallableAddrOfCode for dispatch instead of the NativeCode on a method, and stop re-using the NativeCode field for interpreter things, and instead add a new field to MethodDesc for that. This made it simple to enable support for string constructors.
  • Also some drive by improvements to improve diagnostics. Notably startIp now points at a strongly typed struct which allows much easier examination of the interpreter stack using the debugger.

This PR needs a bit of work, notably the FillTempVarWithToken api used to implement the shared generic lookups is a really bad api, but now that its built and we have some uses, I'd like to refactor into something more useable. There are also incomplete features like full static virtual dispatch support and gvms, but this change is quite large enough already.

1. Handle calling convention for CEE_CALLing
2. Use getCallInfo for handling the call target resolution
3. Add handling for the param type arg as well as an ability to use the generic dictionary
4. Add a new ldtoken interpreter instruction which can take the ldtoken input as a var instead of as a data item

- Not done
 -  Anything to do with newobj
 - Anything to do with calli or the CODEPOINTER path
 - Anything to do with the LDVIRTFTN path
@Copilot Copilot AI review requested due to automatic review settings June 6, 2025 00:16
Copy link
Contributor

@Copilot Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull Request Overview

Add support for shared generics in the interpreter by extending call dispatch, adding new lookup instructions, and updating metadata & tests.

  • Handle shared generics calling conventions in newobj/call, add generic.* and .var instructions
  • Store interpreter bytecode pointer in MethodDesc and switch dispatch to GetMultiCallableAddrOfCode
  • Extend tests (TestStringCtor, TestSharedGenerics) and use a typed startIp for better diagnostics

Reviewed Changes

Copilot reviewed 16 out of 16 changed files in this pull request and generated 1 comment.

Show a summary per file
File Description
src/tests/JIT/interpreter/Interpreter.cs Added verbose Console.WriteLine before each test and new shared‐generics tests
src/coreclr/vm/prestub.cpp Store interpreter code pointer into MethodDesc and update frame startIp
src/coreclr/vm/method.hpp Added m_interpreterCode field and accessors for interpreter bytecode
src/coreclr/vm/jitinterface.cpp Added CORINFO_CALLINFO_DISALLOW_STUB branch in getCallInfo
src/coreclr/vm/jithelpers.cpp Split out GenericHandleWorkerCore and wrapped with QCall
src/coreclr/vm/interpexec.h & .cpp Replaced raw int32_t* with PTR_InterpByteCodeStart, updated dispatch to bytecode
src/coreclr/vm/callstubgenerator.h & .cpp Changed ProcessArgument to take pointer, handle string ctor convention
src/coreclr/vm/amd64/asmconstants.h Adjusted offset for InstantiatedMethodDesc::m_pPerInstInfo
src/coreclr/inc/corinfo.h Defined new CORINFO_CALLINFO_DISALLOW_STUB flag
src/coreclr/interpreter/intops.h & .def Introduced InterpOpGenericLookup, InterpOpLdPtr, new lookup opcodes
src/coreclr/interpreter/interpretershared.h Added InterpByteCodeStart struct to hold method pointer + bytecode
src/coreclr/interpreter/compiler.h Updated EmitCall signature to support shared‐generics
Comments suppressed due to low confidence (1)

src/tests/JIT/interpreter/Interpreter.cs:386

  • [nitpick] The repeated Console.WriteLine and Environment.FailFast pattern clutters the test runner; consider refactoring into a helper method or iterating over a list of test names to reduce duplication.
        Console.WriteLine("Sum");

@@ -458,7 +463,7 @@ class InterpCompiler
void EmitUnaryArithmeticOp(int32_t opBase);
void EmitShiftOp(int32_t opBase);
void EmitCompareOp(int32_t opBase);
void EmitCall(CORINFO_CLASS_HANDLE constrainedClass, bool readonly, bool tailcall);
void EmitCall(CORINFO_RESOLVED_TOKEN* constrainedClass, bool readonly, bool tailcall, bool newObj);
Copy link
Preview

Copilot AI Jun 6, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The signature of EmitCall was updated in the header but its implementation in compiler.cpp has not been adjusted; this mismatch will cause a build failure. Update the definition in the .cpp file to match the new parameters.

Copilot uses AI. Check for mistakes.


enum class GenericHandleEmbedOptions
{
support_use_as_flags = -1,
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

A nit - it would be nice to use consistent naming convention.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is a magic value that makes it possible (in conjunction with enum_class_flags.h to use | and & and the HasFlag helper method with an enum class. I'll add a comment.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ah, I didn't know we had that magic stuff :-)

#else
DPTR(InterpMethod) const InterpMethod; // Pointer to the InterpMethod structure
#endif
const int* GetByteCodes() const
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The interpreter loop uses explicit int32_t, it would be nice to be consistent with it here too.

@@ -130,7 +144,8 @@ void InterpExecMethod(InterpreterFrame *pInterpreterFrame, InterpMethodContextFr
}

int32_t returnOffset, callArgsOffset, methodSlot;
const int32_t *targetIp;
PTR_InterpByteCodeStart targetIp;
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why do we need the PTR_ here when this is never compiled for DAC?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We do not. The regular * would be good enough for this non-dac code. I added the concept of PTR_InterpByteCodeStart as that will at some point be needed by the DAC to understand the call stack, etc.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Having the PTR_InterpByteCodeStart at other places is fine and useful. I would prefer plain pointer in code that's not DAC related if you don't mind.


offset = 0;

INTERP_DUMP("\nCreate IL Vars:\n");

CORINFO_ARG_LIST_HANDLE sigArg = m_methodInfo->args.args;
for (int i = 0; i < numArgs; i++) {
for (int i = 0; i < (numArgs + hasParamArg); i++) {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'd like to understand why we need to have the paramArg variable index to be somewhere else (after the IL locals in our case), but offset-wise, it is still located at the first or second slot of the arguments. Why do we care about the actual index of the variable not being in the range of indices of the other arguments?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I did that so that the handling of ldarg and ldloc could remain with its current logic (where ldarg variable indices exactly match that of the IL args, and ldloc indices are offset from zero by exactly the count number of IL args. This could of course be finagled around, but I thought this was the simplest solution. If you'd prefer to have a more complex mapping for those, I can do that too.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

That makes sense, I am fine with it. I just wanted to understand "why".

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It certainly deserves a comment, since it's fairly surprising.

Copy link
Member

@janvorli janvorli left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM, thank you!

@@ -130,7 +144,8 @@ void InterpExecMethod(InterpreterFrame *pInterpreterFrame, InterpMethodContextFr
}

int32_t returnOffset, callArgsOffset, methodSlot;
const int32_t *targetIp;
PTR_InterpByteCodeStart targetIp;
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Having the PTR_InterpByteCodeStart at other places is fine and useful. I would prefer plain pointer in code that's not DAC related if you don't mind.

@@ -1358,7 +1358,7 @@ enum CORINFO_CALLINFO_FLAGS
CORINFO_CALLINFO_ALLOWINSTPARAM = 0x0001, // Can the compiler generate code to pass an instantiation parameters? Simple compilers should not use this flag
CORINFO_CALLINFO_CALLVIRT = 0x0002, // Is it a virtual call?
// UNUSED = 0x0004,
// UNUSED = 0x0008,
CORINFO_CALLINFO_DISALLOW_STUB = 0x0008, // Do not use a stub for this call, even if it is a virtual call.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nit: Add this to the managed view as well.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants