Nutanix-ENG-741981 [vioscsi] Fix NMI in crashdump/hibernation pathway #1294
|
@MartinCHarvey-Nutanix Please look at this fix |
He's on PTO for a bit. |
ac33247 to
a4f42d0
Compare
|
Can we please rerun tests on this one? I'm confident it should be ready. |
|
This looks OK to me. @sb-ntnx can you give it a look? |
|
Maybe some |
Yes, all of them are in the queue |
|
Is the |
I note that this test passed in PR #1293... cc: @JonKohler @sb-ntnx |
a4f42d0 to
77e8f12
Compare
|
My latest thoughts on this are: Perhaps the init of This was because I used My latest cut (which I just rebased here) and Martin's are getting very close to being the same. I guess I'm preferring mine over minor semantics. I'd like to retain a reference to Is @MartinCHarvey-Nutanix back from PTO... It would be good to get his view on the differences. Anyone else? @kostyanf14, could we please re-run the test to be sure? |
It will be very telling if it passes... I note PR #1293 also had: a) Both of these also skipped some tests. |
|
Static Tools Logo Test unexpected. For some reason SDV crashed for vioscsi driver. |
Thanks, Kostiantyn. Just to be clear, was the |
|
Flush Test failure is expected in general. We try to emulate force off of VMs and sometimes HLK is not happy. If the test passed - this is good, if fail - not a bug. |
|
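@JonKohler @sb-ntnx @MartinCHarvey-Nutanix @MartinCHarvey In light of @kostyanf14's comments re the Flush Test above, the fact the same just passed, and a little time to consider other options, I am minded to leave those last edits as-is. The relevant functions in my WIP also had a STOR_SPINLOCK parameter, which I removed when I realised those functions would never use anything but DpcLock type spinlocks moving forward, but I obviously did leave some artefacts behind, which I guess should be cleaned up anyway.
I am now also minded to perhaps create a custom STOR_SPINLOCK-based enumeration, perhaps named CUSTOM_STOR_SPINLOCK, which could then include a Skip_Locking=0 or No_Lock=0 or Invalid_Lock=0 spinlock type for use in this scenario, or which would otherwise mimic the Win11 24H2 enumeration (which uses InvalidLock=0).
Probably something like:
typedef enum _CUSTOM_STOR_SPINLOCK {
    Skip_Locking = 0,
    No_Lock = 0,
#if defined(NTDDI_WIN11_GE) && (NTDDI_VERSION >= NTDDI_WIN11_GE)
    Invalid_Lock = InvalidLock,
#else
    Invalid_Lock = 0,
#endif
    Dpc_Lock = DpcLock,
    StartIo_Lock = StartIoLock,
    Interrupt_Lock = InterruptLock,
    ThreadedDpc_Lock = ThreadedDpcLock,
    DpcLevel_Lock = DpcLevelLock
} CUSTOM_STOR_SPINLOCK;
I think it is highly likely a ThreadedDpcLock-based solution will be considered at some point in the not-too-distant future. I had some success with this in my WIP (an unpublished variant). For this reason, and as mentioned above, i.e. to help ensure the spinlock context is considered when calling the ProcessBuffer() function, I think we should use a STOR_SPINLOCK-based parameter rather than the BOOLEAN that Martin has used in the other PR.
I'll start putting that together, but in the meantime please do share any thoughts you have about the above.
Also, who discovered the regression? I would like to give credit to Martin and whoever else might be deserving in the commit message. |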
77e8f12 to
4708260
Compare
… (regression)
Credit to Nutanix and in particular @MartinCHarvey-Nutanix for his work in PR virtio-win#1293.
Background: We previously ignored calls for a spinlock with isr=TRUE in VioScsiVQLock() and VioScsiVQUnlock(). This was replaced with a call to InterruptLock in the (!IsCrashDumpMode && adaptExt->dpc_ok) = FALSE pathway. In testing, suspend/resume/hibernate did not use this pathway but instead issued DPCs. The InterruptLock was presumed to be used when IsCrashDumpMode=TRUE. Also, using PVOID LockContext = NULL, and / or then setting LockContext to &adaptExt->dpc[vq_req_idx], appears to cause a HCK Flush Test failure.
Created new overloaded enumeration called CUSTOM_STOR_SPINLOCK which adds some new (invalid) spinlock types such as Skip_Locking and No_Lock. Also provides InvalidLock for builds prior to NTDDI_WIN11_GE (Windows 11, version 24H2, build 26100) via Invalid_Lock. In similar vein, Dpc_Lock = DpcLock, StartIo_Lock = StartIoLock, Interrupt_Lock = InterruptLock, ThreadedDpc_Lock = ThreadedDpcLock & DpcLevel_Lock = DpcLevelLock.
This fix has two components:
1. Only DpcLock type spinlocks are processed in ProcessBuffer() with all other types presently being ignored; and
2. The (PVOID)LockContext is no longer used, with calls to StorPortAcquireSpinLock() for DpcLock type spinlocks using &adaptExt->dpc[vq_req_idx] directly.
Note: Use of InvalidLock requires Win11 and both InvalidLock and DpcLevelLock require using StorPortAcquireSpinLockEx. Consider for future use.
Signed-off-by: benyamin-codez <[email protected]>
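The two components described in the commit message can be sketched in user-mode C as follows. This is only an illustration under stated assumptions: the base enum values stand in for the ones storport.h provides, and `mock_acquire()`, `mock_release()` and `process_buffer_sketch()` are hypothetical names, not the driver's actual functions or StorPort calls.

```c
#include <assert.h>

/* Stand-ins for the STOR_SPINLOCK values normally supplied by storport.h
 * (values assumed here for illustration only). */
enum { DpcLock = 1, StartIoLock, InterruptLock, ThreadedDpcLock, DpcLevelLock };

/* The proposed enumeration, using the pre-Win11 branch (Invalid_Lock = 0). */
typedef enum _CUSTOM_STOR_SPINLOCK {
    Skip_Locking = 0,
    No_Lock = 0,
    Invalid_Lock = 0,
    Dpc_Lock = DpcLock,
    StartIo_Lock = StartIoLock,
    Interrupt_Lock = InterruptLock,
    ThreadedDpc_Lock = ThreadedDpcLock,
    DpcLevel_Lock = DpcLevelLock
} CUSTOM_STOR_SPINLOCK;

/* Counters that replace real spinlock acquisition in this mock. */
static int g_acquired = 0;
static int g_released = 0;

static void mock_acquire(void) { g_acquired++; }
static void mock_release(void) { g_released++; }

/* Hypothetical ProcessBuffer() skeleton: only Dpc_Lock takes a lock,
 * every other lock type is ignored. Returns 1 if a lock was taken. */
static int process_buffer_sketch(CUSTOM_STOR_SPINLOCK lock_mode)
{
    int locked = (lock_mode == Dpc_Lock);

    if (locked) {
        mock_acquire();
    }
    /* ... the virtqueue would be drained here ... */
    if (locked) {
        mock_release();
    }
    assert(g_acquired == g_released); /* every acquire is paired with a release */
    return locked;
}
```

Calling `process_buffer_sketch(Dpc_Lock)` takes and releases the mock lock once, while `Skip_Locking` or `Interrupt_Lock` leave the counters untouched.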
4708260 to
09adb8d
Compare
|
@JonKohler @sb-ntnx @MartinCHarvey-Nutanix @MartinCHarvey Please find my new cut, crafted and pushed per the above post, for your consideration. Maybe we should try |
|
P.S. : |
wrt your power-loss emulation code for the |
|
Apologies for clobbering the previous |
Please see https://github.com/HCK-CI/AutoHCK/ in general and HCK-CI/AutoHCK#486 specific for this test |
|
Looks like
Ready for this one? |
|
I haven't yet had a chance to catch up on this thread, but I wanted to check: Have we addressed the case where this happens at very initial driver load and device start? My understanding is that the ISR gets called after initial device addition, but before the DPC has been set up. Arguably the crash dump path is being invoked at a time when it should not be, and we need something smarter than just a "dpc_ok" variable: something to register the pending interrupt, interlocked or in a flag. When the DPC is set up, we can immediately queue a DPC to process it in the normal way.
I will test this and dump cases on my return to office.
MH
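A minimal user-mode sketch of the "pending interrupt" idea described above, assuming C11 atomics in place of the kernel's interlocked primitives. All names (`mock_adapter`, `mock_isr()`, `mock_dpc_setup_complete()`) are hypothetical and the DPC queue is reduced to a counter, so this only illustrates the hand-off, not the driver's real code.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical adapter state for the sketch. */
typedef struct {
    atomic_bool dpc_ready;   /* set once the DPCs have been initialised */
    atomic_bool irq_pending; /* set when an interrupt arrives too early */
    int dpcs_queued;         /* mock: number of DPCs queued (single-threaded demo) */
} mock_adapter;

static void mock_adapter_init(mock_adapter *a)
{
    atomic_init(&a->dpc_ready, false);
    atomic_init(&a->irq_pending, false);
    a->dpcs_queued = 0;
}

static void mock_queue_dpc(mock_adapter *a) { a->dpcs_queued++; }

/* ISR side: queue a DPC if possible, otherwise remember the interrupt. */
static void mock_isr(mock_adapter *a)
{
    if (atomic_load(&a->dpc_ready)) {
        mock_queue_dpc(a);
        return;
    }
    atomic_store(&a->irq_pending, true);
    /* Re-check in case setup completed between the load and the store. */
    if (atomic_load(&a->dpc_ready) && atomic_exchange(&a->irq_pending, false)) {
        mock_queue_dpc(a);
    }
}

/* Called once DPC setup has finished: replay any early interrupt. */
static void mock_dpc_setup_complete(mock_adapter *a)
{
    atomic_store(&a->dpc_ready, true);
    if (atomic_exchange(&a->irq_pending, false)) {
        mock_queue_dpc(a);
    }
}
```

The exchange ensures the early interrupt is replayed exactly once, whichever side observes the flag first.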
From: benyamin-codez ***@***.***>
Sent: Saturday, February 22, 2025 3:56:18 AM
To: virtio-win/kvm-guest-drivers-windows ***@***.***>
Cc: Martin Christopher Harvey ***@***.***>; Mention ***@***.***>
Subject: Re: [virtio-win/kvm-guest-drivers-windows] Nutanix-ENG-741981 [vioscsi] Fix NMI in crashdump/hibernation pathway (PR #1294)
@JonKohler<https://github.com/JonKohler> @sb-ntnx<https://github.com/sb-ntnx> @MartinCHarvey-Nutanix<https://github.com/MartinCHarvey-Nutanix> @MartinCHarvey<https://github.com/MartinCHarvey>
In light of @kostyanf14<https://github.com/kostyanf14>'s comments re the Flush Test above, the fact the same just passed, and a little time to consider other options, I am minded to leave those last edits as-is. The relevant functions in my WIP also had a STOR_SPINLOCK parameter, which I removed when I realised those functions would never use anything but DpcLock type spinlocks moving forward, but I obviously did leave some artefacts behind, which I guess should be cleaned up anyway.
I am now also minded to perhaps create a custom STOR_SPINLOCK-based enumeration, perhaps named CUSTOM_STOR_SPINLOCK, which could then include a Skip_Locking=0 or No_Lock=0 or Invalid_Lock=0 spinlock type for use in this scenario, or which would otherwise mimic the Win11 24H2 enumeration (which uses InvalidLock=0).
Probably something like:
typedef enum _CUSTOM_STOR_SPINLOCK {
Skip_Locking = 0,
No_Lock = 0,
#if defined(NTDDI_WIN11_GE) && (NTDDI_VERSION >= NTDDI_WIN11_GE)
Invalid_Lock = InvalidLock,
#else
Invalid_Lock = 0,
#endif
Dpc_Lock = DpcLock,
StartIo_Lock = StartIoLock,
Interrupt_Lock = InterruptLock,
ThreadedDpc_Lock = ThreadedDpcLock,
DpcLevel_Lock = DpcLevelLock
} CUSTOM_STOR_SPINLOCK;
I think it is highly likely a ThreadedDpcLock-based solution will be considered at some point in the not-too-distant future. I had some success with this in my WIP (an unpublished variant). For this reason, and as mentioned above, i.e. to help ensure the spinlock context is considered when calling the ProcessBuffer() function, I think we should use a STOR_SPINLOCK-based parameter rather than the BOOLEAN that Martin has used in the other PR.
I'll start putting that together, but in the meantime please do share any thoughts you have about the above.
Also, who discovered the regression?
I would like to give credit to Martin and whomever else might be deserving in the commit message.
|
|
@MartinCHarvey-Nutanix @MartinCHarvey @sb-ntnx Thanks for the post, Martin.
I have been unable to reproduce this issue, so I have presumed you are using something funky in your setup...
ISR = iirc, the codepath for setting it up is:
Where: Perhaps If Is this the case with your setup? I ask because the DPC infrastructure should be well and truly set up, by the time the In the case DPCs are supported, and we happen to use a So historically, we have skipped the spinlock. This PR restores this previous behaviour. Please let me know if you can still reproduce your issue even with this PR.
if (!adaptExt->dump_mode && adaptExt->dpc_ok)
{
    NT_ASSERT(MessageId >= QUEUE_TO_MESSAGE(VIRTIO_SCSI_REQUEST_QUEUE_0));
    StorPortIssueDpc(DeviceExtension,
                     &adaptExt->dpc[MessageId - QUEUE_TO_MESSAGE(VIRTIO_SCSI_REQUEST_QUEUE_0)],
                     ULongToPtr(MessageId),
                     ULongToPtr(MessageId));
}
else if (adaptExt->dump_mode)
{
    ProcessBuffer(DeviceExtension, MessageId, Skip_Locking);
}
else
{
    ProcessBuffer(DeviceExtension, MessageId, Interrupt_Lock);
}
...or some other variant + code in |
|
@MartinCHarvey-Nutanix @MartinCHarvey @sb-ntnx The one variant I didn't try was:
if (!adaptExt->dump_mode && adaptExt->dpc_ok)
{
    ...
}
else
{
    ProcessBuffer(DeviceExtension, MessageId, Dpc_Lock);
}
..., which in some circumstances, might actually work... |
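The three-way branch discussed above reduces to a small truth table over `dump_mode` and `dpc_ok`. The sketch below isolates that decision as a pure function so it can be checked outside the kernel; `isr_path` and `choose_isr_path()` are hypothetical names for illustration, and the mapping mirrors the first variant (Skip_Locking in dump mode, a locked call otherwise).

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical labels for the three code paths in the interrupt routine. */
typedef enum {
    PATH_ISSUE_DPC,         /* normal operation: StorPortIssueDpc() */
    PATH_PROCESS_NO_LOCK,   /* dump mode: ProcessBuffer(..., Skip_Locking) */
    PATH_PROCESS_WITH_LOCK  /* no DPCs, not in dump mode: ProcessBuffer(..., <lock type>) */
} isr_path;

/* Mirrors the if / else if / else ladder from the comment above. */
static isr_path choose_isr_path(bool dump_mode, bool dpc_ok)
{
    if (!dump_mode && dpc_ok) {
        return PATH_ISSUE_DPC;
    }
    if (dump_mode) {
        return PATH_PROCESS_NO_LOCK;
    }
    return PATH_PROCESS_WITH_LOCK;
}
```

Note that the last path is only reachable when `dump_mode` is false and `dpc_ok` is false, which is the early-initialisation window being debated in this thread.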
I suspect this may be the case. I'm testing the case where there is a totally "bare" Windows VM, which has a vioscsi device, and additionally (I guess), there are pending events / I/Os / something on the device at the moment when a driver is first installed.
Possibly. Nutanix runs storage stuff on the host which is not "vanilla upstream".
Yes, dpc_ok strikes me as a bit ambiguous. I think a bit of instrumentation with some dpc_state variable might be a good idea.
I'm willing to believe this generally shouldn't happen in most/all normal use cases. Possibly the dump/hiber path doesn't allow DPCs in the normal case. I'll get back to you on that.
Hmmm. I notice there's a "virtio_device_ready" at the end of HwInitialize, but before passive initialization has completed, and strictly speaking (I haven't checked the virtio spec) we might not want to indicate fully ready until passive init has succeeded or failed. Maybe some interrupt not strictly associated with a vm/client side request?
Hmmm. OK. Some concrete evidence and instrumentation / tracing is required. Leave it with me, and I'll get back to you in the next day or two on that. |
|
OK, suspicions confirmed. I added a pending flag or two and a bit of debug. Here are my much abbreviated logs. I think it may be time to move that virtio_device_ready call to the end of passive initialization. I'll do that, check it all works in the normal case, then check the crashdump / hibernate case, then remove all my debug, and then update my PR :-) |
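The ordering change proposed here (signal device-ready only at the end of passive initialization, after the DPCs exist) can be sketched as a user-mode mock. This is an assumption-laden illustration: `mock_init_state`, `mock_device_ready()`, `mock_passive_initialize()` and `mock_hw_initialize()` are hypothetical stand-ins, and the passive routine is called directly rather than being scheduled by StorPort.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical initialisation state tracked by the sketch. */
typedef struct {
    bool dpc_ok;            /* DPCs have been initialised */
    bool device_ready;      /* device has been told the driver is ready */
    bool ready_before_dpc;  /* records an ordering violation, if any */
} mock_init_state;

/* Stand-in for the virtio_device_ready() call. */
static void mock_device_ready(mock_init_state *s)
{
    if (!s->dpc_ok) {
        s->ready_before_dpc = true; /* interrupts could arrive with no DPC to queue */
    }
    s->device_ready = true;
}

/* Stand-in for the passive initialization routine: set up DPCs first,
 * then announce readiness as the very last step. */
static bool mock_passive_initialize(mock_init_state *s)
{
    s->dpc_ok = true;
    mock_device_ready(s);
    return true;
}

/* Stand-in for HwInitialize: no device-ready call here any more.
 * Simplification: the passive routine runs inline instead of later. */
static bool mock_hw_initialize(mock_init_state *s)
{
    /* ... queue and feature setup would happen here ... */
    return mock_passive_initialize(s);
}
```

With this ordering, `ready_before_dpc` stays false, which is the property the move is meant to guarantee.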
|
@benyamin-codez These issues also apply to virtio-stor. However, I think this is slightly overkill, because: Stuff done under a dpc lock is unlikely to be under any other sort of lock. So I think it would be better to fix the (small) sync issues with a small code change, and make that change common between virtio-scsi and virtio-stor. |
|
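Thanks for the updates..! I started getting DRIVER_IRQL_NOT_LESS_OR_EQUAL in ntoskrnl.exe at driver installation (in my vioscsi WIP) a few hours back following clang-format changes... It's perhaps unrelated, but I remain curious as to why your environment is calling: Did you find a cause for why the MSI is being raised so early...?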
In the event that passive initialisation doesn't occur, i.e. |
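It was unrelated... Any joy moving that virtio_device_ready() routine...? I was thinking it might be the case that it is required where it is, i.e. before StorPort calls HW_PASSIVE_INITIALIZE_ROUTINE (i.e. VioScsiPassiveInitializeRoutine()), i.e. in HW_INITIALIZE (i.e. VioScsiHwInitialize())... so I'm very curious to see where you might call it from, and if that works. |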
|
@benyamin-codez Give me a mo to retest both normal usage and crashdump. I'll update my PR with a couple of commits, so you can cherry-pick one of them across. |
|
Yeah,
https://learn.microsoft.com/en-us/windows-hardware/drivers/ddi/storport/nf-storport-storportenablepassiveinitialization
is not particularly forthcoming about which old OS versions don't support DPCs, so I assume it's from some ancient version. Everything NT-based does.
If you look at the code, failure of passive initialization results in failed HW Init, and I'm leaving that unchanged. It should be fine to delay the "ready" notification for a bit.
Still hoop jumping to get this to host a boot drive. Getting it to host the location of the dump file should be easier. Feedback RSN.
From: benyamin-codez ***@***.***>
Sent: Wednesday, February 26, 2025 5:25:43 AM
To: virtio-win/kvm-guest-drivers-windows ***@***.***>
Cc: Martin Christopher Harvey ***@***.***>; Mention ***@***.***>
Subject: Re: [virtio-win/kvm-guest-drivers-windows] Nutanix-ENG-741981 [vioscsi] Fix NMI in crashdump/hibernation pathway (PR #1294)
@MartinCHarvey-Nutanix<https://github.com/MartinCHarvey-Nutanix>
I started getting DRIVER_IRQL_NOT_LESS_OR_EQUAL in ntoskrnl.exe at driver installation (in my vioscsi WIP) a few hours back following clang-format changes... 8^d
It's perhaps unrelated...
It was unrelated...
... 8^d
... ... 8^D
Any joy moving that virtio_device_ready() routine...?
I was thinking it might be the case that it is required where it is, ...
...i.e. before StorPort calls HW_PASSIVE_INITIALIZE_ROUTINE (i.e. VioScsiPassiveInitializeRoutine())
...i.e. in HW_INITIALIZE (i.e. VioScsiHwInitialize())...
...so I'm very curious to see where you might call it from, and if that works.
|
|
FYI, I'm prepping to close this in favour of Martin's PR #1293. |
|
Closing this PR in favour of #1293. |
Purpose: To resolve regression introduced in PR #1196 / #1175 (fe07e50).
Background: We previously ignored calls for a spinlock with isr=TRUE in VioScsiVQLock() and VioScsiVQUnlock(). This was replaced with a call to use InterruptLock type spinlocks in the (!IsCrashDumpMode && adaptExt->dpc_ok) = FALSE pathway. In testing, suspend/resume/hibernate did not use this pathway but instead issued DPCs. The InterruptLock was presumed to be used when IsCrashDumpMode=TRUE. Also, using PVOID LockContext = NULL, and / or then setting LockContext to &adaptExt->dpc[vq_req_idx], appears to cause a HCK Flush Test failure.
This preliminary fix proposes to use DpcLevelLock when adaptExt->dpc_ok=TRUE and InterruptLock when adaptExt->dpc_ok=FALSE.
This fix ignores all requests for InterruptLock type spinlocks in ProcessBuffer().
This fix has two components:
1. DpcLock type spinlocks are processed in ProcessBuffer() with the other valid types (InterruptLock and StartIoLock) being ignored; and
2. (PVOID)LockContext is no longer used, with calls to StorPortAcquireSpinLock() for DpcLock type spinlocks using &adaptExt->dpc[vq_req_idx] directly.
Note: Use of InvalidLock requires Win11 and both InvalidLock and DpcLevelLock require using StorPortAcquireSpinLockEx(). Consider for future use.
Added: Created new overloaded enumeration called CUSTOM_STOR_SPINLOCK which adds some new (invalid) spinlock types such as Skip_Locking and No_Lock. Also provides InvalidLock for builds prior to NTDDI_WIN11_GE (Windows 11, version 24H2, build 26100) via Invalid_Lock. In similar vein, Dpc_Lock = DpcLock, StartIo_Lock = StartIoLock, Interrupt_Lock = InterruptLock, ThreadedDpc_Lock = ThreadedDpcLock & DpcLevel_Lock = DpcLevelLock.
Credit to Nutanix and in particular @MartinCHarvey-Nutanix for his work in PR #1293.