Skip to content

Conversation

@vrozenfe
Copy link
Collaborator

the motivation for this sequence is to synchronise the upstream with our downstream code-base. We reverted a couple of previously merged commutes and QE run very intensive functional and WHQL testing on theses changes. In addition to that there are two similar changes for viostor and vioscsi drivers to reset the transfer length to 0 for SRBs completed during bus reset processing. We found that out without this change the system reports the problem incorrectly, saying that it was a MEMORY MANAGER CRC problem (happened during page-in operation).

Thanks,
Vadim.

Copy link

@gemini-code-assist gemini-code-assist bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Summary of Changes

Hello @vrozenfe, I'm Gemini Code Assist1! I'm currently reviewing this pull request and will post my feedback shortly. In the meantime, here's a summary to help you and other reviewers quickly get up to speed!

This pull request addresses a critical issue where the system would incorrectly report "MEMORY MANAGER CRC problem" during page-in operations by ensuring that the data transfer length for SCSI Request Blocks (SRBs) is explicitly set to zero when these requests are completed under specific error conditions, such as bus resets, device removal, or invalid commands. The changes also involve a significant refactoring of queue processing and locking mechanisms within the vioscsi driver, standardizing spinlock usage and improving overall code structure and reliability. These updates synchronize the codebase with recent, thoroughly tested downstream modifications.

Highlights

  • Fix for Incorrect System Reports: Implemented a crucial fix to prevent incorrect "MEMORY MANAGER CRC problem" reports by ensuring the data transfer length of SCSI Request Blocks (SRBs) is consistently reset to zero when completed due to bus reset processing, device removal, or invalid requests across both viostor and vioscsi drivers.
  • Downstream Code Synchronization: Synchronized the upstream codebase with recent downstream changes, incorporating previously reverted commits and validated modifications that underwent intensive functional and WHQL testing.
  • Refactored Queue Locking Mechanism: Introduced dedicated VioScsiVQLock and VioScsiVQUnlock functions in the vioscsi driver to centralize and standardize queue-related spinlock acquisition and release, improving code clarity, maintainability, and robustness.
  • Unified Queue Processing Logic: Replaced the ProcessBuffer function with a new ProcessQueue function in the vioscsi driver, streamlining the handling of completed requests from virtqueues and integrating the new locking mechanism.
Using Gemini Code Assist

The full guide for Gemini Code Assist can be found on our documentation page, here are some quick tips.

Invoking Gemini

You can request assistance from Gemini at any point in your pull request via creating an issue comment (i.e. comment on the pull request page) using either /gemini <command> or @gemini-code-assist <command>. Below is a summary of the supported commands.

Feature Command Description
Code Review /gemini review Performs a code review for the current pull request in its current state.
Pull Request Summary /gemini summary Provides a summary of the current pull request in its current state.
Comment @gemini-code-assist Responds in comments when explicitly tagged, both in issue comments and review comments.
Help /gemini help Displays a list of available commands.

Customization

To customize Gemini Code Assist for GitHub experience, repository maintainers can create a configuration file and/or provide a custom code review style guide (such as PEP-8 for Python) by creating and adding files to a .gemini/ folder in the base of the repository. Detailed instructions can be found here.

Limitations & Feedback

Gemini Code Assist may make mistakes. Please leave feedback on any instances where its feedback is incorrect or counter productive. You can react with 👍 and 👎 on @gemini-code-assist comments. If you're interested in giving your feedback about your experience with Gemini Code Assist for Github and other Google products, sign up here.

You can also get AI-powered code generation, chat, as well as code reviews directly in the IDE at no cost with the Gemini Code Assist IDE Extension.

Footnotes

  1. Review the Privacy Notices, Generative AI Prohibited Use Policy, Terms of Service, and learn how to configure Gemini Code Assist in GitHub here. Gemini can make mistakes, so double check it and use code with caution.

Copy link

@gemini-code-assist gemini-code-assist bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Code Review

This pull request synchronizes the viostor and vioscsi drivers with downstream changes, primarily focusing on resetting the transfer length to zero for SRBs completed during bus reset processing. The changes are logical and correctly implement the intended fix across various error and reset paths. The introduction of VioScsiVQLock and VioScsiVQUnlock helper functions is a good refactoring that improves code maintainability. I've included a few suggestions to further improve code quality by addressing minor issues like an unused parameter and inconsistent variable naming.


VOID
VioScsiVQUnlock(IN PVOID DeviceExtension, IN ULONG MessageID, IN PSTOR_LOCK_HANDLE LockHandle, IN BOOLEAN isr)
{

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

medium

The MessageID parameter is unused in this function. Consider removing it from the function signature to improve code clarity. This change would also need to be applied to the function declaration in vioscsi/helper.h and all call sites.

VioScsiVQUnlock(IN PVOID DeviceExtension, IN PSTOR_LOCK_HANDLE LockHandle, IN BOOLEAN isr)

Comment on lines +1257 to +1258
ULONG QueuNum;
ULONG MsgId;

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

medium

There's a typo in QueuNum; it should be QueueNum. Also, for consistency and clarity, consider using the full name MessageId instead of the abbreviation MsgId.

            ULONG QueueNum;
            ULONG MessageId;

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

+1 for typo

-1 for renaming MsgId

ULONG QueueNumber = MESSAGE_TO_QUEUE(MessageId);
STOR_LOCK_HANDLE LockHandle = {0};
PADAPTER_EXTENSION adaptExt = (PADAPTER_EXTENSION)DeviceExtension;
ULONG index = MESSAGE_TO_QUEUE(MessageID);

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

medium

The variable index is a bit generic. Since it represents a queue index, renaming it to queueIndex would make the code more self-documenting and easier to understand. This change should be applied to all uses of index in this function.

    ULONG queueIndex = MESSAGE_TO_QUEUE(MessageID);

Comment on lines +1515 to +1516
ULONG QueueNum;
ULONG MsgId;

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

medium

For consistency and clarity, consider using QueueNum and MessageId instead of QueuNum and MsgId. This fixes a typo and avoids abbreviation.

    ULONG QueueNum;
    ULONG MessageId;

@kostyanf14
Copy link
Member

kostyanf14 commented Sep 16, 2025

Current CI status:

Driver Win2022 Win10_22H2 Win11_24H2 Win2025
viostor pass (known fail) Fail 3 test (will rerun) Fail 2 test (will rerun) pass
vioscsi pass (known fail) pass (known fail) pass pass

@YanVugenfirer
Copy link
Collaborator

@kostyanf14
Copy link
Member

rerun tests

1 similar comment
@kostyanf14
Copy link
Member

rerun tests

@kostyanf14
Copy link
Member

Build failed: https://www.dropbox.com/scl/fo/w7uuulyigqeake66rjl7r/AKTtm4EzQZinax52t_Kd71o?dl=0&e=1&lst=&preview=build_log.txt&rlkey=psl13q9685w7xjtvidkijylw2

Windows update prevents loading several DLLs. After installing all pending updates, the build starts

@kostyanf14
Copy link
Member

kostyanf14 commented Sep 25, 2025

Current CI status:

Driver Win2022 Win10_22H2 Win11_24H2 Win2025
viostor pass (known fail) fail (will rerun) fail (will rerun) pass
vioscsi pass (known fail) pass pass pass

@kostyanf14 kostyanf14 merged commit 6971b1c into virtio-win:master Sep 26, 2025
8 of 15 checks passed
@benyamin-codez
Copy link
Contributor

benyamin-codez commented Nov 6, 2025

@vrozenfe

Just saw this one one, Vadim.

Looks like a mix of new fixes and reverting some of my previous work.

Did you see the performance decrease after reintroducing the lock managers?
Perhaps you were primarily concerned with that nasty regression?
Some of the other reversions would only have negligible performance impact, e.g. instantiating vq.

Some of my in-draft work was dependent on migrating ProcessQueue() to ProcessBuffer() to streamline code paths destined for queue processing via a new ProcessQueue() routine, but I accept I might need to explain that better.

Best regards,
Ben

@YanVugenfirer
Copy link
Collaborator

@benyamin-codez I am sorry, but we are going in circles.

Regressions must be first reverted.
We cannot have known regression and go forward with even more changes.

@benyamin-codez
Copy link
Contributor

benyamin-codez commented Nov 6, 2025

@YanVugenfirer

Of course. Just trying to separate fixes from reverts and hoping Vadim had some insights to share.

If performance wasn't considered or tested, then it's likely worth revisiting this, as a solution to the regression was found elsewhere (PR #1293 amongst others).

As you mentioned, this PR resets the code base to a known good condition, and I see from Vadim's OP, that it was "very intensive functional and WHQL [tested]", all of which is great.

P.S. Perhaps my tone was off... I will edit my comment...

@YanVugenfirer
Copy link
Collaborator

This PR got a full test cycle from our QE.

@benyamin-codez
Copy link
Contributor

This PR got a full test cycle from our QE.

Awesome.
Can you share what that includes (or a link)?
Does it, for example, include comparative performance testing?
If so, are the results available?

In any case, I can always raise a PR to merge the changes back in (less the regression) and provide my own comparative performance test results as part of that. I'm just curious what you guys found is all.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

4 participants