-
Notifications
You must be signed in to change notification settings - Fork 9
munet: add Commander.async_spawn() #58
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: main
Are you sure you want to change the base?
munet: add Commander.async_spawn() #58
Conversation
Codecov ReportAttention: Patch coverage is
Additional details and impacted files@@ Coverage Diff @@
## main #58 +/- ##
==========================================
+ Coverage 58.84% 59.28% +0.44%
==========================================
Files 19 19
Lines 5729 5802 +73
==========================================
+ Hits 3371 3440 +69
- Misses 2358 2362 +4 ☔ View full report in Codecov by Sentry. 🚀 New features to boost your workflow:
|
As discussed elsewhere, we need to use a new Also, since async/await only fails with non-pty pexpect.popen_spawn (i.e., pipes) you should restrict the busy loop fix to only that case. For use_pty == True use the supported async from pexpect. |
spawn(), which conducts a send/expect process is blocking due to the abscence of async_=True to pexpect.expect() within the method's main send/expect loop. Some async methods call spawn(), however, which leads to such methods being blocked. This is particularly problematic in cases where multiple VMs requiring custom console expect/send prompts are required (e.g. routers). Since each QEMU VM sets up both console logging and completes an expect/send loop without yielding control to the coroutine of the other QEMU VMs, only one VM can ever have logging configured (and thus advance in an expect/send loop) at any given time. Not only is this inefficient given that all QEMU VMs are running and waiting for input, but this can be fatal if console logging is not set up on each VM in time. If some mandatory console output is missed, then there is no way to recover and the setup of the QEMU VM will time out. However, setting async_=True in pexpect.expect() leads to errors when mixed with PopenSpawn. To bypass this issue, we do not use pexpect.expect() with async_=True, but repeatedly call pexpect.expect() with a timeout set at 0.1 (Until a timeout that we track expires). The end result of this change is that both logging is set up on all QEMU nodes immediately and that independent progress can be made in each VM's expect/send loop simultaneously. Signed-off-by: Liam Brady <[email protected]>
e3b5d2c
to
a848814
Compare
I modified both console/shell_spawn and launch/monitor to use the new async variant (both methods are already async and so should use the variant), and now spawn() appears to have fallen out of the code coverage. It appears that commander.spawn() is not called anywhere else within the codebase. Does this affect the decision to continue to maintain commander.spawn()? Do we need it for future additions to the munet API? It doesn't feel correct to leave dead code within the project (or write new tests to maintain a section of code that isn't being meaningfully used.) |
@@ -727,6 +727,151 @@ def spawn( | |||
p.close() | |||
raise error from eoferr | |||
|
|||
async def async_spawn( |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
In order to create this new function you did a large copy and paste. That's a good indication we need to refactor the function so that more code is shared. :)
That is hard to do sometimes with async/await. I feel like there should be some way to share code better between asyncio and non-asyncio, but I haven't figured it out yet. This is why you see other places in the code where I run the async function inside an asyncio.run()
call for the non-asyncio variant. I'm not sure how safe this is though b/c sometimes people aren't using asyncio b/c it's broken for them in which case running it with asyncio.run()
will be broken too.
A good place to test this would be in an FRR topotest which uses spawn.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I don't quite like the asyncio.run()
solution for two reasons:
- The synchronous version of
spawn()
cannot be used from within an asynchronous method since asyncio does not support nested event loops to the best of my knowledge. I am not confident enough to claim that there will never be a use case wherespawn()
inside an async method is appropriate (especially sincespawn()
is located so deep within the API). As such, anasyncio.run()
solution does not feel safe. - Even if we followed the
asyncio.run()
solution, then we have to work around the async anduse_pty=False
hack (i.e. to avoid the exception withp.expect(..., async_=True)
and pipes). Specifically, we only want to use the hack when absolutely necessary. However, usingasyncio.run()
as the solution would imply that we end up using the hack during the synchronous variant ofspawn()
, which is not desired.
The only other real option then in refactoring is to try and split off as much synchronous code as possible and place it into helper methods. I don't find such a solution satisfying, since that would leave a large portion of the methods untouched, but don't see any other real alternatives.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
See the latest commit for a minor refactor that collects code that is synchronous in either version (sync/async) into a single method to reduce duplicate code.
I am struggling to determine if there are any other changes that I could make to reduce code duplication. The presence of the await statement within the try/except block + while block make it difficult to reduce the logic any further imo.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Need to test both async and non-async variants, and the vty and non-vty case.
tests: cover error cases, both sync/async This commit introduces a minor refactor that collects sync code used by both sync/async spawn() into a single callable method. The majority of sync/async spawn() is untouched due to the fundamental difference that asyncio brings with it to the send/expect loop. New tests are introduced such that both sync/async spawn() methods are tested separately. Movever, the error cases are also now tested. Signed-off-by: Liam Brady <[email protected]>
sends[index - 1], | ||
) | ||
if sends[index - 1]: | ||
p.send(sends[index - 1]) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The majority of synchronous spawn() is tested, but not all of the duplicate code. I was unable to determine a solution that allowed the send/expect logic to be tested while ensuring that we also test functionality across all the different shells.
spawn(), which conducts a send/expect process is
blocking due to the abscence of async_=True to
pexpect.expect() within the method's main
send/expect loop. Some async methods call
spawn(), however, which leads to such methods
being blocked.
This is particularly problematic in cases where
multiple VMs requiring custom console expect/send
prompts are required (e.g. routers). Since each
QEMU VM sets up both console logging and completes
an expect/send loop without yielding control to
the coroutine of the other QEMU VMs, only one VM
can ever have logging configured (and thus advance in
an expect/send loop) at any given time. Not
only is this inefficient given that all QEMU VMs
are running and waiting for input, but this can
be fatal if console logging is not set up on each
VM in time. If some mandatory console output is
missed, then there is no way to recover and the
setup of the QEMU VM will time out.
However, setting async_=True in pexpect.expect()
leads to errors when mixed with PopenSpawn.
To bypass this issue, we do not use pexpect.expect()
with async_=True, but repeatedly call
pexpect.expect() with a timeout set at 0.1 (Until
a timeout that we track expires).
The end result of this change is that both logging is set
up on all QEMU nodes immediately and that
independent progress can be made in each VM's
expect/send loop simultaneously.