Contributor

@rluna319 rluna319 commented Jun 6, 2025

Feature: Add multipass wait-ready command

Description:
This PR introduces a new CLI command, multipass wait-ready. This command allows users and scripts to pause execution until the Multipass daemon is fully initialized and ready to accept requests. This is particularly beneficial for automation workflows that need to ensure the daemon is operational before proceeding with other Multipass operations.

The wait-ready command works by:

  1. Initially polling the daemon's gRPC socket until it becomes available.
  2. Once the socket is available, it sends a WaitReady request to the daemon.
  3. The daemon, upon receiving this request, verifies its readiness by attempting to update image manifests (which implicitly checks connectivity to image servers).
  4. The command supports a --timeout option (in seconds). If the daemon isn't ready within the specified timeout, the command exits with an error. If no timeout is given, it waits indefinitely.
  5. An animated spinner and context-relevant message provide user feedback during the waiting period.
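
The polling and timeout behaviour described in the steps above can be sketched roughly as follows. This is a minimal, self-contained illustration and not the code in wait_ready.cpp: `wait_until_ready`, `WaitResult`, and the `probe` callback are hypothetical names, with `probe` standing in for "connect to the socket and send a WaitReady request".

```cpp
#include <cassert>
#include <chrono>
#include <functional>
#include <optional>
#include <thread>

// Hypothetical outcome type for this sketch; the real command returns a ReturnCode.
enum class WaitResult
{
    Ready,
    TimedOut
};

// Calls `probe` repeatedly until it reports readiness or the optional timeout elapses.
// An empty `timeout` means "wait indefinitely".
WaitResult wait_until_ready(const std::function<bool()>& probe,
                            std::optional<std::chrono::milliseconds> timeout,
                            std::chrono::milliseconds poll_interval = std::chrono::milliseconds{100})
{
    assert(poll_interval > std::chrono::milliseconds::zero());

    using clock = std::chrono::steady_clock;
    const auto start = clock::now();

    while (true)
    {
        if (probe())
            return WaitResult::Ready;

        // Only give up when a timeout was requested and has been exceeded.
        if (timeout && clock::now() - start >= *timeout)
            return WaitResult::TimedOut;

        std::this_thread::sleep_for(poll_interval);
    }
}
```

A caller would map `WaitResult::TimedOut` to the error exit described in step 4 and `WaitResult::Ready` to a successful exit.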

Motivation:

  • Improves the reliability of automated scripts and CI/CD pipelines that interact with Multipass by providing a deterministic way to wait for daemon initialization.
  • Addresses scenarios where subsequent Multipass commands might fail if the daemon is still starting up.

Related Issues:
Fixes #452
Fixes #3723

Key Implementation Changes:

  • Client-side (wait_ready.cpp): New command logic, including retry for socket connection, timeout handling (reusing mp::cmd::parse_timeout), and spinner integration.
  • Protocol (multipass.proto): Added WaitReadyRequest / WaitReadyReply messages and WaitReady RPC method.
  • Daemon-side (daemon_rpc.cpp, daemon.cpp):
    • New gRPC service handler in daemon_rpc.cpp.
    • Core readiness check in daemon.cpp leverages the existing wait_update_manifests_all_and_optionally_applied_force() function.

How to Test:

  1. Build this branch and $ cd build.
  2. Ensure the Multipass daemon is stopped.
  3. To verify --timeout, simply run the wait-ready command with the --timeout option:
$ ./bin/multipass wait-ready --timeout 5

Be sure not to start up the daemon first or the command might exit before the timeout.

  • Expected: Info message "Waiting for Multipass daemon to be ready...", spinner is shown, command times out after ~5 seconds with an appropriate error message "Timed out waiting for Multipass daemon to be ready."
  4. (Optional) To replicate the issue that happens when the client tries to use the daemon before it's ready (without wait-ready), start up the daemon and immediately use multipass launch:
$ sudo -v
$ (sudo ./bin/multipassd &) && ./bin/multipass launch
  5. Stop the Multipass daemon:
$ sudo pkill multipassd
  6. To verify that the wait-ready command solves the issue, place it between daemon startup and multipass launch:
$ sudo -v
$ (sudo ./bin/multipassd &) && ./bin/multipass wait-ready && ./bin/multipass launch

Optionally, add --timeout if desired.

  • Expected: Info message "Waiting for Multipass daemon to be ready...", spinner is shown, command exits successfully (ReturnCode::Ok) once the daemon is ready (i.e., can connect to image servers) and the client begins retrieving the default release image for the multipass launch process. You may Ctrl+C here to terminate the launch process.

Points for Reviewers:

  • Confirmation that using wait_update_manifests_all_and_optionally_applied_force(true) is an appropriate and robust proxy for overall daemon readiness.
  • Feedback on the client-side error messages and user experience (spinner, timeout message) is appreciated. The current spinner implementation addresses @ricab's suggestion in Command suggestion: waitready #452.
  • Unit Tests: Per @ricab's comments in issue #452 (Command suggestion: waitready), unit tests have not yet been implemented for this initial PR, to facilitate earlier review. I am prepared to add them based on feedback.

@ricab
Collaborator

ricab commented Jun 6, 2025

Hey @rluna319, thank you for this! We'll review when we get a chance.

@sharder996 sharder996 requested review from xmkg and ricab June 27, 2025 15:16
@ricab ricab requested a review from levkropp June 27, 2025 15:46
@ricab
Collaborator

ricab commented Jun 27, 2025

Hey @levkropp, would you mind doing a secondary review of this one? Then I can do a very light "tertiary" review.

levkropp previously approved these changes Jun 27, 2025
Contributor

@levkropp levkropp left a comment

Yep, tested and working with the examples provided. Great additions!

Needs a pass through clang-format-16 to satisfy the linter, I believe.

Member

@xmkg xmkg left a comment

Great work, @rluna319! I appreciate the work you've put into this. I was on the verge of implementing a similar feature for the upcoming CLI tests I'm working on, and you can't imagine how happy I was to see this PR :)

Overall, the code looks good! I built and tested your changes locally, and everything works as expected. I have a small request in line, and also, having some unit tests would be nice.

Collaborator

@ricab ricab left a comment

Excellent work @rluna319, especially for a first contribution! I have a number of requests, but they are for polishing and small things. The bulk of the feature looks solid. Just a few tweaks and we should be able to merge.

Many thanks for your involvement and feel free to keep more of these coming 😃

@ricab
Collaborator

ricab commented Jul 18, 2025

@rluna319, could you also rebase on the latest main, to see if we can get some CI to run on this? Appreciate it.

@rluna319
Contributor Author

@ricab I had not included your requested changes on my latest commit. I had not seen them before I made the push, my fault. I did rebase on the latest main however.

I will work on your requested changes ASAP.

Would you like me to amend my latest commit with your requested changes, or put them in a separate commit?

@ricab
Collaborator

ricab commented Jul 18, 2025

Hey @rluna319 no worries. We generally prefer additional commits, to make it easy to identify new changes. Thanks!

@rluna319
Contributor Author

@ricab I've added all of your requested changes in the most recent commit. I also rebased on the latest main.

@rluna319 rluna319 requested review from ricab and xmkg July 23, 2025 16:40
Collaborator

@ricab ricab left a comment

Getting real close @rluna319! I am going to let CI go through and I think the Lint workflow is going to request some formatting changes. Other than that, I have mostly minor requests. There's only one for real behavior, but I think we can discuss and split off as a separate task if we need it.

@ricab
Collaborator

ricab commented Jul 31, 2025

Hey @rluna319, it looks like you haven't signed our CLA yet. Would you mind doing so, please? https://ubuntu.com/legal/contributors

Also, there are a few linting problems. Editors often have options to strip trailing whitespace and keep a single newline at the end. You can check whitespace sanity on a diff with git diff --check .... For example: git diff --check main...wait-ready.

git clang-format might take care of whitespaces too, if you set it up. That might be a good idea anyway because it's going to run once the whitespace check succeeds.

@rluna319
Contributor Author

@ricab I signed the CLA, went through linter errors (and other CI build/test errors) and fixed them to the best of my knowledge, ran the code through git clang-format and git diff --check main. I've also rebased off the latest main.

@rluna319
Contributor Author

I removed some commented out includes that shouldn't have been in that commit.

Collaborator

@ricab ricab left a comment

Hey @rluna319, I am good with this in principle, but there seems to be a stray submodule update now and another issue inline.

@xmkg Feel free to follow up on those remaining points and get this in then.

@rluna319
Contributor Author

rluna319 commented Aug 1, 2025

I see that I still missed some linter errors. I thought I ran everything through clang-format correctly. I'll address those.

@rluna319
Contributor Author

rluna319 commented Aug 7, 2025

@ricab I've addressed the issues. It says some of the workflows are awaiting approval to be run.

@xmkg
Member

xmkg commented Aug 7, 2025

> @ricab I've addressed the issues. It says some of the workflows are awaiting approval to be run.

Approved the workflow.

@rluna319
Contributor Author

rluna319 commented Aug 7, 2025

Thank you @xmkg.

I think that the remaining workflows that fail are due to some internal CI issues. Something about a missing token. Is this something I need to fix or can fix on my end?

@xmkg
Member

xmkg commented Aug 8, 2025

> Thank you @xmkg.
>
> I think that the remaining workflows that fail are due to some internal CI issues. Something about a missing token. Is this something I need to fix or can fix on my end?

Yeah, unfortunately, our CI is not happy with PRs coming from external forks. We're working on making this process smoother, but the necessary machinery is not yet in place. Would you mind if I create a PR for your PR, just for the sake of running the pipeline? We can track the progress and make changes here.

@rluna319
Contributor Author

rluna319 commented Aug 8, 2025

Oh ok. Sure thing.

@xmkg
Member

xmkg commented Aug 14, 2025

Opened #4304.

@xmkg
Member

xmkg commented Aug 14, 2025

@rluna319, we've landed a bunch of CI-related fixes to main. Could you rebase this PR?

This commit introduces the `multipass wait-ready` command. This new
command allows clients to block until the Multipass daemon is fully
initialized and ready to accept requests.

The command first polls the daemon until its gRPC socket is available.
Once the daemon is reachable and receives the `WaitReady` request, it
verifies its readiness by attempting to update image manifests, which
implicitly checks connectivity to image servers.

Users can specify a `--timeout` (in seconds), after which the command
will terminate with an error if the daemon is not yet ready. If no
timeout is specified, the command will wait indefinitely until the
daemon is ready or some other error occurs.

Implementation Details:
- Client-side (`wait_ready.cpp`, `wait_ready.h`):
  - Created the `WaitReady` command logic and its header file.
  - Implemented a retry loop to poll for daemon socket availability.
  - Utilized the existing `mp::cmd::parse_timeout()` function from
    `common_cli.h` to handle the `--timeout` option.
  - Integrated `AnimatedSpinner` for visual feedback during the
    waiting period.
  - Added error handling for various failure conditions (e.g., timeout,
    daemon connection errors).
- Protocol (`multipass.proto`):
  - Defined the `WaitReadyRequest` and `WaitReadyReply` messages and
    the `WaitReady` RPC method within the `Multipass` service.
- Daemon-side:
  - Added the `wait_ready` gRPC service handler function in
    `daemon_rpc.cpp` to process incoming requests.
  - Implemented the core `wait_ready` logic in `daemon.cpp`, which
    leverages the existing
    `wait_update_manifests_all_and_optionally_applied_force()`
    function to check image server connectivity.
- Testing (`tests/mock_client_rpc.h`):
  - Added mock methods for `wait_readyRaw`, `Asyncwait_readyRaw`,
    and `PrepareAsyncwait_readyRaw` to `MockRpcStub` to support
    testing of the new RPC endpoint.

This commit introduces unit tests for the `wait-ready` command, along with improvements
to its client-side and daemon-side command logic.

The daemon now avoids forcing manifest downloads during `wait-ready`.

The daemon's `wait-ready` gRPC handler now uses the correct gRPC status code when
there is an issue verifying connection to the image servers.

Unit tests for `wait-ready` have been added.

Implementation Details:
- Daemon (`daemon.cpp::wait_ready()`):
	- Set `force_manifest_network_download` to `false` to prevent forcing
	  unnecessary manifest downloads while verifying the image server connection.
	- Change `grpc::StatusCode` from `UNAVAILABLE` to `FAILED_PRECONDITION`
	  to stop the client retry loop from continuing when an exception is
	  thrown during the `update_manifests_all_task`.
- Client (`wait_ready.cpp`):
	- Remove redundant cerr output.
	- Add logic to stop `timer` and `spinner` if the `--timeout` option is
	  enabled.
- Unit Testing:
	- Added `test_daemon_wait_ready.cpp` test suite to verify daemon-side
	  `wait-ready` logic.
	- Added `wait-ready` cli tests to `test_cli_client.cpp` to verify client-
	  side command parsing and failure handling.
	- Added `wait-ready` to the existing timeout and client logging test suites.

Refactors how the `--timeout` option is added to commands to improve
code reuse and provide more context-specific help text.

A private helper function, `add_timeout_option`, is introduced in an
anonymous namespace to handle the core logic of creating the option.
This is now used by two public functions:
- `add_timeout`: Provides a generic description, now used by `wait-ready`.
- `add_instance_timeout`(NEW): Provides a more detailed description for
  commands that operate on instances.

The `wait-ready` command implementation has also been improved:
- The `spinner` and `timer` are now local variables within the `run()`
  method to minimize their scope.
- Associated headers (`chrono`, `thread`, `timer.h`, `animated_spinner.h`)
  have been moved from the header into `wait_ready.cpp`.
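
The shape of this refactor can be sketched as below. This is only an illustration under stated assumptions: `Option` and `Parser` are hypothetical stand-ins for the Qt command-line types the real CLI uses, and the description strings are placeholder wording, not the actual help text. Only the three function names come from the commit message.

```cpp
#include <cassert>
#include <string>
#include <vector>

// Hypothetical stand-ins for the parser types used by the real CLI.
struct Option
{
    std::string name;
    std::string description;
    std::string value_name;
};

struct Parser
{
    std::vector<Option> options;
};

namespace
{
// Private helper: builds the option once; callers supply only the description.
void add_timeout_option(Parser& parser, const std::string& description)
{
    assert(!description.empty());
    parser.options.push_back(Option{"timeout", description, "timeout"});
}
} // namespace

// Generic description (placeholder wording), as used by wait-ready.
void add_timeout(Parser& parser)
{
    add_timeout_option(parser, "Maximum time, in seconds, to wait for the command to complete.");
}

// More detailed description (placeholder wording) for commands that operate on instances.
void add_instance_timeout(Parser& parser)
{
    add_timeout_option(parser,
                       "Maximum time, in seconds, to wait for the command to complete "
                       "when operating on instances.");
}
```

Both public functions register the same `timeout` option, so parsing stays identical across commands while the help text can differ.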

The daemon will now catch any `DownloadException` and set the gRPC status to
NOT_FOUND with a specific error message to let the client know that the daemon
was not able to reach the image servers. The client will retry (send another
dispatch) on this error status until the `--timeout` specified is reached or
the daemon successfully verifies connection with the image servers.

Implementation Details:
- Daemon (`daemon.cpp::wait_ready()`)
	- When a `DownloadException` is caught, the status code is set to
	  `NOT_FOUND`, instead of `FAILED_PRECONDITION`, with a specific
	  error message.

- Client (`wait_ready.cpp`)
	- The `on_failure` lambda now checks for the specific error message
	  alongside the `NOT_FOUND` status code.
		- This allows for differentiation between `NOT_FOUND` being
		  returned by the dispatcher when the daemon's socket is
		  unavailable, and `NOT_FOUND` being returned by the daemon
		  when it catches a `DownloadException`.
	- Removed the unused `this` capture from the `on_success` lambda to fix
	  a macOS build error.

- Testing (`test_daemon_wait_ready.cpp`)
	- Refactored the test case where a `DownloadException` is thrown by
	  `update_manifests()` to allow additional calls due to the new
	  client retry logic.
	- Added EXPECT_CALL for mock_settings.get(winterm_key) to address an
	  uninteresting mock call error in Windows platform tests.

- Testing (`test_cli_client.cpp`)
	- Added a test case to check proper handling of the new grpc status
	  set by the daemon when it catches a `DownloadException`.

Other:
- Fixed linter errors
- Addressed improper log levels being used
- Fixed formatting issues
- Removed unnecessary includes
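
As a rough illustration of the client-side distinction described in this commit, the sketch below classifies a failed status into "socket unavailable", "image servers unreachable", or "fatal". Everything here is an assumption made for the example: `StatusCode` and `Status` are stand-ins for the gRPC types, `FailureKind`, `classify`, and `should_retry` are hypothetical names, and `image_servers_msg` is placeholder text because the daemon's actual message is not shown in this thread. Note that a later commit in this PR removes the error-string check from `on_failure`.

```cpp
#include <cassert>
#include <string>

// Hypothetical stand-in for grpc::StatusCode, reduced to the codes discussed here.
enum class StatusCode
{
    OK,
    NOT_FOUND,
    FAILED_PRECONDITION
};

// Hypothetical stand-in for grpc::Status.
struct Status
{
    StatusCode code;
    std::string message;
};

enum class FailureKind
{
    SocketUnavailable,       // NOT_FOUND from the dispatcher
    ImageServersUnreachable, // NOT_FOUND from the daemon after a DownloadException
    Fatal                    // anything else: stop retrying
};

// Placeholder text; the real message is an assumption of this sketch.
constexpr const char* image_servers_msg = "cannot reach image servers";

// Classifies a non-OK status the way the commit message describes.
FailureKind classify(const Status& status)
{
    assert(status.code != StatusCode::OK);

    if (status.code != StatusCode::NOT_FOUND)
        return FailureKind::Fatal;

    return status.message == image_servers_msg ? FailureKind::ImageServersUnreachable
                                               : FailureKind::SocketUnavailable;
}

// Both NOT_FOUND variants are retried until the timeout; other errors end the loop.
bool should_retry(FailureKind kind)
{
    return kind != FailureKind::Fatal;
}
```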

format and fix exception

format with clang-format

remove commented out includes
Details:
- Remove checking of grpc status error strings in the `on_failure` lambda in `wait_ready.cpp`.
- Resolve missed linter errors.
- Resolve 3rd party jsoncpp submodule pointing to wrong version.
@rluna319
Contributor Author

@xmkg I have rebased the PR. Workflows awaiting approval.

@xmkg
Member

xmkg commented Aug 15, 2025

> @xmkg I have rebased the PR. Workflows awaiting approval.

Great, thanks. Synced #4304.

@rluna319
Contributor Author

rluna319 commented Aug 15, 2025

@xmkg Great thank you. I see all the CI tests have passed in #4304. Should I mind the coverage report? I've looked at it already and I can cover the line in wait_ready.cpp that it says isn't covered.

However, I'm having trouble covering the other one in daemon.cpp where we catch any other std::exception. Throwing a runtime error in update_manifests just causes the test to hang. I'm guessing it's the way that exception is or isn't handled in the asynchronous task. I'm not sure what other exception would make it to that catch block. Any other exceptions that I've traced within the codepath are already handled locally. Perhaps I'm just missing something I'm not aware of.

@xmkg
Member

xmkg commented Aug 18, 2025

> @xmkg Great thank you. I see all the CI tests have passed in #4304. Should I mind the coverage report? I've looked at it already and I can cover the line in wait_ready.cpp that it says isn't covered.

That would be nice.

> However, I'm having trouble covering the other one in daemon.cpp where we catch any other std::exception. Throwing a runtime error in update_manifests just causes the test to hang. I'm guessing it's the way that exception is or isn't handled in the asynchronous task.

I'd say don't worry about it. There's a way to trigger that exception handler, but it's a bit cheeky: we could use a logger that throws an exception on .log(). That would make the coverage tool happy, but I'm okay with it as-is, too.
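
For illustration, the throwing-logger idea could look roughly like the sketch below. It is a self-contained toy, not Multipass's logging or test infrastructure: `Logger`, `ThrowingLogger`, and `handle_request` are hypothetical names, and `handle_request` merely stands in for a handler that has a catch-all `std::exception` branch.

```cpp
#include <cassert>
#include <stdexcept>
#include <string>

// Hypothetical minimal logger interface; the real Multipass Logger differs.
struct Logger
{
    virtual ~Logger() = default;
    virtual void log(const std::string& message) = 0;
};

// Test double whose log() always throws, to force execution into a catch block.
struct ThrowingLogger : Logger
{
    void log(const std::string&) override
    {
        throw std::runtime_error{"logger failure"};
    }
};

// Stand-in for a handler that logs inside a try block and has a catch-all branch.
std::string handle_request(Logger& logger)
{
    try
    {
        logger.log("verifying image server connectivity");
        return "ok";
    }
    catch (const std::exception& e)
    {
        // This is the branch a coverage tool would otherwise report as unreached.
        return std::string{"error: "} + e.what();
    }
}
```

Injecting `ThrowingLogger` in a test drives execution into the catch block without needing any of the real code paths to fail.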

- added a new test in `test_cli_client.cpp` which covers the conditional
`if (timer)` in the `on_failure` lambda.
@rluna319
Contributor Author

@xmkg I added the coverage for wait_ready.cpp.

I tried to implement your suggestion, throwing an exception on log(), but wasn't able to do it. I kept getting the error that the Logger has no member named "gmock_log". I thought maybe I would have to add a mock method for log() in mock_logger.cpp, but I felt that would be overkill if coverage for that line wasn't crucial. I tried doing a local mock method for the logger as well, but I couldn't get that to work either. I may just lack the experience to implement this properly.

Member

@xmkg xmkg left a comment

Looks good to me, thanks @rluna319!

Collaborator

@ricab ricab left a comment

I just did a final pass and it's looking great. Thanks again!

@github-merge-queue github-merge-queue bot closed this pull request by merging all changes into canonical:main in c50c359 Aug 19, 2025
@rluna319
Contributor Author

Awesome! Thank you for allowing me to contribute @xmkg @ricab. A great experience, I learned a lot. Looking forward to the next!

@ricab
Collaborator

ricab commented Aug 21, 2025

Our pleasure @rluna319! This was a great contribution, very well pondered. Feel free to send more in!
