feat: Add `StatusMessageWatcher` #407

Pijukatel · 2025-05-20T09:53:58Z

Description

Add the option to log the status and status messages of another Actor run. If that Actor run is itself redirecting logs from another Actor run, those logs can propagate all the way to the top through the StreamedLog (deep redirection). If the Actor run whose status messages are being redirected does not redirect logs from its children, the log and status redirection remains shallow, covering only that Actor run.

Example usage:

Through `ActorClient(Async).call`

It is turned on by default, so simply calling the Actor like this already redirects the logs to a pre-configured logger:

from apify_client import ApifyClientAsync

actor_client = ApifyClientAsync(token='token').actor(actor_id='actor_id')
await actor_client.call()

To turn it off, pass `logger=None`:

...
await actor_client.call(logger=None)

Or pass a custom logger, if desired:

...
await actor_client.call(logger=some_other_logger)

Through `RunClient(Async).get_status_message_watcher`

With context:

run_client = ApifyClientAsync(token='token').run(run_id='run_id')
status_message_watcher = await run_client.get_status_message_watcher()

async with status_message_watcher:
    # Do stuff while the status from the other actor is being redirected to the logs. Leaving the context stops the redirection.
    ...

Manually:

...
status_message_watcher.start()
# Do stuff while the status from the other actor is being redirected to the logs.
await status_message_watcher.stop()
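
The two forms above are meant to be interchangeable. As a rough, library-independent sketch of why, the snippet below shows a watcher of this general shape: it polls a status source, logs only changes, and exposes both explicit `start`/`stop` and an async context manager. `ToyWatcher`, `ListHandler`, `fetch_status` and the sample status strings are hypothetical stand-ins for illustration, not the `apify_client` implementation.

```python
import asyncio
import contextlib
import logging
from typing import Callable, List, Optional


class ListHandler(logging.Handler):
    """Collects log messages so the demo can return them."""

    def __init__(self) -> None:
        super().__init__()
        self.messages: List[str] = []

    def emit(self, record: logging.LogRecord) -> None:
        self.messages.append(record.getMessage())


class ToyWatcher:
    """Hypothetical stand-in: polls `fetch_status` and logs each change once."""

    def __init__(self, fetch_status: Callable[[], str], to_logger: logging.Logger,
                 check_period_s: float = 0.01) -> None:
        self._fetch_status = fetch_status
        self._to_logger = to_logger
        self._check_period_s = check_period_s
        self._last: Optional[str] = None
        self._task: Optional[asyncio.Task] = None

    async def _poll(self) -> None:
        while True:
            message = self._fetch_status()
            if message != self._last:  # Log only when the message changed.
                self._last = message
                self._to_logger.info(message)
            await asyncio.sleep(self._check_period_s)

    def start(self) -> asyncio.Task:
        if self._task:
            raise RuntimeError('Logging task already active')
        self._task = asyncio.create_task(self._poll())
        return self._task

    async def stop(self) -> None:
        if not self._task:
            raise RuntimeError('Logging task is not active')
        self._task.cancel()
        with contextlib.suppress(asyncio.CancelledError):
            await self._task  # Wait until the cancellation has completed.
        self._task = None

    async def __aenter__(self) -> 'ToyWatcher':
        self.start()
        return self

    async def __aexit__(self, *exc_info: object) -> None:
        await self.stop()


async def demo() -> List[str]:
    handler = ListHandler()
    logger = logging.getLogger('toy.watcher')
    logger.setLevel(logging.INFO)
    logger.propagate = False
    logger.addHandler(handler)

    # Simulated status messages; the last one repeats once the list is exhausted.
    statuses = ['Starting', 'Starting', 'Crawling', 'Done']
    calls = 0

    def fetch_status() -> str:
        nonlocal calls
        message = statuses[min(calls, len(statuses) - 1)]
        calls += 1
        return message

    async with ToyWatcher(fetch_status, logger):
        await asyncio.sleep(0.2)  # Stand-in for the caller's own work.
    logger.removeHandler(handler)
    return handler.messages
```

Here the context manager simply delegates to `start` and `stop`, so the manual form is the same sequence wrapped in the caller's own `try`/`finally`.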

Issues

Closes: Redirect status messages from other actors #404

Pijukatel · 2025-05-20T11:01:28Z

Example log of recursive Actor run with highlighted redirected status messages

Example Actor:
https://console.apify.com/actors/3QIkuY6bClKBxC9wK

Copilot

Pull Request Overview

Adds the new StatusMessageRedirector utility to stream status and status messages from a target Actor run (both shallow and deep redirection), integrates it into the client APIs, and updates tests to cover the new behavior.

Introduce StatusMessageRedirector (sync & async) in log.py
Add get_status_message_redirector to RunClient (sync & async) in run.py
Hook status redirection into ActorClient.call (sync & async) in actor.py
Update and extend test_logging.py to exercise status message streaming

Reviewed Changes

Copilot reviewed 4 out of 4 changed files in this pull request and generated no comments.

| File | Description |
| --- | --- |
| tests/unit/test_logging.py | Add fixtures and new tests for `StatusMessageRedirector` |
| src/apify_client/clients/resource_clients/run.py | New `get_status_message_redirector` methods for sync and async runs |
| src/apify_client/clients/resource_clients/log.py | Core `StatusMessageRedirector`, `StatusMessageRedirectorSync/Async` |
| src/apify_client/clients/resource_clients/actor.py | Wire up status redirector into `ActorClient.call` sync & async variants |

Comments suppressed due to low confidence (5)

tests/unit/test_logging.py:386

[nitpick] We should also add a test for the manual start()/stop() usage of StatusMessageRedirector (outside of a context manager) to ensure that flow is covered.

@respx.mock

src/apify_client/clients/resource_clients/run.py:349

create_redirect_logger is used here but not imported; add from apify_client._logging import create_redirect_logger at the top.

to_logger = create_redirect_logger(f'apify.{name}')

src/apify_client/clients/resource_clients/log.py:455

asyncio.create_task is called but asyncio is not imported; add import asyncio.

self._logging_task = asyncio.create_task(self._log_changed_status_message())

tests/unit/test_logging.py:168

This fixture mutates a class‐level variable without resetting it, which can leak state across tests; consider using yield and restoring the original value after the test.

StatusMessageRedirector._force_propagate = True
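
One way to act on this suggestion is a yield-style fixture that restores the original value on teardown. The following is a minimal sketch, using a hypothetical `Redirector` class in place of `StatusMessageRedirector`. In the test suite the generator would be decorated with `@pytest.fixture`; the decorator is omitted here so the snippet runs with the standard library only.

```python
from typing import Iterator


class Redirector:
    """Hypothetical stand-in for the class whose flag the fixture toggles."""

    _force_propagate = False


def propagate_logs() -> Iterator[None]:
    """Generator in pytest's yield-fixture shape: set up, yield, restore."""
    original = Redirector._force_propagate
    Redirector._force_propagate = True
    try:
        yield
    finally:
        # Restore the class-level flag so later tests see the original value.
        Redirector._force_propagate = original
```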

src/apify_client/clients/resource_clients/run.py:337

Docstring says it returns StatusMessageRedirector, but in sync/async methods it actually returns StatusMessageRedirectorSync/StatusMessageRedirectorAsync; clarify the concrete return type.

Returns:
            `StatusMessageRedirector` instance for redirected logs.

janbuchar · 2025-05-26T11:53:39Z

Just a question - are we planning to also show the status message of the callee as the status message of the caller? Or display something like "Waiting for ... to finish"?

janbuchar · 2025-05-26T12:27:19Z

src/apify_client/clients/resource_clients/run.py

+        actor_name = actor_data.get('name', '') if run_data else ''
+
+        if not to_logger:
+            name = '-'.join(part for part in (actor_name, run_id) if part)


Are you sure it's a good idea to include the run id by default? The user won't know what it is and it makes the logline kinda messy, especially with nested invocation.

The Actor name is for readability; the run id is for uniqueness and for the ability to trace the log to its origin. Keeping only the Actor name would make it harder to find the origin of the log.
Depending on the use case this might seem more or less relevant. Maybe it could be something like f"{actor_name}-runId:{run_id}" to make it more explicit what the id refers to.

And in which case do you actually need to track the log to its origin? Isn't the most frequent case calling a single Actor and showing the status so that there's some activity in the log?

Also, a - may not be an ideal separator, those appear in Actor names quite often.

Any scenario where one Actor calls two different runs of another Actor. There it would be crucial, but it is not a common use case.

In the normal "one-to-one" scenario it is useful for easily navigating to the relevant run of the called Actor, and I do not think that scenario is uncommon. Is there any other convenient way to navigate to the called Actor run that would make this information redundant in the logs?

As for the separator, I am fine with anything. In the logger it could even be whitespace, I guess. Do you have any preference there?

In case 1, it would be nice if the user could opt into logging the run id.

In case 2, I'd prefer to just log the run id when starting the Actor. But don't take it as an authoritative answer, I'm just concerned about printing the run id a zillion times on each line.

As a separator, a space sounds like a good idea. Maybe with some arrow or >> for nested Actor runs?

The reason I do not want to add any arguments for customizing the default logger is that they would become part of functions that already have many arguments, for example actor.call. So my idea was: either use the default without any customization options, or create your own logger, and nothing in between. This gives you all the freedom and does not add new arguments to functions that already have too many of them.

What about the following idea to reduce the run id spam in the logs? Each redirected logger would first print a full identification message, like: "Redirect logger for actor XYZ and run id ABCDEF... will be shown as {some alias}"

Then we can talk about this alias, which could be for example "apify.{actor_name} {redirect_logger_counter}" or "apify.redirect {n}" or something else ...?

So it would be shorter, and you would still be able to uniquely identify the origin of the redirected log by using the first log message.

BUT!!! With such an approach you will not be able to identify deeply nested redirected loggers if you choose not to redirect from the start of the Actor run (in the case of long-running standby Actors), as you will miss the first identification message of the redirected logger.

With that in mind, I still feel that full info in each message is the safest for the default logger, even though it is quite verbose. Once this is also integrated into the SDK, I will add documentation that shows how easy it is to create a logger that allows the customization.
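
To illustrate the "create your own logger" route discussed in this thread, here is a minimal sketch using only the standard `logging` module. The `build_redirect_logger` helper, the `redirect.` name prefix, the alias and the format string are all illustrative assumptions, not part of `apify_client`. The resulting logger would be passed as `logger=...` to `call`, as in the usage example in the description.

```python
import io
import logging
from typing import TextIO


def build_redirect_logger(alias: str, stream: TextIO) -> logging.Logger:
    """Hypothetical helper: a logger with a short, user-chosen alias as its name."""
    logger = logging.getLogger(f'redirect.{alias}')
    logger.setLevel(logging.INFO)
    logger.propagate = False  # Our own logger, so configuring it is safe.
    if not logger.handlers:  # Avoid stacking duplicate handlers on repeated calls.
        handler = logging.StreamHandler(stream)
        handler.setFormatter(logging.Formatter('[%(name)s] %(message)s'))
        logger.addHandler(handler)
    return logger


buffer = io.StringIO()
custom_logger = build_redirect_logger('my-actor', buffer)
custom_logger.info('Crawled 10 pages')
# With a real client, the logger would then be handed over like this:
# await actor_client.call(logger=custom_logger)
```

Because the name and format live entirely in user code, the run id can be included, shortened, or left out without adding arguments to the client methods.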

janbuchar · 2025-05-26T12:40:55Z

src/apify_client/clients/resource_clients/log.py

+            check_period: The period with which the status message will be polled.
+        """
+        self._to_logger = to_logger
+        self._to_logger.propagate = self._force_propagate


It's not nice to silently reconfigure a logger passed in by an unsuspecting client 🙂

Good point. It should have been reconfigured only if the internal _force_propagate is changed to True, which is only the test use-case for being able to use caplog.
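
A minimal sketch of that fix, with a hypothetical `Watcher` class standing in for the real one: the caller's logger is left untouched unless the internal, test-only flag is set.

```python
import logging


class Watcher:
    """Hypothetical stand-in showing only the constructor logic in question."""

    _force_propagate = False  # Test-only switch, for example to let caplog see records.

    def __init__(self, to_logger: logging.Logger) -> None:
        self._to_logger = to_logger
        if self._force_propagate:
            # Reconfigure the caller's logger only when explicitly forced.
            self._to_logger.propagate = True
```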

src/apify_client/clients/resource_clients/log.py

Pijukatel · 2025-05-28T10:20:01Z

Just a question - are we planning to also show the status message of the callee as the status message of the caller? Or display something like "Waiting for ... to finish"?

Currently this change is only about logging callee stuff. The status message of the caller is already available in the actor run. So if someone wants to duplicate it into logs then I think they can do it by manually both setting the status message and logging the status message, but I would not do it by default for now.

vdusek · 2025-06-12T11:50:58Z

src/apify_client/clients/resource_clients/log.py

+    def start(self) -> Task:
+        """Start the logging task. The caller has to handle any cleanup by manually calling the `stop` method."""
+        if self._logging_task:
+            raise RuntimeError('Logging task already active')
+        self._logging_task = asyncio.create_task(self._log_changed_status_message())
+        return self._logging_task
+
+    def stop(self) -> None:
+        """Stop the logging task."""
+        if not self._logging_task:
+            raise RuntimeError('Logging task is not active')
+
+        self._logging_task.cancel()
+        self._logging_task = None
+
+    async def __aenter__(self) -> Self:
+        """Start the logging task within the context. Exiting the context will cancel the logging task."""
+        self.start()
+        return self
+
+    async def __aexit__(
+        self, exc_type: type[BaseException] | None, exc_val: BaseException | None, exc_tb: TracebackType | None
+    ) -> None:
+        """Cancel the logging task."""
+        await asyncio.sleep(self._final_sleep_time_s)
+        self.stop()


Why not remove start/stop and implement __aenter__ / __aexit__ directly?

__aenter__ / __aexit__ are there as well for convenience, but start and stop is exposed for flexibility to make it possible for the users to call these methods outside of sometimes limiting context manager and without the need to call double-underscored methods directly.

Personally, I'm not a fan of exposing additional start/stop methods just to provide an alternative way to manually control what the context manager takes care of. If a user really wants to call enter / exit methods directly, I think he can just do it directly, rather than introducing extra methods that just duplicate that logic. Not gonna block the merge though, up to you.

janbuchar

Mostly LGTM, some random nits

janbuchar · 2025-06-12T13:05:46Z

tests/unit/test_logging.py

-    actor_runs_responses = iter(
-        (
-            httpx.Response(
+    def create_status_responses_generator() -> Iterator[httpx.Response]:


nit - I think you can omit the create_ prefix here and make the function name a bit less java-esque 😁

janbuchar · 2025-06-12T14:48:12Z

src/apify_client/clients/resource_clients/log.py

@@ -378,3 +382,160 @@ async def _stream_log(self) -> None:

            # If the stream is finished, then the last part will be also processed.
            self._log_buffer_content(include_last_part=True)
+
+
+class StatusMessageWatcher:


Should these new classes be exposed publicly? They seem like implementation details to me - you usually create these using helper methods on the resource client, right?

Helper methods on clients are convenient constructors for these classes, but the user interacts with them directly, either by calling start and stop or by using them as context managers.

(From the ActorClient point of view this is indeed an implementation detail hidden in the call method, but from the RunClient point of view it is the actual public return value of one of the public methods.)

janbuchar · 2025-06-12T15:04:27Z

src/apify_client/clients/resource_clients/log.py

+        if not self._logging_task:
+            raise RuntimeError('Logging task is not active')
+
+        self._logging_task.cancel()


I'm afraid there might be GC-related warnings if you don't await the task (docs)

Ok, added awaits
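
For context, the pattern in question looks roughly like this with plain `asyncio`: cancel the task, then await it while suppressing the resulting `CancelledError`, so that the cancellation has actually completed before the reference is dropped. The function names are illustrative only and this is not the merged code.

```python
import asyncio
import contextlib


async def poll_forever() -> None:
    """Stand-in for a long-running polling coroutine."""
    while True:
        await asyncio.sleep(0.01)


async def stop_task(task: asyncio.Task) -> None:
    """Request cancellation and wait for the task to finish unwinding."""
    task.cancel()
    with contextlib.suppress(asyncio.CancelledError):
        await task


async def main() -> bool:
    task = asyncio.create_task(poll_forever())
    await asyncio.sleep(0.03)  # Let the task run for a few iterations.
    await stop_task(task)
    return task.cancelled()
```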

vdusek

LGTM

Pijukatel added 22 commits May 12, 2025 15:44

TODO: Figure out hwo to mock response with steram

a71ae41

WIP

69ff84c

TODO: Message inspection to split bulk messages TODO: Log level guessing outside of filter

Polish spliting of messages and setting the log level

862cacc

Update test to be able to use caplog

Draft with async implementation and example tests

753427a

Add raw=True

cc0d944

Add chunck processing

cbcabd3

Merge remote-tracking branch 'origin/master' into redirected-actor-logs

81577e8

Add sync version of the logging.

b9bc44d

Refactor tests and code to reuse code.

Finalize, update comments

9720327

Add from_start argument for streaming from stand-by actors

85ead2f

Update default logger to not duplicate handlers if already exists

Skip first logs based on datetime of the marker

4ad39fa

Self review.

74595f9

Have explicit start and stop instead of __call__

Handle bytestream edgecase of chunk containing only half of the multibyte character

cba571f

Review comments

02a1eb2

Remove unnecessary actor_name argument

2674cf2

Update split pattern to deal with multiple times redirected log

2a6f2ec

Update test data to properly match the reality with line endings

Review comment

1263450

Regenerate uv.lock with new version of uv

b1338f1

Test data time alignment.

669a749

Naming.

Add status redirector

737cde9

TODO: Finalize tests

2914e50

Finalize tests.

8fbbffa

Add final timeout.

github-actions bot assigned Pijukatel May 20, 2025

github-actions bot added this to the 115th sprint - Tooling team milestone May 20, 2025

github-actions bot added t-tooling Issues with this label are in the ownership of the tooling team. tested Temporary label used only programatically for some analytics. labels May 20, 2025

Pijukatel added 3 commits May 20, 2025 16:37

Merge remote-tracking branch 'origin/master' into redirect-status-messages

8e70e59

Update syntax to avoid PyCQA/redbaron#212

a3a629e

Update client names in tests to match their type

18f4f51

Pijukatel force-pushed the redirect-status-messages branch from e903a68 to 18f4f51 Compare May 21, 2025 07:51

Pijukatel requested a review from Copilot May 21, 2025 07:54

Copilot AI reviewed May 21, 2025

Pijukatel marked this pull request as ready for review May 21, 2025 08:09

Pijukatel requested review from vdusek and janbuchar May 21, 2025 08:10

janbuchar reviewed May 26, 2025

Review comments

268e568

Properly set _force_propagate

335b8c3

Pijukatel changed the title ~~feat: Add StatusMessageRedirector~~ feat: Add StatusMessageWatcher May 28, 2025

Use whitespace in default redirect logger name instead of -

1e5e976

Pijukatel requested a review from janbuchar June 11, 2025 08:59

vdusek reviewed Jun 12, 2025

janbuchar reviewed Jun 12, 2025

Review comments

350fc67

Pijukatel force-pushed the redirect-status-messages branch from 842777c to 350fc67 Compare June 13, 2025 09:59

Pijukatel requested review from janbuchar and vdusek June 13, 2025 11:06

janbuchar approved these changes Jun 13, 2025

vdusek approved these changes Jun 13, 2025

Pijukatel merged commit a535512 into master Jun 13, 2025
51 of 52 checks passed

Pijukatel deleted the redirect-status-messages branch June 13, 2025 11:25
