How does the data model compute active / category time #1169

petrroll · 2025-08-12T19:55:03Z

The data model is roughly hierarchical: there are hostnames (devices), each hostname has a set of buckets, and each bucket has events.

Some event types are special-cased, such as the active window or AFK status.

I'm not sure how it's all aggregated though, especially if there are multiple buckets of the same event type, possibly overlapping, either on the same hostname or on separate hostnames.

I understand that each event can be categorized (mainly the active app window, but possibly also the browser site?), and that both the browser watcher and the AFK watcher can emit AFK status (the browser counts as not-afk while it is playing sound).

With this knowledge, I assume that we create an activity mask where essentially every not-afk status is a logical 1, and we OR it across all buckets of all devices.

This is the overall mask for activity.

Then we also create this mask per device; this is the per-device activity mask.

Then we gather all categorizable events per device (these might come from multiple buckets, so they can overlap) and categorize each one. For each time slot we select the highest-priority one (which is the deepest category?). And then we do the same aggregation across devices?

Is that roughly true?

I'm asking for two reasons:

I want to understand how this works, and I think it should ideally be covered in the documentation.
I'm trying to do the following:

- Add a watcher that reports the name of the current virtual desktop (Windows). I've already coded it (nice DX!). Right now I have rules saying that any Visual Studio window is work, but I also need to somehow encode that if I'm on a virtual desktop named Personal, the activity belongs to the personal category. (This might not be possible in the current model, but I need to understand the current model to propose how to extend it.)
- Add a watcher that queries my activity status in Microsoft Graph, mainly to detect when I'm in a meeting (I'm frequently in meetings on my phone and want that tracked, but my laptop is often running yet AFK during those times). If I'm in a meeting, I want it to mark me as not AFK and also push an activity that will be categorized as work.

ErikBjare (Maintainer) · 2025-08-14T09:20:05Z

The code you are looking for is the query constructed by the frontend to process the raw bucket events into a set of events ("canonical events") and then categorize, group/merge-by-key, get top apps/titles/URLs etc. This is referred to as the "canonical query":

Readable snapshot of the query used by the web UI: https://github.com/ActivityWatch/aw-webui/blob/master/test/unit/__snapshots__/queries.test.node.js.snap
Implementation: https://github.com/ActivityWatch/aw-webui/blob/master/src/queries.ts
Python implementation of "canonical query": https://github.com/ActivityWatch/aw-client/blob/master/aw_client/queries.py

How the multi-device canonical query works can be seen here: https://github.com/ActivityWatch/aw-webui/blob/f38b119f0ad628dd8af6ddfaaaf89700b9290c06/src/queries.ts#L164-L184 (it works by order-of-host priority, for now)

Your understanding seems correct to me. In the interest of saving time answering your final two questions, below is a gptme-generated answer (Claude 4 Sonnet), given the above links as context (it's a good answer):

Looking at the ActivityWatch query code, I can explain how the data model computes active/category time. Your understanding is mostly correct! Let me break it down:

How Activity/Category Time is Computed

1. Activity Mask Creation (AFK Detection)

The system creates an overall "not-afk" mask by combining multiple sources:

// From canonicalEvents function
not_afk = flood(query_bucket(find_bucket("aw-watcher-afk_testhost")));
not_afk = filter_keyvals(not_afk, "status", ["not-afk"]);

// Additional sources can contribute to "not-afk":
// - Browser audible events
audible_events = filter_keyvals(browser_events, "audible", [true]);
not_afk = period_union(not_afk, audible_events);

// - Always-active patterns (apps that should never be considered AFK)
not_treat_as_afk = filter_keyvals_regex(events, "app", "pattern");
not_afk = period_union(not_afk, not_treat_as_afk);
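To make the "OR across sources" idea concrete, here is a minimal Python sketch of what a period union does. It is a simplified model, not the aw-core implementation: periods are plain `(start, end)` number pairs rather than real events, and the sample values are made up.

```python
from typing import List, Tuple

# Simplifying assumption: a period is (start, end) in seconds, not a real AW event.
Period = Tuple[float, float]


def period_union(a: List[Period], b: List[Period]) -> List[Period]:
    """Merge two lists of periods into sorted, non-overlapping periods."""
    merged: List[Period] = []
    for start, end in sorted(a + b):
        if merged and start <= merged[-1][1]:
            # Overlaps or touches the previous period: extend it.
            merged[-1] = (merged[-1][0], max(merged[-1][1], end))
        else:
            merged.append((start, end))
    return merged


# Hypothetical sample data: AFK watcher says active 0-10 and 30-40,
# the browser was audible 8-20.
afk_watcher_not_afk = [(0, 10), (30, 40)]
browser_audible = [(8, 20)]
not_afk = period_union(afk_watcher_not_afk, browser_audible)
print(not_afk)
```

Any time covered by at least one source ends up in the mask, which is the logical OR described in the question.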

2. Event Filtering by Activity

Window/app events are filtered to only include active periods:

events = filter_period_intersect(events, not_afk);
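As a rough illustration of this step, the sketch below clips events to the not-afk mask. It is a simplified model under assumptions: events are `(start, end, data)` tuples with numeric times, and the function only mirrors the idea behind `filter_period_intersect`, not its actual implementation.

```python
from typing import Dict, List, Tuple

# Simplifying assumptions: numeric timestamps, events as (start, end, data).
Event = Tuple[float, float, Dict[str, str]]
Period = Tuple[float, float]


def filter_period_intersect(events: List[Event], periods: List[Period]) -> List[Event]:
    """Keep only the parts of each event that fall inside one of the periods."""
    clipped: List[Event] = []
    for ev_start, ev_end, data in events:
        for p_start, p_end in periods:
            start, end = max(ev_start, p_start), min(ev_end, p_end)
            if start < end:  # non-empty overlap
                clipped.append((start, end, data))
    return clipped


# Hypothetical sample data.
window_events = [(0, 15, {"app": "Code"}), (15, 35, {"app": "Firefox"})]
not_afk = [(0, 10), (30, 40)]
active = filter_period_intersect(window_events, not_afk)
print(active)
```

Time spent in a window while AFK (here 10-30) is dropped, so it never reaches categorization.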

3. Categorization

Events are categorized using rules, adding a $category field:

events = categorize(events, categories);
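On the "deepest category" question, here is a small Python sketch of regex-based categorization. It assumes that when several rules match, the most specific (deepest) category path wins; that tie-breaking rule, the rule format, and the `Uncategorized` fallback are assumptions for illustration and should be checked against aw-core before relying on them.

```python
import re
from typing import Dict, List, Pattern, Tuple

# Assumed rule shape for this sketch: (category path, compiled regex).
Rule = Tuple[List[str], Pattern[str]]


def categorize(events: List[Dict[str, object]], rules: List[Rule]) -> List[Dict[str, object]]:
    """Attach a "$category" path to each event's data based on regex rules."""
    categorized = []
    for data in events:
        matches = [
            path
            for path, pattern in rules
            if any(isinstance(v, str) and pattern.search(v) for v in data.values())
        ]
        # Assumption: the deepest (longest) matching category path wins.
        best = max(matches, key=len, default=["Uncategorized"])
        categorized.append({**data, "$category": best})
    return categorized


# Hypothetical rules and events.
rules = [
    (["Work"], re.compile(r"Visual Studio")),
    (["Work", "Programming"], re.compile(r"\.py\b")),
]
events = [
    {"app": "Visual Studio Code", "title": "queries.py"},
    {"app": "Spotify", "title": "Song"},
]
result = categorize(events, rules)
print([e["$category"] for e in result])
```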

4. Multi-Device Handling

For multiple devices, union_no_overlap gives priority to the first device in the list:

events = union_no_overlap(events, events_device1);
events = union_no_overlap(events, events_device2);
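The priority behaviour can be sketched in Python as follows. This is a simplified model, not the aw-core code: events are `(start, end, data)` tuples with numeric times, and the primary list is assumed to be free of internal overlaps.

```python
from typing import Dict, List, Tuple

# Simplifying assumptions: numeric timestamps, events as (start, end, data).
Event = Tuple[float, float, Dict[str, str]]


def union_no_overlap(primary: List[Event], secondary: List[Event]) -> List[Event]:
    """Keep all primary events; add only the parts of secondary events
    that do not overlap any primary event."""
    result = list(primary)
    ordered_primary = sorted(primary, key=lambda e: e[0])
    for s_start, s_end, data in secondary:
        cursor = s_start
        for p_start, p_end, _ in ordered_primary:
            if p_end <= cursor or p_start >= s_end:
                continue  # no overlap with the remaining part of this event
            if p_start > cursor:
                result.append((cursor, p_start, data))  # gap before the primary event
            cursor = max(cursor, p_end)
        if cursor < s_end:
            result.append((cursor, s_end, data))  # tail after the last overlap
    return sorted(result, key=lambda e: e[0])


# Hypothetical sample data: the laptop is listed first, so it has priority.
laptop = [(0, 10, {"host": "laptop"}), (20, 30, {"host": "laptop"})]
phone = [(5, 25, {"host": "phone"})]
merged = union_no_overlap(laptop, phone)
print(merged)
```

The phone event only fills the gap the laptop leaves open, which is why the order of hosts matters.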

Answering Your Specific Questions

Virtual Desktop Categorization

For your virtual desktop watcher, you have a few options:

Option 1: Extend categorization rules
You could modify the categorization system to consider virtual desktop context. The current categorization happens in the categorize() function, which could be extended to look at additional event properties.

Option 2: Create a hybrid approach

# Your virtual desktop events could influence categorization
vd_events = query_bucket("aw-watcher-virtual-desktop_hostname");
personal_vd = filter_keyvals(vd_events, "name", ["Personal"]);

# Then modify window events based on VD context
events_on_personal = filter_period_intersect(events, personal_vd);
# Apply different categorization rules to these events
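One way to picture "apply different categorization rules" is as a post-processing step on already-categorized events. The Python sketch below is hypothetical: `override_category` is not an ActivityWatch function, events are simplified to `(start, end, data)` tuples, and the virtual desktop periods are assumed not to overlap each other.

```python
from typing import Dict, List, Tuple

# Simplifying assumptions: numeric timestamps, events as (start, end, data).
Event = Tuple[float, float, Dict[str, object]]
Period = Tuple[float, float]


def override_category(events: List[Event], periods: List[Period], category: List[str]) -> List[Event]:
    """Split events at period boundaries and re-label the parts inside `periods`."""
    out: List[Event] = []
    for ev_start, ev_end, data in events:
        cursor = ev_start
        for p_start, p_end in sorted(periods):
            start, end = max(cursor, p_start), min(ev_end, p_end)
            if start >= end:
                continue  # this period does not overlap the rest of the event
            if cursor < start:
                out.append((cursor, start, data))  # outside: keep original category
            out.append((start, end, {**data, "$category": category}))
            cursor = end
        if cursor < ev_end:
            out.append((cursor, ev_end, data))
    return out


# Hypothetical sample data: Visual Studio is "Work" by rule, but 10-20 was
# spent on the virtual desktop named "Personal".
events = [(0, 30, {"app": "devenv.exe", "$category": ["Work"]})]
personal_vd = [(10, 20)]
relabelled = override_category(events, personal_vd, ["Personal"])
print([(s, e, d["$category"]) for s, e, d in relabelled])
```

A step like this could run client-side on query results, but whether that fits the existing UI is a separate design question.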

Microsoft Graph Meeting Integration

For meeting detection, you could:

Option 1: Extend the AFK detection

# Add meeting events as a source of "not-afk"
meeting_events = query_bucket("aw-watcher-msgraph_hostname");
in_meeting = filter_keyvals(meeting_events, "status", ["in-meeting"]);
not_afk = period_union(not_afk, in_meeting);

# Also inject meeting events as categorizable activities
meeting_activity = transform_meetings_to_activity_events(meeting_events);
events = union_no_overlap(events, meeting_activity);
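Note that `transform_meetings_to_activity_events` above is not an existing query function; it stands for a transformation you would have to write. A minimal Python sketch of what it might do is shown below. The `timestamp`/`duration`/`data` shape follows the usual ActivityWatch event layout, while the `status` and `subject` fields and the `"Meeting"` app name are assumptions about what a hypothetical Microsoft Graph watcher would emit.

```python
from typing import Any, Dict, List

AWEvent = Dict[str, Any]  # {"timestamp": ..., "duration": ..., "data": {...}}


def transform_meetings_to_activity_events(meeting_events: List[AWEvent]) -> List[AWEvent]:
    """Turn in-meeting status events into window-like events that
    ordinary categorization rules can match on."""
    activity: List[AWEvent] = []
    for ev in meeting_events:
        if ev["data"].get("status") != "in-meeting":  # assumed field and value
            continue
        activity.append({
            "timestamp": ev["timestamp"],
            "duration": ev["duration"],
            "data": {
                "app": "Meeting",  # assumed label for rules to match
                "title": ev["data"].get("subject", "Meeting"),  # assumed field
            },
        })
    return activity


# Hypothetical sample data.
meeting_events = [
    {"timestamp": "2025-08-12T09:00:00Z", "duration": 1800,
     "data": {"status": "in-meeting", "subject": "Sprint planning"}},
    {"timestamp": "2025-08-12T09:30:00Z", "duration": 600,
     "data": {"status": "available"}},
]
meeting_activity = transform_meetings_to_activity_events(meeting_events)
print(meeting_activity)
```

A rule matching the app name `Meeting` could then map these events to a work category.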

Recommendations

Documentation: You're right that this should be better documented. The query logic is complex and understanding it is crucial for extensions.
Your Extensions: Both your use cases are feasible:
- Virtual desktop context can be added to categorization rules
- Meeting status can both override AFK detection and inject activity events
Implementation Path:
- Start by creating your watchers (which you've already done for VD)
- Extend the query logic in canonicalEvents() to incorporate your new data sources
- Test with custom queries before proposing changes to the main codebase

The current model is quite flexible - it's designed around the concept of "events with periods" that can be filtered, merged, and categorized. Your additions fit well into this paradigm.


petrroll Sep 27, 2025
Author

Thank you, Erik. But if I read the answer correctly, it cannot be done without extending the UI frontend code. And if that's necessary, then I wonder whether it might be worth trying to design a more flexible system.
