Add FastAPI web backend, monitoring Web UI, and event handler metrics #233
NodeJSmith merged 42 commits into main
Detailed architecture plan for adding a web UI backend to hassette, covering DataSyncService (event aggregation), WebApiService (uvicorn runner), REST endpoints, WebSocket real-time updates, configuration, and phased implementation order. https://claude.ai/code/session_014TodsegYLnvZKU5qS2KVM8
- WebApiService replaces HealthService entirely instead of running alongside
- /healthz endpoint preserved in FastAPI for Docker healthcheck backwards compat
- Config fields renamed: run_health_service → run_web_api, health_service_port → web_api_port
- Defaults match old HealthService (run_web_api=True, port=8126) for zero-config migration
- Plan now includes HealthService deletion and test migration steps

https://claude.ai/code/session_014TodsegYLnvZKU5qS2KVM8
…plan

- LogCaptureHandler: logging.Handler subclass that captures into a ring buffer and broadcasts to WS clients. Installed on the hassette root logger so all app + service logs are captured. Frontend can filter by app, service, or log level.
- Scheduler global view: get_all_jobs() on SchedulerService aggregates jobs across all apps. New GET /api/scheduler/jobs endpoint.
- Job execution metrics: run_job() wrapped with try/finally to record duration, status, and errors into a bounded deque. New GET /api/scheduler/history endpoint.
- New WS message types: "log" (opt-in via subscribe) and "job_executed"
- New config fields: web_api_log_buffer_size, web_api_job_history_size
- New route files: logs.py, scheduler.py
- Updated implementation order to include Phase 2 (core infrastructure)

https://claude.ai/code/session_014TodsegYLnvZKU5qS2KVM8
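
The ring-buffer capture described for LogCaptureHandler can be sketched roughly as below. This is a minimal illustration, not the PR's implementation: the class name `RingBufferHandler`, the `recent()` helper, the entry fields, and the buffer size are all assumptions, and the WebSocket broadcast step is left out.

```python
import logging
from collections import deque


class RingBufferHandler(logging.Handler):
    """Hypothetical stand-in for LogCaptureHandler: keeps only the newest records."""

    def __init__(self, buffer_size: int) -> None:
        super().__init__()
        # deque(maxlen=...) drops the oldest entry once the buffer is full.
        self.entries = deque(maxlen=buffer_size)

    def emit(self, record: logging.LogRecord) -> None:
        self.entries.append(
            {
                "logger": record.name,
                "levelno": record.levelno,
                "level": record.levelname,
                "message": record.getMessage(),
            }
        )

    def recent(self, min_levelno: int = logging.DEBUG) -> list:
        """Return buffered entries at or above the given numeric level."""
        return [e for e in self.entries if e["levelno"] >= min_levelno]


# Usage: attach the handler to a logger and emit more records than fit.
logger = logging.getLogger("hassette.demo")  # logger name is illustrative
logger.setLevel(logging.DEBUG)
logger.propagate = False
handler = RingBufferHandler(buffer_size=3)
logger.addHandler(handler)

for i in range(5):
    logger.info("message %d", i)
logger.warning("disk almost full")
```

With a buffer of three, only the last three records survive, which is the bounded-memory property the commit relies on for `web_api_log_buffer_size`.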
Pull request overview
This PR replaces the legacy HealthService with a managed FastAPI + Uvicorn web backend for observability and app management, and introduces per-event-listener execution metrics plus shared execution timing utilities. It also adjusts the Service lifecycle so serve() task spawning/cancellation occurs between the lifecycle hooks (to avoid subclasses needing to call super()), and adds configurable restart backoff logic to ServiceWatcher.
Changes:
- Add FastAPI REST + WebSocket API (plus `WebApiService` + `DataSyncService`) to expose health, entities, apps, scheduler, bus metrics, logs, config, and WS streaming.
- Add execution tracking primitives (`track_execution`, `ExecutionResult`) and aggregate listener/job metrics (`ListenerMetrics`, `JobExecutionRecord`), wiring them into `BusService` and `SchedulerService`.
- Refactor lifecycle handling in `Resource`/`Service` and add exponential-backoff restart limiting in `ServiceWatcher`.
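
To make the execution tracking primitive concrete, here is a minimal sketch of an async context manager that records status, duration, and error for whatever runs inside it. The names `track_execution` and `ExecutionResult` come from the overview, but the fields (`status`, `duration_ms`, `error`) and the behavior shown are assumptions rather than the PR's actual code.

```python
import asyncio
import time
from contextlib import asynccontextmanager
from dataclasses import dataclass
from typing import AsyncIterator, Optional


@dataclass
class ExecutionResult:
    # Field names are assumed for illustration.
    status: str = "pending"
    duration_ms: float = 0.0
    error: Optional[str] = None


@asynccontextmanager
async def track_execution() -> AsyncIterator[ExecutionResult]:
    result = ExecutionResult()
    start = time.perf_counter()
    try:
        yield result
        result.status = "success"
    except Exception as exc:
        # Record the failure, then let the caller see the original exception.
        result.status = "error"
        result.error = f"{type(exc).__name__}: {exc}"
        raise
    finally:
        result.duration_ms = (time.perf_counter() - start) * 1000.0


async def demo() -> list:
    async with track_execution() as ok:
        await asyncio.sleep(0.01)
    try:
        async with track_execution() as failed:
            raise ValueError("boom")
    except ValueError:
        pass
    return [ok, failed]
```

Because the result object is yielded before the body runs, the caller still holds it after a failure and can fold it into aggregate counters such as `ListenerMetrics`.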
Reviewed changes
Copilot reviewed 39 out of 43 changed files in this pull request and generated 20 comments.
| File | Description |
|---|---|
| uv.lock | Adds FastAPI/Uvicorn/httpx and transitive deps to the locked environment. |
| pyproject.toml | Declares FastAPI + Uvicorn runtime deps and httpx dev dependency. |
| CLAUDE.md | Adds contributor guidance about avoiding from __future__ import annotations. |
| CHANGELOG.md | Documents new web backend, metrics, lifecycle change, config renames, and removed HealthService. |
| tests/conftest.py | Updates tests to use renamed web API config fields. |
| tests/integration/test_core.py | Updates core wiring assertions for new services replacing HealthService. |
| tests/integration/test_service_watcher.py | Expands integration coverage for restart backoff/attempt limits and regression scenarios. |
| tests/integration/test_web_api.py | Adds integration tests for FastAPI endpoints using httpx ASGI transport. |
| tests/unit/bus/test_metrics.py | Adds unit tests for ListenerMetrics aggregation behavior. |
| tests/unit/core/test_data_sync_service.py | Adds unit tests for DataSyncService aggregation, buffering, logs, and WS client mgmt. |
| tests/unit/resources/test_service_lifecycle.py | Verifies Service serve-task ordering and FinalMeta override behavior. |
| tests/unit/test_execution.py | Adds unit tests for track_execution() and ExecutionResult. |
| src/hassette/bus/metrics.py | Introduces ListenerMetrics mutable aggregate counters + serialization. |
| src/hassette/config/config.py | Replaces health service config with web API config; adds restart-backoff config. |
| src/hassette/config/hassette.dev.toml | Updates defaults to new run_web_api / web_api_port / web_api_log_level. |
| src/hassette/config/hassette.prod.toml | Updates defaults to new run_web_api / web_api_port / web_api_log_level. |
| src/hassette/core/bus_service.py | Adds per-listener metrics tracking around listener invocation and exposes query helpers. |
| src/hassette/core/core.py | Wires in DataSyncService + WebApiService, removes HealthService, updates logging setup. |
| src/hassette/core/data_sync_service.py | New aggregator resource for system state, event/log buffers, bus metrics, and WS broadcast queues. |
| src/hassette/core/health_service.py | Removes legacy aiohttp-based health server implementation. |
| src/hassette/core/scheduler_service.py | Adds job execution history ring buffer + track_execution integration; exposes history/jobs getters. |
| src/hassette/core/service_watcher.py | Adds restart attempt counting + exponential backoff and config-driven limits. |
| src/hassette/core/web_api_service.py | New managed service that runs the FastAPI app via Uvicorn. |
| src/hassette/logging_.py | Adds LogCaptureHandler + LogEntry for ring-buffer log capture and WS broadcast wiring. |
| src/hassette/resources/base.py | Adds _run_hooks/_finalize_shutdown; updates Service to control serve-task placement and allowlist overrides. |
| src/hassette/scheduler/classes.py | Adds JobExecutionRecord dataclass for scheduler execution metrics. |
| src/hassette/utils/execution.py | Adds track_execution() async context manager + ExecutionResult. |
| src/hassette/web/__init__.py | New web package marker. |
| src/hassette/web/app.py | FastAPI app factory with routers + CORS and API metadata. |
| src/hassette/web/dependencies.py | FastAPI DI helpers to access Hassette, DataSyncService, and Api. |
| src/hassette/web/models.py | Pydantic response models for REST endpoints. |
| src/hassette/web/routes/__init__.py | Routes package marker. |
| src/hassette/web/routes/apps.py | App status + start/stop/reload endpoints with prod guardrails. |
| src/hassette/web/routes/bus.py | Bus listener metrics and summary endpoints. |
| src/hassette/web/routes/config.py | Sanitized config endpoint (token excluded). |
| src/hassette/web/routes/entities.py | Entity listing/state endpoints. |
| src/hassette/web/routes/events.py | Recent event buffer endpoint. |
| src/hassette/web/routes/health.py | /api/health and backward-compatible /api/healthz. |
| src/hassette/web/routes/logs.py | Log query endpoint backed by LogCaptureHandler. |
| src/hassette/web/routes/scheduler.py | Scheduled jobs and job execution history endpoints. |
| src/hassette/web/routes/services.py | HA services passthrough endpoint via Api. |
| src/hassette/web/routes/ws.py | WebSocket endpoint for streaming events/logs with subscription controls. |
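
The exponential-backoff restart limiting listed for service_watcher.py can be illustrated with a small calculation. This is a sketch under assumptions: the parameters mirror the config names `service_restart_backoff_seconds`, `service_restart_backoff_multiplier`, `service_restart_max_backoff_seconds`, and `service_restart_max_attempts`, but the exact formula ServiceWatcher uses is not shown in this PR page, and the helper names are hypothetical.

```python
def restart_delay(attempt: int, base_seconds: float, multiplier: float, max_seconds: float) -> float:
    """Delay before restart number `attempt` (1-based), capped at max_seconds."""
    if attempt < 1:
        raise ValueError("attempt must be >= 1")
    return min(base_seconds * multiplier ** (attempt - 1), max_seconds)


def restart_schedule(max_attempts: int, base_seconds: float, multiplier: float, max_seconds: float) -> list:
    """All delays a watcher would wait through before giving up."""
    return [
        restart_delay(n, base_seconds, multiplier, max_seconds)
        for n in range(1, max_attempts + 1)
    ]


# Example values chosen for illustration only, not the project's defaults.
schedule = restart_schedule(max_attempts=5, base_seconds=2.0, multiplier=2.0, max_seconds=10.0)
```

The cap keeps a flapping service from pushing the wait time up without bound, while the attempt limit stops the restart loop entirely.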
Fix four bugs: shutdown deadlock in DataSyncService (use put_nowait), race condition in unregister_ws_client (add async lock), None sentinel crash in WS send loop, and WebApiService shutdown ordering (move should_exit to before_shutdown). Rename /logs route to /logs/recent for consistency. Clean up dual typing imports across all route files and add explanatory comments for metrics retention and WS disconnect handling.
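
Two of the fixes above (using `put_nowait` at shutdown and handling the `None` sentinel in the send loop) follow a common asyncio pattern, sketched here. The function names and the drop-oldest policy on a full queue are assumptions for illustration; the PR's actual DataSyncService code is not shown on this page.

```python
import asyncio


async def send_loop(queue: asyncio.Queue, sent: list) -> None:
    """Forward queued messages until the None sentinel arrives."""
    while True:
        message = await queue.get()
        if message is None:
            # Sentinel: stop cleanly instead of trying to send None.
            break
        sent.append(message)


def request_stop(queue: asyncio.Queue) -> None:
    """Enqueue the sentinel without awaiting, so shutdown cannot block."""
    try:
        queue.put_nowait(None)
    except asyncio.QueueFull:
        # Assumed policy: drop the oldest message to make room for the sentinel.
        queue.get_nowait()
        queue.put_nowait(None)


async def demo() -> list:
    queue: asyncio.Queue = asyncio.Queue(maxsize=2)
    sent: list = []
    queue.put_nowait({"type": "log", "n": 1})
    queue.put_nowait({"type": "log", "n": 2})
    request_stop(queue)  # queue is full here; `await queue.put(None)` would block
    await send_loop(queue, sent)
    return sent
```

An awaited `put()` on a full bounded queue waits for a consumer, which is how a shutdown path can deadlock if the consumer has already gone away.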
Code review
No issues found. Checked for bugs and CLAUDE.md compliance.
Codecov Report
Additional details and impacted files

Coverage Diff

| | main | #233 | +/- |
|---|---|---|---|
| Coverage | 74.35% | 77.46% | +3.10% |
| Files | 102 | 125 | +23 |
| Lines | 7269 | 8742 | +1473 |
| Branches | 799 | 904 | +105 |
| Hits | 5405 | 6772 | +1367 |
| Misses | 1508 | 1605 | +97 |
| Partials | 356 | 365 | +9 |
Pull request overview
Copilot reviewed 39 out of 43 changed files in this pull request and generated 7 comments.
Fix potential UnboundLocalError in SchedulerService.run_job by initializing result before the try block. Capture ws_clients count inside the lock in unregister_ws_client. Bump FastAPI minimum version to >=0.128.6 to match lockfile. Tighten CORS allow_methods and allow_headers to explicit lists instead of wildcards. Validate min_log_level from WebSocket clients against known levels.
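
The UnboundLocalError fix mentioned above comes down to binding the name before the `try` block so that `finally` can always read it. The sketch below is an assumed, simplified shape (synchronous, with a hypothetical `run_job_recorded` helper and dict records), not the real `SchedulerService.run_job`.

```python
import asyncio


def run_job_recorded(job, history: list) -> None:
    """Run `job` and always append exactly one record to `history`."""
    result = None  # bound up front so the finally block can read it safely
    try:
        result = {"status": "success", "value": job()}
    except Exception as exc:
        result = {"status": "error", "error": repr(exc)}
    finally:
        # CancelledError is a BaseException, so it skips the handler above and
        # `result` is still None here. Without the initial binding this line
        # would raise UnboundLocalError and mask the original exception.
        history.append(result if result is not None else {"status": "cancelled"})


def failing_job():
    raise RuntimeError("boom")


def cancelled_job():
    raise asyncio.CancelledError()


history: list = []
run_job_recorded(lambda: 42, history)
run_job_recorded(failing_job, history)
try:
    run_job_recorded(cancelled_job, history)
except asyncio.CancelledError:
    pass  # cancellation still propagates to the caller
```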
Server-rendered monitoring dashboard at /ui/ with three pages:
- Dashboard: system health, app summary, bus metrics, recent events
- Apps: management table with start/stop/reload via HTMX
- Logs: filterable viewer with real-time WebSocket streaming

Includes Alpine.js WebSocket store for live connection status, HTMX partial endpoints for incremental updates, and run_web_ui config option. Root / redirects to /ui/ when enabled. 30 integration tests, 100% coverage on web UI module.
Pull request overview
Copilot reviewed 57 out of 61 changed files in this pull request and generated 6 comments.
Fix disabled WebApiService losing ready state by blocking on
shutdown_event instead of returning immediately from serve(). Remove
dead _last_failure_time field from ServiceWatcher. Remove unused LOGGER
from DataSyncService. Fix WS log streaming: correct message type
mismatch ("log_entry" → "log"), add subscribeLogs() method, and send
subscribe on page load and reconnect. Fix XSS in logs page by replacing
innerHTML with textContent for dynamic fields.
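
The first fix in this commit (a disabled service blocking on `shutdown_event` rather than returning from `serve()` immediately) can be sketched as follows. The class, its `ready` flag, and the way readiness is tied to the task are assumptions made for illustration; only the `shutdown_event` and `serve()` names come from the commit message.

```python
import asyncio


class DisabledServiceSketch:
    """Hypothetical service whose serve() has nothing to run."""

    def __init__(self) -> None:
        self.shutdown_event = asyncio.Event()
        self.ready = False

    async def serve(self) -> None:
        self.ready = True
        try:
            # Park here instead of returning, so the serve task stays alive
            # and the service keeps reporting ready until shutdown.
            await self.shutdown_event.wait()
        finally:
            self.ready = False


async def demo() -> tuple:
    service = DisabledServiceSketch()
    task = asyncio.create_task(service.serve())
    await asyncio.sleep(0)  # let serve() run up to its first await
    alive_and_ready = (not task.done()) and service.ready
    service.shutdown_event.set()
    await task
    return alive_and_ready, task.done(), service.ready
```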
Pull request overview
Copilot reviewed 57 out of 61 changed files in this pull request and generated 6 comments.
…ests Run djlint via `uv run` so it resolves from the project venv in CI, and update e2e navigation tests to match the actual "Event Bus" page heading.
Pull request overview
Copilot reviewed 102 out of 115 changed files in this pull request and generated 1 comment.
Comments suppressed due to low confidence (1)
tests/e2e/conftest.py:392
- The `cleanup_state_proxy_fixture` function has no body after the docstring, which will raise a SyntaxError when the test module is imported. Add a `pass` (or implement the intended no-op behavior) so the fixture is a valid function.
…re collision The pytest-base-url plugin (from pytest-playwright) provides a session-scoped base_url fixture that clashed with the parametrize parameter name, causing ScopeMismatch errors in 8 tests.
Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>
Pull request overview
Copilot reviewed 103 out of 116 changed files in this pull request and generated 2 comments.
Add two new dashboard panels with a redesigned layout:
- Scheduled Jobs: counts-only summary (active/repeating/total) via new SchedulerSummaryResponse model and DataSyncService.get_scheduler_summary(), mirroring the pattern of get_bus_metrics_summary()
- Recent Logs: full-width scrollable table with sticky headers, Level/Time/App/Message columns, and View All link
- Stacked Scheduled Jobs + Event Bus on the left with Recent Events on the right
- Equal-height System Health and Apps panels via flexbox
- View All link added to Event Bus panel for consistency
- Sticky table headers on both dashboard logs and full logs page
- Integration and e2e tests for all new panels and partials
…flake Another test or setup_logging() call earlier in the random test order could install a real LogCaptureHandler with entries, causing the empty assertion to fail. Explicitly patching to None makes the test hermetic.
Pull request overview
Copilot reviewed 106 out of 119 changed files in this pull request and generated 3 comments.
Pull request overview
Copilot reviewed 107 out of 120 changed files in this pull request and generated no new comments.
- Add Query(ge=1, le=2000) validation to log_entries_partial limit param
- Handle CancelledError explicitly in WebApiService.serve() to avoid logging graceful shutdown as an error
- Remove unreachable except Exception in _dispatch_and_log() since run_job() handles its own exceptions
- Remove timeout_seconds from ScheduledJob, API model, serializer, UI macro, and test fixtures (never implemented)
- Use asyncio.get_running_loop() instead of deprecated get_event_loop() in test helper
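
The explicit CancelledError handling in `serve()` can be sketched like this: cancellation is treated as a normal shutdown signal and re-raised, while any other exception is recorded as a real failure. The `serve_sketch` function, the `outcomes` list standing in for log output, and the two fake servers are all hypothetical.

```python
import asyncio


async def serve_sketch(run_server, outcomes: list) -> None:
    try:
        await run_server()
    except asyncio.CancelledError:
        # Graceful shutdown: note it quietly and let cancellation propagate.
        outcomes.append("cancelled")
        raise
    except Exception as exc:
        # Anything else is a genuine failure worth reporting as an error.
        outcomes.append(f"error: {exc}")
        raise


async def demo() -> list:
    outcomes: list = []

    async def run_forever() -> None:
        await asyncio.Event().wait()  # never set; stands in for a running server

    async def broken() -> None:
        raise RuntimeError("port in use")

    task = asyncio.create_task(serve_sketch(run_forever, outcomes))
    await asyncio.sleep(0)  # let the task start and block
    task.cancel()
    try:
        await task
    except asyncio.CancelledError:
        pass

    try:
        await serve_sketch(broken, outcomes)
    except RuntimeError:
        pass
    return outcomes
```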
Pull request overview
Copilot reviewed 107 out of 120 changed files in this pull request and generated no new comments.
Summary
Replaces the standalone `HealthService` with a full FastAPI web backend (REST API + WebSocket) and adds a server-rendered monitoring Web UI built with Jinja2, HTMX, Alpine.js, and Bulma CSS. Also introduces per-listener execution metrics for the event bus and job execution history for the scheduler.

Web UI (`/ui/`)

REST API Endpoints
- `GET /api/health`, `/api/healthz`
- `GET /api/entities[/{id}|/domain/{d}]`
- `GET /api/apps[/{key}]`, `GET /api/apps/manifests`
- `POST /api/apps/{key}/start|stop|reload`
- `GET /api/scheduler/jobs|history`
- `GET /api/bus/listeners`, `/api/bus/metrics`
- `GET /api/events/recent`, `/api/logs/recent`
- `GET /api/services`, `/api/config`
- `GET /api/ws`
- `GET /api/docs`

New Infrastructure
- `DataSyncService`: aggregates state from StateProxy, AppHandler, SchedulerService, BusService
- `WebApiService`: Uvicorn ASGI server as a managed background service
- `AppRegistry`: extracted app state tracking from AppHandler
- `LogCaptureHandler`: ring buffer log capture with WS broadcast
- `track_execution()`: shared async context manager for timing/error capture
- `ListenerMetrics`: per-listener aggregate execution counters

Service Lifecycle Fixes
- `Service` base class properly sequences `serve()` task: spawns after `on_initialize()`, cancels before `on_shutdown()`
- `ServiceWatcher`

Breaking Config Changes
- `run_health_service` → `run_web_api`
- `health_service_port` → `web_api_port`
- `health_service_log_level` → `web_api_log_level`
- `web_api_host`, `web_api_cors_origins`, `web_api_event_buffer_size`, `web_api_log_buffer_size`, `web_api_job_history_size`, `run_web_ui`
- `service_restart_max_attempts`, `service_restart_backoff_seconds`, `service_restart_max_backoff_seconds`, `service_restart_backoff_multiplier`

Removed
- `HealthService`: replaced by FastAPI web backend
- `timeout_seconds` field from `ScheduledJob` (never enforced)

Test plan
pytest -m e2e)