[WIP] ML energy consumption predictor (experimental, v8.1.1) #69
Draft
pookey wants to merge 2 commits into
Add ml_prediction consumption strategy using XGBoost to generate 96 quarter-hourly energy consumption forecasts. The model trains on InfluxDB historical data with weather forecast features from Home Assistant and retrains daily at 23:00.
Includes:
- ML module (ml/) with trainer, predictor, data_fetcher, config
- ML Report dashboard page showing model metrics, feature importance, and forecast comparison
- /api/ml-report endpoint
- Docker base image switch to Debian Bookworm for xgboost/sklearn
- Daily retrain scheduler in app.py
The ML predictor is experimental; the influxdb_7d_avg strategy is recommended for production use.
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
The ML Report showed a rolling 24h window starting from the next quarter
hour instead of a calendar-day forecast, and "yesterday" on the chart was
actually today's partial data.
Root causes: _build_future_timestamps was anchored to datetime.now(), and
the BSM cache was keyed by generation day rather than target day, so today
and tomorrow could never coexist. Also fixes a dormant bug where
_generate_ml_predictions imported a function name that never existed
(predict vs predict_next_24h), so the ML path had never populated the
cache on this branch.
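For illustration, a minimal sketch of anchoring the forecast grid to a calendar day rather than to datetime.now(). The function name, signature, and constant here are assumptions made for the example and are not the branch's actual _build_future_timestamps; the sketch also uses plain wall-clock arithmetic and does not treat DST transition days specially.

```python
from datetime import date, datetime, time, timedelta, timezone, tzinfo

PERIODS_PER_DAY = 96  # 24 h at 15-minute resolution


def build_future_timestamps(target_date: date, tz: tzinfo) -> list[datetime]:
    """Return 96 quarter-hour timestamps starting at local midnight of target_date.

    Hypothetical helper: anchoring on target_date (not "now") keeps index i
    meaning "quarter i of that calendar day" regardless of when it is called.
    """
    midnight = datetime.combine(target_date, time.min, tzinfo=tz)
    return [midnight + timedelta(minutes=15 * i) for i in range(PERIODS_PER_DAY)]


# Usage: the grid for a given day is identical whether built at 09:00 or 23:59.
stamps = build_future_timestamps(date(2025, 1, 2), timezone.utc)
```

Because the grid depends only on target_date, a forecast generated late in the evening for tomorrow and one generated after midnight for today land on the same 96 slots.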
- Predictor: _build_future_timestamps now requires target_date and anchors
at local midnight; predict_next_24h and predict_with_timestamps thread
target_date into fetch_history_context. Heavy deps (xgboost,
feature_engineer) moved to lazy imports so tests collect without them.
- BSM: _ml_forecast_cache is now dict[date, list[float]]; stale entries
evict on access; _retrain_ml_model wipes and repopulates {today,
tomorrow}; short predictions (when HA weather can't reach into the past
of today) are front-padded so the cached vector stays calendar-aligned.
- _get_consumption_forecast now requires target_date; _gather_optimization_data
passes today or tomorrow based on prepare_next_day.
- /api/ml-report: prefers today when cached, falls back to newest;
yesterday/week-avg/today-actuals are always anchored to the real calendar
today; when showing today's forecast the already-elapsed quarters are
masked out so the ML Predicted line only draws from "now" forward and the
new red "today so far" trace fills the past.
- ml/data_fetcher: add fetch_actuals_for_date, delete unused fetch_recent_data.
- Frontend: MLReportPage renders the new todayActuals line.
- Tests: new test_ml_forecast_cache.py covering predictor alignment, cache
routing, stale eviction, lazy regeneration, retrain wipe/repopulate and
the 22:00 -> 00:00 boundary walk.
Re-opens #34 against v8.1.1. Still experimental — see the open question below.
Summary
Adds an optional ml_prediction consumption strategy using XGBoost to predict 24h energy consumption at 15-minute resolution, plus an ML Report dashboard page showing model metrics, feature importance, and a forecast-vs-actual chart.
- ml/ module: training, prediction, feature engineering, standalone CLI
- /api/ml-report endpoint + frontend page
Changes since #34
- upstream/main (v8.1.1).
- 9cb535c): forecast cache is now anchored by target_date so the ML Report chart stays aligned after midnight rollover.
Open question: does this belong in core?
The original PR discussion flagged this as unresolved. Summary for the record:
Model accuracy has not materially improved since, but the ML Report page is a useful playground for prediction experimentation. This PR is re-opened as a draft purely to invite comment/contribution, and I'm totally happy for it to stay closed if you'd rather keep ML out of core and eventually spin it into its own add-on using ha-add-on-template. In the meantime the feat/ml-predictor-v8 branch lives on my fork and merges cleanly into a local deploy branch for testing.
Costs of merging:
Test plan
- ml config is present
- ml_prediction strategy produces 96-period forecast
- fixed when ml config is missing
- ml section absent (no ML features loaded)
References