[WIP] ML energy consumption predictor (experimental, v8.1.1) #69

Draft
pookey wants to merge 2 commits into johanzander:main from pookey:feat/ml-predictor-v8

pookey commented Apr 15, 2026

Contributor

⚠️ WIP. I'm still tinkering with this feature casually on my own hardware. This PR is raised purely so anyone else who's interested can follow along, comment, or contribute; there is no expectation of merge. The "should this live in core?" question below is still open. I'm happy to keep this as a draft indefinitely or close it again if you'd rather.

Re-opens #34 against v8.1.1. Still experimental — see the open question below.

Summary

Adds an optional ml_prediction consumption strategy using XGBoost to predict 24h energy consumption at 15-minute resolution, plus an ML Report dashboard page showing model metrics, feature importance, and a forecast-vs-actual chart.

  • Complete ml/ module: training, prediction, feature engineering, standalone CLI
  • /api/ml-report endpoint + frontend page
  • Daily retrain scheduler (23:00)
  • Docker base image: Alpine → Debian Bookworm (xgboost / scikit-learn wheels)

Changes since #34

  • Rebased onto upstream/main (v8.1.1).
  • ML forecast cache bug fix (commit 9cb535c): forecast cache is now anchored by target_date so the ML Report chart stays aligned after midnight rollover.
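For anyone following along, the idea behind keying the cache by target_date can be sketched roughly as below. This is an illustration only, not the code in commit 9cb535c: `ForecastCache`, `put`, and `get` are hypothetical names, and evicting every entry older than today is an assumption about what "stale" means.

```python
from __future__ import annotations

from datetime import date, timedelta


class ForecastCache:
    """Hypothetical date-keyed forecast cache (not the PR's actual class)."""

    def __init__(self) -> None:
        # One forecast vector per calendar day it describes, so entries for
        # today and tomorrow can coexist.
        self._entries: dict[date, list[float]] = {}

    def put(self, target_date: date, forecast: list[float]) -> None:
        """Store the forecast for the day it targets, not the day it was made."""
        self._entries[target_date] = list(forecast)

    def get(self, target_date: date, today: date) -> list[float] | None:
        """Return the forecast for target_date, evicting stale days first."""
        # Assumption: "stale" means the target day is already in the past.
        for stale in [d for d in self._entries if d < today]:
            del self._entries[stale]
        return self._entries.get(target_date)
```

With a cache keyed this way, a forecast generated at 22:00 for tomorrow is still found under the same key after midnight, when that day has become "today".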

Open question — does this belong in core?

The original PR discussion flagged this as unresolved. Summary for the record:

@johanzander (2026-03-12): "it probably makes more sense to have it outside the BESS manager. [...] would you be open to create a separate HA Add-On for this? then that could serve as another configurable consumption strategy option..."

@pookey (2026-03-12): "So far, it's been trash and after 8 days of training data, the 'weekly average' is working better for me. [...] I think the ML stuff currently falls into the 'cool experiment, but probably not useful' category!"

Model accuracy has not materially improved since, but the ML Report page is a useful playground for prediction experimentation. This PR is re-opened as a draft purely to invite comment/contribution — totally happy for it to stay closed if you'd rather keep ML out of core and eventually spin it into its own add-on using ha-add-on-template. In the meantime the feat/ml-predictor-v8 branch lives on my fork and merges cleanly into a local deploy branch for testing.

Costs of merging:

  • ~180 MB added dependencies (xgboost, scikit-learn, astral)
  • Alpine → Debian base image
  • 3–5s retrain-on-boot + weather-fetch startup cost
  • Requires weather entity + InfluxDB configured

Test plan

  • ML model trains on boot when ml config is present
  • ML Report page renders metrics, feature importance, chart
  • ml_prediction strategy produces 96-period forecast
  • Fallback to fixed when ml config is missing
  • Daily retrain fires at 23:00
  • System starts when ml section absent (no ML features loaded)
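The fallback items in the test plan above can be read as roughly the following. This is a sketch under assumptions, not the PR's code: `resolve_strategy` and the `consumption_strategy` config key are made-up names, and defaulting to `fixed` when no strategy is set is also an assumption. Only `ml_prediction`, `fixed`, and the `ml` section come from the description.

```python
from typing import Any, Mapping


def resolve_strategy(config: Mapping[str, Any]) -> str:
    """Pick the consumption strategy, falling back when ML is not configured.

    Hypothetical helper: the key names here are illustrative only.
    """
    # Assumption: a missing strategy setting behaves like "fixed".
    strategy = config.get("consumption_strategy", "fixed")
    # ml_prediction is only usable when an `ml` section is present and non-empty.
    if strategy == "ml_prediction" and not config.get("ml"):
        return "fixed"
    return strategy
```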

References

pookey force-pushed the feat/ml-predictor-v8 branch 2 times, most recently from 52bd646 to b2d8577 (April 18, 2026 07:16)
ipc-zpg and others added 2 commits April 20, 2026 07:20
Add ml_prediction consumption strategy using XGBoost to generate
96 quarter-hourly energy consumption forecasts. The model trains
on InfluxDB historical data with weather forecast features from
Home Assistant and retrains daily at 23:00.

Includes:
- ML module (ml/) with trainer, predictor, data_fetcher, config
- ML Report dashboard page showing model metrics, feature
  importance, and forecast comparison
- /api/ml-report endpoint
- Docker base image switch to Debian Bookworm for xgboost/sklearn
- Daily retrain scheduler in app.py

The ML predictor is experimental — the influxdb_7d_avg strategy
is recommended for production use.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
The ML Report showed a rolling 24h window starting from the next quarter
hour instead of a calendar-day forecast, and "yesterday" on the chart was
actually today's partial data.

Root causes: _build_future_timestamps was anchored to datetime.now(), and
the BSM cache was keyed by generation day rather than target day, so today
and tomorrow could never coexist. Also fixes a dormant bug where
_generate_ml_predictions imported a function name that never existed
(predict vs predict_next_24h), so the ML path had never populated the
cache on this branch.

- Predictor: _build_future_timestamps now requires target_date and anchors
  at local midnight; predict_next_24h and predict_with_timestamps thread
  target_date into fetch_history_context. Heavy deps (xgboost,
  feature_engineer) moved to lazy imports so tests collect without them.
- BSM: _ml_forecast_cache is now dict[date, list[float]]; stale entries
  evict on access; _retrain_ml_model wipes and repopulates {today,
  tomorrow}; short predictions (when HA weather can't reach into the past
  of today) are front-padded so the cached vector stays calendar-aligned.
- _get_consumption_forecast now requires target_date; _gather_optimization_data
  passes today or tomorrow based on prepare_next_day.
- /api/ml-report: prefers today when cached, falls back to newest;
  yesterday/week-avg/today-actuals are always anchored to the real calendar
  today; when showing today's forecast the already-elapsed quarters are
  masked out so the ML Predicted line only draws from "now" forward and the
  new red "today so far" trace fills the past.
- ml/data_fetcher: add fetch_actuals_for_date, delete unused fetch_recent_data.
- Frontend: MLReportPage renders the new todayActuals line.
- Tests: new test_ml_forecast_cache.py covering predictor alignment, cache
  routing, stale eviction, lazy regeneration, retrain wipe/repopulate and
  the 22:00 -> 00:00 boundary walk.
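The calendar alignment described in the commit message above can be illustrated with a small standalone sketch. The helper names (`build_day_timestamps`, `front_pad`, `mask_elapsed`) are hypothetical and are not the functions in the branch; the 0.0 pad value, the use of naive local datetimes, and keeping the current quarter visible are all assumptions made for the example.

```python
from __future__ import annotations

from datetime import date, datetime, time, timedelta

PERIODS_PER_DAY = 96  # 24 h at 15-minute resolution


def build_day_timestamps(target_date: date) -> list[datetime]:
    """Quarter-hour timestamps for target_date, anchored at local midnight."""
    # Assumption: naive local datetimes; the real code may be timezone-aware.
    midnight = datetime.combine(target_date, time.min)
    return [midnight + timedelta(minutes=15 * i) for i in range(PERIODS_PER_DAY)]


def front_pad(prediction: list[float], fill: float = 0.0) -> list[float]:
    """Left-pad a short prediction so index 0 is always 00:00 of the day."""
    missing = PERIODS_PER_DAY - len(prediction)
    if missing < 0:
        raise ValueError("prediction is longer than one day of quarters")
    # Assumption: the pad value is arbitrary here; the PR does not state it.
    return [fill] * missing + list(prediction)


def mask_elapsed(forecast: list[float], now: datetime) -> list[float | None]:
    """Replace quarters that ended before `now` with None for charting."""
    # Index of the quarter containing `now`; that quarter itself stays visible.
    current = (now.hour * 60 + now.minute) // 15
    return [None if i < current else v for i, v in enumerate(forecast)]
```

Because every vector has 96 entries starting at midnight, index `i` always means the same wall-clock quarter, which is what lets the forecast and the "today so far" actuals share one chart axis.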
pookey force-pushed the feat/ml-predictor-v8 branch from b2d8577 to 8fcd2d8 (April 20, 2026 06:21)