You signed in with another tab or window. Reload to refresh your session.You signed out in another tab or window. Reload to refresh your session.You switched accounts on another tab or window. Reload to refresh your session.Dismiss alert
Four related production-code findings surfaced during the CodeRabbit review of #467. None of them is causing user-facing bugs today, but each is a real correctness concern when persistence fails (disk full, permission error, process killed mid-write). Deferred from #468 to keep that PR test-only; tracking them here so each can land in its own PR with proper test coverage of the failure mode.
1. src/python/web/handler/auto_queue.py — wrap to_file in try/except
Around line 47-49. The handler mutates state via add_pattern / remove_pattern and then writes with self.__auto_queue_persist.to_file(self.__persist_path), but doesn't catch IO errors. A persistence failure bubbles up as an unhandled 500 with a stack trace.
Suggested change: wrap the persistence call in try/except, log the exception (including the error object), and return a controlled HTTP 5xx (e.g. HTTPResponse(status=500, body=\"failed to persist auto-queue\")). Apply to both endpoints that mutate state (add and remove pattern paths).
Risk if shipped without this: disk-full on the config volume leaves the in-memory state ahead of the on-disk state. A subsequent restart silently reverts the change.
Around line 76-79.set_property mutates self.__configbefore persistence. If to_file fails after the mutation, in-memory state diverges from disk — and __on_lftp_config_change() may have already fired with the half-applied change.
Suggested change: create a copy of the config, apply inner_config.set_property(key, value) on the copy, attempt to write to disk via self.__config.to_file(self.__config_path) (already atomic via temp file + rename per #433), and only on write success replace self.__config with the updated copy and call __on_lftp_config_change(). Exceptions during write must prevent the in-memory swap and the LFTP callback.
Risk if shipped without this: the LFTP hot-reload path (#433) can fire on a value that didn't make it to disk. After restart the LFTP runtime would reset to the on-disk value, contradicting what the UI showed.
3. src/python/web/handler/integrations.py — persistence ordering on detach
Around line 113-117. The delete-instance handler calls self.__path_pairs_config.detach_arr_target(instance_id), then writes self.__config.to_file (integrations) beforeself.__path_pairs_config.to_file. If the process crashes between the two writes, the on-disk integrations file no longer references the instance but the path-pairs file still has dangling arr_target_ids pointing at it.
Suggested change: save in the order path_pairs → integrations. Persist path-pairs (with the detach applied) first, then write the integrations config. Same methods, same paths — just the order of the two to_file calls.
Risk if shipped without this: unclean shutdown leaves dangling references. The cross-validation in #426 will reject those path pairs on next load.
4. src/python/web/handler/path_pairs.py — wrap to_file in try/except
Around line 117-118. Same shape as #1: create / update / delete handlers all return HTTPResponse(... status=201) (or 204) without catching exceptions from self.__config.to_file(self.__path_pairs_path).
Suggested change: wrap each to_file call in try/except. On exception, log the error and return:
so the client gets a structured error instead of a stack trace.
Risk if shipped without this: same divergence pattern as #1 — UI thinks the change took, on-disk state says otherwise.
How to tackle
These are independent enough that they can land as separate PRs (one per file). Each needs:
The fix itself
A unit test for the failure mode (unittest.mock.patch on to_file to raise an exception, then assert the handler returns the right status and the in-memory state was not mutated)
Suggested order: #2 (config) first, since it has the largest blast radius (touches LFTP hot-reload). Then #3 (ordering — small, mechanical). Then #1 and #4 (mirror each other; could be one PR with shared test scaffolding).
Four related production-code findings surfaced during the CodeRabbit review of #467. None of them is causing user-facing bugs today, but each is a real correctness concern when persistence fails (disk full, permission error, process killed mid-write). Deferred from #468 to keep that PR test-only; tracking them here so each can land in its own PR with proper test coverage of the failure mode.
1.
src/python/web/handler/auto_queue.py— wrapto_filein try/exceptAround line 47-49. The handler mutates state via
add_pattern/remove_patternand then writes withself.__auto_queue_persist.to_file(self.__persist_path), but doesn't catch IO errors. A persistence failure bubbles up as an unhandled 500 with a stack trace.Suggested change: wrap the persistence call in
try/except, log the exception (including the error object), and return a controlled HTTP 5xx (e.g.HTTPResponse(status=500, body=\"failed to persist auto-queue\")). Apply to both endpoints that mutate state (add and remove pattern paths).Risk if shipped without this: disk-full on the config volume leaves the in-memory state ahead of the on-disk state. A subsequent restart silently reverts the change.
2.
src/python/web/handler/config.py— copy-then-write atomicityAround line 76-79.
set_propertymutatesself.__configbefore persistence. Ifto_filefails after the mutation, in-memory state diverges from disk — and__on_lftp_config_change()may have already fired with the half-applied change.Suggested change: create a copy of the config, apply
inner_config.set_property(key, value)on the copy, attempt to write to disk viaself.__config.to_file(self.__config_path)(already atomic via temp file + rename per #433), and only on write success replaceself.__configwith the updated copy and call__on_lftp_config_change(). Exceptions during write must prevent the in-memory swap and the LFTP callback.Risk if shipped without this: the LFTP hot-reload path (#433) can fire on a value that didn't make it to disk. After restart the LFTP runtime would reset to the on-disk value, contradicting what the UI showed.
3.
src/python/web/handler/integrations.py— persistence ordering on detachAround line 113-117. The delete-instance handler calls
self.__path_pairs_config.detach_arr_target(instance_id), then writesself.__config.to_file(integrations) beforeself.__path_pairs_config.to_file. If the process crashes between the two writes, the on-disk integrations file no longer references the instance but the path-pairs file still has danglingarr_target_idspointing at it.Suggested change: save in the order
path_pairs → integrations. Persist path-pairs (with the detach applied) first, then write the integrations config. Same methods, same paths — just the order of the twoto_filecalls.Risk if shipped without this: unclean shutdown leaves dangling references. The cross-validation in #426 will reject those path pairs on next load.
4.
src/python/web/handler/path_pairs.py— wrapto_filein try/exceptAround line 117-118. Same shape as #1: create / update / delete handlers all return
HTTPResponse(... status=201)(or 204) without catching exceptions fromself.__config.to_file(self.__path_pairs_path).Suggested change: wrap each
to_filecall intry/except. On exception, log the error and return:so the client gets a structured error instead of a stack trace.
Risk if shipped without this: same divergence pattern as #1 — UI thinks the change took, on-disk state says otherwise.
How to tackle
These are independent enough that they can land as separate PRs (one per file). Each needs:
unittest.mock.patchonto_fileto raise an exception, then assert the handler returns the right status and the in-memory state was not mutated)__on_lftp_config_changewas not calledSuggested order: #2 (config) first, since it has the largest blast radius (touches LFTP hot-reload). Then #3 (ordering — small, mechanical). Then #1 and #4 (mirror each other; could be one PR with shared test scaffolding).