Found during the post-#90 review pass on PR #93. Reuse/maintainability, not a bug.
Problem
src/forgather/ml/diloco/auth.py re-implements, nearly byte-for-byte, helpers that already exist in tools/dataset_server/auth.py:
write_standalone_token / standalone_token_file / diloco_tokens_dir — the os.open+fchmod+os.replace atomic-write dance and the 0700 parent-dir tightening.
url_is_local / url_port / read_standalone_token and the _LOCAL_HOSTS set (which also re-implements forgather.tls.policy.host_is_loopback, less correctly — the literal set misses 127.0.0.0/8 addresses).
peer_cert_authenticated mirrors tools/forgather_server/auth.py:_request_has_client_cert.
The stdlib-vs-FastAPI split justifies a separate verifier (verify_bearer/_send_401), but the transport-agnostic token-file I/O, URL-loopback classification, and peer-cert chain check have no framework dependency and now live in 2-3 places. A future hardening (e.g. fsync-before-replace, TOCTOU dir-mode tightening, SAN/CN allowlist on the cert check) has to be found and applied in each copy or they drift.
Fix
Extract a shared module (e.g. forgather.security.token_store parameterized by service subdir name, and reuse forgather.tls.policy.host_is_loopback). Have dataset_server, diloco, and inference_server all call it.
Minor adjacent hardening to fold in: the register audit record writes worker-supplied hostname verbatim — json.dumps neutralizes injection, but a length cap would prevent audit-log bloat from a malicious worker.
Ref: #90.
Found during the post-#90 review pass on PR #93. Reuse/maintainability, not a bug.
Problem
src/forgather/ml/diloco/auth.pyre-implements, nearly byte-for-byte, helpers that already exist intools/dataset_server/auth.py:write_standalone_token/standalone_token_file/diloco_tokens_dir— theos.open+fchmod+os.replaceatomic-write dance and the 0700 parent-dir tightening.url_is_local/url_port/read_standalone_tokenand the_LOCAL_HOSTSset (which also re-implementsforgather.tls.policy.host_is_loopback, less correctly — the literal set misses127.0.0.0/8addresses).peer_cert_authenticatedmirrorstools/forgather_server/auth.py:_request_has_client_cert.The stdlib-vs-FastAPI split justifies a separate verifier (
verify_bearer/_send_401), but the transport-agnostic token-file I/O, URL-loopback classification, and peer-cert chain check have no framework dependency and now live in 2-3 places. A future hardening (e.g. fsync-before-replace, TOCTOU dir-mode tightening, SAN/CN allowlist on the cert check) has to be found and applied in each copy or they drift.Fix
Extract a shared module (e.g.
forgather.security.token_storeparameterized by service subdir name, and reuseforgather.tls.policy.host_is_loopback). Have dataset_server, diloco, and inference_server all call it.Minor adjacent hardening to fold in: the register audit record writes worker-supplied
hostnameverbatim —json.dumpsneutralizes injection, but a length cap would prevent audit-log bloat from a malicious worker.Ref: #90.