Security hardening: debug-RCE, auth, denial-of-wallet, sim deadline, XSS, path traversal, CSP #694

Open
yocyber-code wants to merge 4 commits into 666ghj:main from yocyber-code:harden/critical-c1-c4

@yocyber-code

Summary

Hardens the backend + frontend against a set of issues found in a security review. Each fix is independently verified and the project still builds (npm run build + py_compile clean). Changes are backward-compatible via env gates; new variables are documented in .env.example / frontend/.env.example / README.

Critical

  • Werkzeug debug server / RCE: FLASK_DEBUG now defaults to false; production runs under gunicorn (npm run start), and the frontend is served from a static vite build. Config.validate() runs inside create_app() so the gunicorn path also enforces required config at boot.
  • No authentication — every /api/* route now requires an API key (X-API-Key / Authorization: Bearer), constant-time compare, AUTH_ENABLED fail-closed. The bundled UI sends the key from a build-time VITE_API_KEY (wired through the Docker build-arg). /health + CORS preflight exempt.
  • Denial-of-wallet: OASIS_DEFAULT_MAX_ROUNDS is now actually applied when the client omits max_rounds, plus hard ceilings OASIS_MAX_ROUNDS_CAP / OASIS_MAX_AGENTS_CAP, enforced end-to-end (runner forwards the clamped value to the subprocess).
  • Simulation can hang forever — every env.step (initial / round loop / interview, all 3 run scripts) is wrapped in asyncio.wait_for with a per-round timeout, plus a total wall-clock deadline; asyncio.gather(return_exceptions=True) + per-platform isolation so one platform's failure can't skip env.close().

High / Medium

  • Stored XSS — agent/LLM/report output rendered via v-html / innerHTML is now run through DOMPurify.sanitize (all sinks).
  • Wildcard CORS — origins now come from ALLOWED_ORIGINS (env) instead of *.
  • Log / error leakage — request bodies are no longer written to disk in production (log level follows FLASK_DEBUG); tracebacks are returned to clients only in debug, otherwise logged server-side (safe_traceback()).
  • Upload content sniffing — magic-byte check so a renamed binary can't pass the extension allowlist (PDF header / NUL-byte check, BOM-aware for UTF-16/32 text).
  • Path traversal: validate_id() guards every id→filesystem boundary (project / simulation / report dir-builders, runner run-state dir, report loggers) before join / makedirs / rmtree.
  • Missing CSP / security headers — CSP <meta> + vite preview headers (X-Frame-Options, X-Content-Type-Options, Referrer-Policy, frame-ancestors) + a Flask after_request that sets the same on API responses.

Notes

  • Tunables (all env, safe defaults): SECRET_KEY, API_KEY / AUTH_ENABLED, ALLOWED_ORIGINS, OASIS_*_CAP / OASIS_*_TIMEOUT_SEC, VITE_API_KEY.
  • Run production with a single gunicorn worker (-w 1 --threads N); simulation run-state is held in-process.
  • VITE_API_KEY baked into the client bundle is extractable by design — fine for single-host / internal / gateway-fronted deployments; for multi-tenant exposure replace with session auth.

🤖 Generated with Claude Code

Yo-LRK and others added 4 commits June 13, 2026 17:19
Closes the four CRITICAL findings from the 2026-06-13 review.

C1 — Werkzeug debug RCE / dev server in prod:
  - FLASK_DEBUG defaults False (kills interactive-debugger network RCE)
  - production runs gunicorn (-w 1 --threads 8) via `npm run start`; Dockerfile
    builds the frontend and serves it with `vite preview` (host 0.0.0.0); gunicorn
    + uv.lock updated
  - Config.validate() now runs inside create_app() so the gunicorn path enforces
    SECRET_KEY (prod) / API_KEY / LLM / ZEP at boot

C2 — zero auth on all /api/* routes:
  - before_request API-key guard (X-API-Key / Bearer), constant-time bytes compare,
    /health + OPTIONS exempt
  - AUTH_ENABLED fail-closed parse (only explicit false/0/no/off disables)
  - frontend axios injects X-API-Key from build-time VITE_API_KEY, wired through
    docker compose build-arg -> Dockerfile ARG -> vite build (+ frontend/.env.example)
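
For reviewers, the C2 guard logic can be sketched roughly as below. This is a framework-free illustration, not the code in this PR: the function names (`parse_auth_enabled`, `extract_api_key`, `is_authorized`) are hypothetical, and headers are modelled as a plain case-sensitive mapping, whereas a real Flask request object is case-insensitive.

```python
import hmac
from typing import Mapping, Optional

# Only these explicit tokens disable auth; anything else keeps it on.
_FALSE_VALUES = {"false", "0", "no", "off"}


def parse_auth_enabled(raw: Optional[str]) -> bool:
    """Fail closed: unset or unrecognised values leave auth enabled."""
    if raw is None:
        return True
    return raw.strip().lower() not in _FALSE_VALUES


def extract_api_key(headers: Mapping[str, str]) -> Optional[str]:
    """Read the key from X-API-Key, falling back to 'Authorization: Bearer'."""
    key = headers.get("X-API-Key")
    if key:
        return key
    scheme, _, token = headers.get("Authorization", "").partition(" ")
    if scheme.lower() == "bearer" and token.strip():
        return token.strip()
    return None


def is_authorized(method: str, path: str, headers: Mapping[str, str],
                  expected_key: str, auth_enabled: bool = True) -> bool:
    """Decide whether a request may proceed (hypothetical helper)."""
    if not auth_enabled:
        return True
    # CORS preflight and the health check are exempt.
    if method == "OPTIONS" or path == "/health":
        return True
    # A missing server-side key must never match an empty client key.
    if not expected_key:
        return False
    presented = extract_api_key(headers)
    if presented is None:
        return False
    # Constant-time comparison on bytes avoids a timing side channel.
    return hmac.compare_digest(presented.encode("utf-8"),
                               expected_key.encode("utf-8"))
```

In a Flask app a function like this would typically be called from a `before_request` hook that returns a 401 response when it yields `False`.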

C3 — denial-of-wallet (no cost ceiling; OASIS_DEFAULT_MAX_ROUNDS was dead config):
  - OASIS_DEFAULT_MAX_ROUNDS now applied when max_rounds omitted (default 150, covers
    the 144-round demo); hard ceilings OASIS_MAX_ROUNDS_CAP / OASIS_MAX_AGENTS_CAP;
    runner always forwards the clamped rounds to the subprocess
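
The C3 clamp amounts to "apply the default when the value is omitted, then bound it by the cap". A minimal sketch, assuming a hypothetical helper `resolve_max_rounds`; the default of 150 comes from this commit, while the cap of 200 used in the usage note is only an illustrative value:

```python
from typing import Optional


def resolve_max_rounds(requested: Optional[int], default_rounds: int, cap: int) -> int:
    """Return the round count to run: default if omitted, never above the cap.

    Hypothetical helper; the PR's actual function name and location may differ.
    """
    rounds = default_rounds if requested is None else requested
    # Lower bound of 1 is an assumption: a run of zero or negative rounds
    # is treated as a request for the minimum, not as an error.
    return max(1, min(rounds, cap))
```

For example, `resolve_max_rounds(None, 150, 200)` yields the default and `resolve_max_rounds(10_000, 150, 200)` is cut down to the cap. The same shape would apply to the agent-count ceiling, and the runner would forward the returned value rather than the raw client input.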

C4 — no simulation deadline; env.step could wedge forever:
  - every env.step (initial / round-loop / interview) wrapped in asyncio.wait_for
    (OASIS_ROUND_TIMEOUT_SEC) across all 3 run scripts; per-loop total-deadline
    (OASIS_RUN_TIMEOUT_SEC); gather(return_exceptions=True) + single-platform
    try/except so one platform's failure can't skip env.close
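
The C4 pattern (per-round timeout, total deadline, isolated platforms, guaranteed close) can be illustrated with a self-contained sketch. `DemoEnv`, `guarded_step`, `run_platform`, and `run_all` are hypothetical names for illustration only; the real run scripts wrap the OASIS environment and read the timeouts from OASIS_ROUND_TIMEOUT_SEC / OASIS_RUN_TIMEOUT_SEC.

```python
import asyncio
import time
from typing import Any, Awaitable, Callable, Dict


class DemoEnv:
    """Stand-in for a simulation environment (hypothetical)."""

    def __init__(self, step_delay: float) -> None:
        self.step_delay = step_delay
        self.closed = False

    async def step(self) -> None:
        await asyncio.sleep(self.step_delay)

    async def close(self) -> None:
        self.closed = True


async def guarded_step(step: Callable[[], Awaitable[Any]],
                       round_timeout: float, deadline: float) -> None:
    """Run one step, bounded by the per-round timeout and the total deadline."""
    remaining = deadline - time.monotonic()
    if remaining <= 0:
        raise asyncio.TimeoutError("total run deadline exceeded")
    await asyncio.wait_for(step(), timeout=min(round_timeout, remaining))


async def run_platform(env: DemoEnv, rounds: int,
                       round_timeout: float, run_timeout: float) -> str:
    deadline = time.monotonic() + run_timeout
    try:
        for _ in range(rounds):
            await guarded_step(env.step, round_timeout, deadline)
        return "ok"
    finally:
        # Runs on success, timeout, or any other failure.
        await env.close()


async def run_all(envs: Dict[str, DemoEnv], rounds: int,
                  round_timeout: float, run_timeout: float) -> Dict[str, Any]:
    names = list(envs)
    # return_exceptions=True keeps one platform's failure from
    # cancelling or masking the others.
    results = await asyncio.gather(
        *(run_platform(envs[n], rounds, round_timeout, run_timeout) for n in names),
        return_exceptions=True,
    )
    return dict(zip(names, results))
```

With one fast and one stuck environment, the stuck one surfaces as an `asyncio.TimeoutError` entry in the result while both environments are still closed.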

New env vars documented in .env.example + README security section.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
…leak)

H1 — stored XSS via v-html / innerHTML of unsanitized LLM/agent/interview/report
content: add DOMPurify; both renderMarkdown() now return DOMPurify.sanitize(html)
(Step4Report.vue, Step5Interaction.vue), and the formatAnswer innerHTML sink is
wrapped in DOMPurify.sanitize. All 8 HTML-injection sinks now sanitized; markdown
rendering preserved (DOMPurify secure defaults keep the md-* tags/classes).

H4 — wildcard CORS: CORS origins now Config.ALLOWED_ORIGINS (comma-separated env,
default localhost:3000) instead of '*'.
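
Parsing a comma-separated ALLOWED_ORIGINS value might look like the sketch below. The helper name `parse_allowed_origins` is hypothetical, and the `http://` scheme on the default is an assumption, since the commit only says "localhost:3000".

```python
from typing import List, Optional

# Assumed default; the commit states only "localhost:3000".
DEFAULT_ORIGIN = "http://localhost:3000"


def parse_allowed_origins(raw: Optional[str]) -> List[str]:
    """Split a comma-separated env value into a clean list of origins."""
    if not raw or not raw.strip():
        return [DEFAULT_ORIGIN]
    # Drop empty entries and trailing slashes so values compare cleanly
    # against the browser's Origin header, which never ends in '/'.
    origins = [part.strip().rstrip("/") for part in raw.split(",") if part.strip()]
    return origins or [DEFAULT_ORIGIN]
```

The resulting list is the kind of value a CORS layer expects in place of the previous `'*'`.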

H5 — request bodies written to disk in cleartext: logger file level now follows
FLASK_DEBUG (INFO in prod), so the before_request body-debug log is suppressed in
production. (traceback-in-response across 53 handlers deferred to a separate refactor.)

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
…, CSP)

H5 (traceback leak, completing the prior log-level fix): new app/utils/security.py
safe_traceback() logs the full stack server-side and returns it to clients only when
FLASK_DEBUG; all 53 traceback.format_exc() in api/{graph,report,simulation}.py now call
it (import traceback removed).
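
A rough sketch of the gating idea behind safe_traceback(), assuming it is called from inside an `except` block. The signature, the logger name, and the `None` return in production are assumptions; the real helper in app/utils/security.py may differ.

```python
import logging
import os
import traceback
from typing import Optional

logger = logging.getLogger("app.security")  # logger name is an assumption


def _debug_enabled() -> bool:
    """Treat FLASK_DEBUG as on only for explicit truthy values."""
    return os.environ.get("FLASK_DEBUG", "false").strip().lower() in {"1", "true", "yes", "on"}


def safe_traceback(debug: Optional[bool] = None) -> Optional[str]:
    """Log the current exception's stack; return it only in debug mode.

    Must be called while an exception is being handled.
    """
    tb = traceback.format_exc()
    # The full stack always goes to the server log.
    logger.error("Unhandled exception\n%s", tb)
    if debug is None:
        debug = _debug_enabled()
    # Clients see the stack only when debugging; None otherwise (assumption).
    return tb if debug else None
```

A handler would then place the return value in the response's traceback field instead of calling `traceback.format_exc()` directly.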

Upload content sniff: upload_content_ok() magic-byte check — pdf must start %PDF-,
txt/md/markdown rejected if they contain NUL bytes (BOM-prefixed UTF-16/32/8 text
allowed). Wired into the graph.py upload loop so a renamed binary can't pass the
extension whitelist.
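
The content sniff described above could be approximated as follows. This is a sketch under assumptions: the `(filename, data)` signature is hypothetical, and rejecting unknown extensions here is a choice of this example (the PR keeps a separate extension allowlist).

```python
import codecs

# Text that starts with one of these BOMs may legitimately contain NUL bytes.
_BOMS = (
    codecs.BOM_UTF8,
    codecs.BOM_UTF32_LE,
    codecs.BOM_UTF32_BE,
    codecs.BOM_UTF16_LE,
    codecs.BOM_UTF16_BE,
)
_TEXT_EXTS = {"txt", "md", "markdown"}


def upload_content_ok(filename: str, data: bytes) -> bool:
    """Check that the file's bytes are plausible for its extension."""
    ext = filename.rsplit(".", 1)[-1].lower() if "." in filename else ""
    if ext == "pdf":
        # Every PDF starts with the '%PDF-' magic bytes.
        return data.startswith(b"%PDF-")
    if ext in _TEXT_EXTS:
        if data.startswith(_BOMS):
            return True
        # Plain 8-bit text should not contain NUL; binaries almost always do.
        return b"\x00" not in data
    return False
```

A renamed executable such as `notes.md` containing ELF bytes fails the NUL check, while UTF-16 text with a BOM still passes.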

Path validation: validate_id() (^[A-Za-z0-9_-]{1,64}$) blocks traversal before every
id->filesystem sink — ProjectManager._get_project_dir, SimulationManager._get_simulation_dir,
ReportManager._get_report_folder + the two Report*Logger __init__, and a new
SimulationRunner._run_dir() that all RUN_STATE_DIR joins route through.
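
A minimal sketch of the id check using the regex quoted above. The `project_dir` helper and the choice of `ValueError` are assumptions for illustration; the sketch uses `fullmatch` because with `re.match` the trailing `$` would also accept an id ending in a newline.

```python
import os
import re

_ID_RE = re.compile(r"^[A-Za-z0-9_-]{1,64}$")


def validate_id(value: object) -> str:
    """Return the id unchanged if it is safe to use as a path component."""
    # fullmatch rejects 'abc\n', which '$' alone would let through.
    if not isinstance(value, str) or not _ID_RE.fullmatch(value):
        raise ValueError("invalid id")
    return value


def project_dir(base: str, project_id: str) -> str:
    """Hypothetical dir-builder: validate before the id touches the filesystem."""
    return os.path.join(base, validate_id(project_id))
```

Because `/`, `.`, and `\` are outside the allowed character set, values such as `../etc` or `a/b` are rejected before any join, makedirs, or rmtree call can see them.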

CSP / security headers: CSP <meta> in index.html, vite preview.headers
(X-Frame-Options/nosniff/Referrer-Policy + CSP frame-ancestors), and a Flask
after_request that sets the same headers on API responses.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
Follow-up to the traceback gate: client-facing exception detail is now gated too.

- safe_error(e) (app/utils/security.py): returns the exception string only when
  FLASK_DEBUG, otherwise a generic message. Full detail stays in server logs.
- All catch-all `except Exception` handlers in api/{graph,report,simulation}.py now
  return safe_error(e) instead of str(e); same for the persisted error fields
  (project.error / state.error / task fail messages). Typed `except ValueError`
  validation handlers (404/400) keep str(e) — those are intentional, actionable
  user messages that echo only user-supplied ids.
- graph_builder no longer puts a full traceback into the task error (logs it
  server-side with exc_info, surfaces safe_error to the client); the batch-failure
  progress message is gated too.
- README documents the CSP connect-src ↔ VITE_API_BASE_URL coupling.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>