Skip to content

Model downloader manager#14657

Open
Talmaj wants to merge 2 commits into
masterfrom
model_downloader
Open

Model downloader manager#14657
Talmaj wants to merge 2 commits into
masterfrom
model_downloader

Conversation

@Talmaj

@Talmaj Talmaj commented Jun 27, 2026

Copy link
Copy Markdown
Contributor

@coderabbitai

coderabbitai Bot commented Jun 27, 2026

Copy link
Copy Markdown
📝 Walkthrough

Walkthrough

This PR introduces a full server-side model download manager for ComfyUI. It adds an Alembic migration and SQLAlchemy ORM models for downloads, download_segments, and host_credentials tables. A security layer provides SSRF protection via a custom DNS resolver, URL allowlisting, and model path containment checks. A credential subsystem stores per-host API keys and resolves them per redirect hop over HTTPS only. The HTTP networking layer follows redirects manually, reattaching credentials at each hop. A DownloadJob engine handles segmented and single-stream transfers with resume, pause, cancel, and SHA-256/structural verification. A Scheduler manages priority-based job admission, backoff, and retry. A DownloadManager facade exposes enqueue, control, and status APIs, wired into aiohttp routes under /api/download and pushed to clients via websocket. CLI arguments and a feature flag complete the integration.

🚥 Pre-merge checks | ✅ 4 | ❌ 1

❌ Failed checks (1 warning)

Check name Status Explanation Resolution
Docstring Coverage ⚠️ Warning Docstring coverage is 21.24% which is insufficient. The required threshold is 80.00%. Write docstrings for the functions missing them to satisfy the coverage threshold.
✅ Passed checks (4 passed)
Check name Status Explanation
Title check ✅ Passed The title clearly summarizes the main change: adding a model downloader manager.
Description check ✅ Passed The description is related to the changeset because it points to the model download manager PRD.
Linked Issues check ✅ Passed Check skipped because no linked issues were found for this pull request.
Out of Scope Changes check ✅ Passed Check skipped because no linked issues were found for this pull request.

Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out.

❤️ Share

Comment @coderabbitai help to get the list of available commands.

@coderabbitai coderabbitai Bot left a comment

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 16

Note

Due to the large number of review comments, Critical, Major severity comments were prioritized as inline comments.

🟡 Minor comments (5)
app/model_downloader/engine/job.py-25-29 (1)

25-29: 📐 Maintainability & Code Quality | 🟡 Minor | ⚡ Quick win

Remove the unused SegmentPlan import to get Ruff green again.

Line 26 is currently failing the lint job.

🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@app/model_downloader/engine/job.py` around lines 25 - 29, Remove the unused
SegmentPlan import from the app.model_downloader.engine.planner import block in
job.py so Ruff stops flagging the file; keep the remaining planner imports
(effective_segment_count and plan_segments) unchanged and verify no other
references to SegmentPlan exist in this module.

Source: Pipeline failures

app/model_downloader/engine/planner.py-40-41 (1)

40-41: 🎯 Functional Correctness | 🟡 Minor | ⚡ Quick win

Don't synthesize a 1-byte segment for empty files.

When total_bytes == 0, this returns start=0, end=0, so downstream code sees a byte that does not exist. That turns a valid empty payload into a bogus bytes=0-0 range plan. Return no segments for zero-length files instead of inventing one.

Possible fix
 def plan_segments(total_bytes: int, num_segments: int) -> list[SegmentPlan]:
-    if total_bytes <= 0 or num_segments <= 1:
-        return [SegmentPlan(idx=0, start=0, end=max(0, total_bytes - 1))]
+    if total_bytes <= 0:
+        return []
+    if num_segments <= 1:
+        return [SegmentPlan(idx=0, start=0, end=total_bytes - 1)]
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@app/model_downloader/engine/planner.py` around lines 40 - 41, Update the
segment planning logic in the planner function so zero-length inputs do not
produce a fake byte range. In the segment plan branch that currently handles
total_bytes and num_segments, return an empty list when total_bytes is 0, and
keep the single-segment fallback only for non-empty files. Use the existing
SegmentPlan and planner logic to ensure empty payloads produce no segments
instead of a bytes=0-0 plan.
app/model_downloader/security/allowlist.py-23-33 (1)

23-33: 🎯 Functional Correctness | 🟡 Minor | ⚡ Quick win

Keep IPv6 loopback consistent with the localhost allowlist.

LOOPBACK_HOSTS includes ::1, but _DEFAULT_ALLOWED_HOSTS does not, so http://[::1]/model.safetensors is rejected even though this file says loopback localhost downloads are intentionally supported.

Suggested fix
 _DEFAULT_ALLOWED_HOSTS: dict[str, set[str]] = {
     "huggingface.co": {"https"},
     "civitai.com": {"https"},
     "localhost": {"http", "https"},
     "127.0.0.1": {"http", "https"},
+    "::1": {"http", "https"},
 }
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@app/model_downloader/security/allowlist.py` around lines 23 - 33,
`allowlist.py` has an inconsistency between `LOOPBACK_HOSTS` and
`_DEFAULT_ALLOWED_HOSTS`: `::1` is treated as a permitted loopback host but is
missing from the default allowlist. Update `_DEFAULT_ALLOWED_HOSTS` in the
allowlist module to include `::1` with the same supported schemes as the other
local hosts so `http://[::1]/...` is accepted consistently with the localhost
download behavior.
tests-unit/model_downloader_test/test_engine_integration.py-116-120 (1)

116-120: 🎯 Functional Correctness | 🟡 Minor | ⚡ Quick win

Use structurally valid safetensors payloads, or switch these transfer tests to .ckpt.

These tests download arbitrary bytes into .safetensors destinations, but structural.validate() now rejects invalid safetensors files. If the tests only cover transfer/resume/cancel behavior, use a non-structural extension consistently for the URL and model_id.

Also applies to: 143-147, 171-183, 205-209, 241-244

🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@tests-unit/model_downloader_test/test_engine_integration.py` around lines 116
- 120, The transfer tests in test_engine_integration.py are downloading
arbitrary bytes into .safetensors targets, which now fail structural validation.
Update the affected cases around the DownloadJob setup and _insert calls to
either use structurally valid safetensors payloads, or change the test data
consistently to a non-structural extension like .ckpt for the URL, model_id, and
destination paths so the tests still cover transfer/resume/cancel behavior
without triggering structural.validate().
tests-unit/model_downloader_test/conftest.py-22-23 (1)

22-23: 🔒 Security & Privacy | 🟡 Minor | ⚡ Quick win

Avoid mktemp() for the SQLite test DB path.

tempfile.mktemp() leaves a race window before SQLite creates the file. Use a securely-created temp file and pass that path to database_url.

Proposed fix
-    db_path = tempfile.mktemp(suffix="-dlmgr-test.sqlite3")
+    fd, db_path = tempfile.mkstemp(suffix="-dlmgr-test.sqlite3")
+    os.close(fd)
     args.database_url = f"sqlite:///{db_path}"
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@tests-unit/model_downloader_test/conftest.py` around lines 22 - 23, The
SQLite test database path in the model_downloader test setup currently uses
tempfile.mktemp(), which creates a race window before the database file exists.
Update the conftest.py fixture/setup around args.database_url to use a securely
created temporary file or temporary directory path instead, then build the
sqlite:/// URL from that actual path. Keep the change localized to the test DB
setup and the db_path/database_url assignment.

Source: Linters/SAST tools

🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

Inline comments:
In `@app/model_downloader/credentials/resolver.py`:
- Around line 34-40: `apply_to_url()` is rebuilding the full query string
instead of preserving the original URL’s existing query intact. Update
`apply_to_url` in `resolver.py` so it only appends the credentials from
`self.query` to the existing `parts.query` without parsing and re-encoding prior
parameters, keeping repeated keys and original encoding unchanged. Use
`urlsplit`, `urlunsplit`, and the `apply_to_url`/`self.query` logic as the
touchpoints when fixing the query merge behavior.

In `@app/model_downloader/credentials/store.py`:
- Around line 25-38: The normalize_host function is incorrectly treating pasted
full URLs as host:port values, causing scheme strings like https to be persisted
as the host key. Update normalize_host in store.py to detect full URLs and
reject them or return empty before the host:port splitting logic runs, so only
real hosts are normalized. Keep the existing IDNA and IPv6 handling for valid
host inputs, but ensure normalize_host does not collapse URL strings into
scheme-only values.
- Around line 111-115: The credentials payload in store.py leaks short secrets
because secret_last4 falls back to the entire secret when its length is 4 or
less. Update the secret handling in the store logic so that secret_last4 never
contains the full token for short values; instead, use a non-revealing
placeholder or an empty value while keeping the behavior for longer secrets
unchanged. Make the fix in the code that builds the values dict for credential
storage so the API can safely list credentials without exposing write-only
secrets.

In `@app/model_downloader/database/models.py`:
- Around line 69-70: The `credential_id` field on the download model is only a
raw scalar today, so it can dangle after a credential is removed. Update the
model around `credential_id` to use a real foreign key/relationship to
`host_credentials.id`, and make the delete behavior enforce integrity so
referenced credentials cannot be removed while still in use. Keep the change
localized to the download model class and its association to the host credential
model.

In `@app/model_downloader/database/queries.py`:
- Around line 205-222: The host credential write path in the upsert logic is
doing a separate read then insert/update, which can race for the same host.
Update the HostCredential flow in the query helper to use a single atomic
database-level upsert keyed by host, or add a unique-conflict retry/merge path,
so concurrent calls converge safely instead of creating duplicates or failing on
constraint errors.

In `@app/model_downloader/engine/job.py`:
- Around line 431-441: The _set_status method in Job is leaving stale failure
text behind because it only updates state.error and the persisted fields when a
new error is provided. Update _set_status to explicitly clear error when
transitioning out of a failure state, and ensure queries.update_download
receives a NULL/clearing value so RuntimeState and stored status are reset on
retry or success. If update_download currently ignores None, add the matching
path there as well so DownloadStatus transitions can remove old error messages
reliably.

In `@app/model_downloader/engine/writer.py`:
- Around line 46-50: The write_at method in Writer currently ignores the return
value from os.pwrite, so partial writes can go unnoticed and leave the .part
file corrupted or incomplete. Update write_at to check the byte count returned
by os.pwrite and verify it matches len(data); if fewer bytes are written, handle
it explicitly by raising an error or otherwise retrying until the full buffer is
written. Keep the fix localized to Writer.write_at so callers continue using the
same API.

In `@app/model_downloader/manager.py`:
- Around line 75-105: The duplicate-download check in the enqueue path is racy
because `resolve_existing()`, `_has_live_download()`, and
`queries.insert_download()` are separate operations in `manager.py`. Move the
reservation into a single atomic database step in the `insert_download` flow (or
a transaction) so concurrent requests for the same `model_id` cannot both
succeed. Use conflict handling on the unique download target/key to reject
duplicates, and keep `DownloadError` mapping for `ALREADY_AVAILABLE` and
`ALREADY_DOWNLOADING` in the `manager` enqueue method.

In `@app/model_downloader/net/http.py`:
- Around line 74-76: The redirect handling in Http response processing is
incorrectly awaiting a synchronous release call, which will fail on the first
redirect. Update the redirect path in the response handling logic to call
ClientResponse.release() directly without await, and make the same change in the
cleanup path so the Http downloader consistently uses resp.release() in the
relevant flow.

In `@app/model_downloader/net/probe.py`:
- Around line 88-90: The probe failure handling in probe() is leaking sensitive
data by logging the full url and copying raw exception text into
ProbeResult.error. Update the exception path to avoid emitting the full URL or
str(e); instead log only a sanitized identifier or host-safe summary, and set
ProbeResult.error to a generic failure message without raw exception details.

In `@app/model_downloader/security/ssrf.py`:
- Line 35: The redirect scheme allowlist in ssrf.py is too broad because
ALLOWED_SCHEMES currently permits plain http for all hosts; tighten the logic in
the redirect/URL validation path so https remains required for external
destinations, and only allow http when the target host is explicitly loopback or
other approved dev/test hosts. Update the relevant scheme check and any helper
used by the downloader security flow so the restriction is enforced consistently
across the referenced redirect handling code.
- Around line 74-85: Reject blocked IP-literal redirect targets directly in
check_redirect_hop(), since ValidatingResolver only filters DNS results and can
be bypassed by literal hosts like 127.0.0.1. Update the redirect-hop validation
logic to detect IP hosts before following the redirect and reuse the existing
blocked-IP checks used by ValidatingResolver/is_blocked_ip, while still allowing
the explicit loopback allowlist behavior where intended.

In `@app/model_downloader/verify/structural.py`:
- Around line 44-58: Normalize all malformed safetensors header cases in the
structural check by raising StructuralError from the verify logic in
structural.py. In the header parsing/validation path around the existing
json.loads call and the loop over header.items(), first ensure the decoded
header is a dict before iterating, then validate each entry’s data_offsets as a
two-item list of integers with non-negative, strictly increasing bounds. Convert
any non-dict header, non-integer offsets, negative values, or reversed ranges
into a clear StructuralError instead of letting AttributeError/ValueError escape
or silently accepting bad offsets.

In `@comfy/cli_args.py`:
- Around line 247-250: Reject non-positive values at the CLI boundary for the
downloader limit flags in cli_args.py. Update the argparse definitions for
--download-segments, --download-max-active, --download-max-connections-per-host,
and --download-chunk-size to use a shared positive-integer validator instead of
plain int so values like 0 or negatives are rejected before reaching the
downloader. Add or reuse a helper such as _positive_int near the parser setup
and keep the existing defaults unchanged.

In `@openapi.yaml`:
- Around line 2528-2530: The OpenAPI spec is using the download tag on
list-downloads operations, but the tag is not declared in the root tags
collection. Update the top-level tags definition in openapi.yaml to include a
global download tag entry, keeping it consistent with the other tag declarations
so the lint step passes.

In `@server.py`:
- Around line 1193-1199: The _notify helper inside the download progress flow is
broadcasting the full status_sync() view, including the raw download url, to
every websocket client via send_sync with no session scoping. Update the
download notification path to either redact/remove the url field from the view
before calling send_sync or limit delivery to the initiating session, using the
_notify, DOWNLOAD_MANAGER.status_sync, and self.send_sync logic to locate the
change.

---

Minor comments:
In `@app/model_downloader/engine/job.py`:
- Around line 25-29: Remove the unused SegmentPlan import from the
app.model_downloader.engine.planner import block in job.py so Ruff stops
flagging the file; keep the remaining planner imports (effective_segment_count
and plan_segments) unchanged and verify no other references to SegmentPlan exist
in this module.

In `@app/model_downloader/engine/planner.py`:
- Around line 40-41: Update the segment planning logic in the planner function
so zero-length inputs do not produce a fake byte range. In the segment plan
branch that currently handles total_bytes and num_segments, return an empty list
when total_bytes is 0, and keep the single-segment fallback only for non-empty
files. Use the existing SegmentPlan and planner logic to ensure empty payloads
produce no segments instead of a bytes=0-0 plan.

In `@app/model_downloader/security/allowlist.py`:
- Around line 23-33: `allowlist.py` has an inconsistency between
`LOOPBACK_HOSTS` and `_DEFAULT_ALLOWED_HOSTS`: `::1` is treated as a permitted
loopback host but is missing from the default allowlist. Update
`_DEFAULT_ALLOWED_HOSTS` in the allowlist module to include `::1` with the same
supported schemes as the other local hosts so `http://[::1]/...` is accepted
consistently with the localhost download behavior.

In `@tests-unit/model_downloader_test/conftest.py`:
- Around line 22-23: The SQLite test database path in the model_downloader test
setup currently uses tempfile.mktemp(), which creates a race window before the
database file exists. Update the conftest.py fixture/setup around
args.database_url to use a securely created temporary file or temporary
directory path instead, then build the sqlite:/// URL from that actual path.
Keep the change localized to the test DB setup and the db_path/database_url
assignment.

In `@tests-unit/model_downloader_test/test_engine_integration.py`:
- Around line 116-120: The transfer tests in test_engine_integration.py are
downloading arbitrary bytes into .safetensors targets, which now fail structural
validation. Update the affected cases around the DownloadJob setup and _insert
calls to either use structurally valid safetensors payloads, or change the test
data consistently to a non-structural extension like .ckpt for the URL,
model_id, and destination paths so the tests still cover transfer/resume/cancel
behavior without triggering structural.validate().
🪄 Autofix (Beta)

Fix all unresolved CodeRabbit comments on this PR:

  • Push a commit to this branch (recommended)
  • Create a new PR with the fixes

ℹ️ Review info
⚙️ Run configuration

Configuration used: Path: .coderabbit.yaml

Review profile: CHILL

Plan: Pro Plus

Run ID: 29f6940c-057d-4cff-b8d9-8c204e0a70d6

📥 Commits

Reviewing files that changed from the base of the PR and between 603d891 and 8d18675.

📒 Files selected for processing (33)
  • alembic_db/versions/0005_download_manager.py
  • app/database/db.py
  • app/model_downloader/api/routes.py
  • app/model_downloader/api/schemas_in.py
  • app/model_downloader/api/schemas_out.py
  • app/model_downloader/constants.py
  • app/model_downloader/credentials/resolver.py
  • app/model_downloader/credentials/store.py
  • app/model_downloader/database/models.py
  • app/model_downloader/database/queries.py
  • app/model_downloader/engine/job.py
  • app/model_downloader/engine/planner.py
  • app/model_downloader/engine/writer.py
  • app/model_downloader/manager.py
  • app/model_downloader/net/http.py
  • app/model_downloader/net/probe.py
  • app/model_downloader/net/session.py
  • app/model_downloader/scheduler.py
  • app/model_downloader/security/allowlist.py
  • app/model_downloader/security/paths.py
  • app/model_downloader/security/ssrf.py
  • app/model_downloader/verify/checksum.py
  • app/model_downloader/verify/dedup.py
  • app/model_downloader/verify/structural.py
  • comfy/cli_args.py
  • comfy_api/feature_flags.py
  • openapi.yaml
  • server.py
  • tests-unit/model_downloader_test/conftest.py
  • tests-unit/model_downloader_test/test_credentials.py
  • tests-unit/model_downloader_test/test_engine_integration.py
  • tests-unit/model_downloader_test/test_planner_structural.py
  • tests-unit/model_downloader_test/test_security.py

Comment on lines +34 to +40
def apply_to_url(self, url: str) -> str:
if not self.query:
return url
parts = urlsplit(url)
params = dict(parse_qsl(parts.query, keep_blank_values=True))
params.update(self.query)
return urlunsplit(parts._replace(query=urlencode(params)))

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🎯 Functional Correctness | 🟠 Major | ⚡ Quick win

Don't rewrite the original query string here.

apply_to_url() currently round-trips the existing query through parse_qsl() + urlencode(). That drops repeated keys and normalizes encoding, so an already-signed download URL can be mutated into a different request before the auth param is added.

Possible fix
     def apply_to_url(self, url: str) -> str:
         if not self.query:
             return url
         parts = urlsplit(url)
-        params = dict(parse_qsl(parts.query, keep_blank_values=True))
-        params.update(self.query)
-        return urlunsplit(parts._replace(query=urlencode(params)))
+        extra_query = urlencode(self.query)
+        query = f"{parts.query}&{extra_query}" if parts.query else extra_query
+        return urlunsplit(parts._replace(query=query))
📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

Suggested change
def apply_to_url(self, url: str) -> str:
if not self.query:
return url
parts = urlsplit(url)
params = dict(parse_qsl(parts.query, keep_blank_values=True))
params.update(self.query)
return urlunsplit(parts._replace(query=urlencode(params)))
def apply_to_url(self, url: str) -> str:
if not self.query:
return url
parts = urlsplit(url)
extra_query = urlencode(self.query)
query = f"{parts.query}&{extra_query}" if parts.query else extra_query
return urlunsplit(parts._replace(query=query))
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@app/model_downloader/credentials/resolver.py` around lines 34 - 40,
`apply_to_url()` is rebuilding the full query string instead of preserving the
original URL’s existing query intact. Update `apply_to_url` in `resolver.py` so
it only appends the credentials from `self.query` to the existing `parts.query`
without parsing and re-encoding prior parameters, keeping repeated keys and
original encoding unchanged. Use `urlsplit`, `urlunsplit`, and the
`apply_to_url`/`self.query` logic as the touchpoints when fixing the query merge
behavior.

Comment on lines +25 to +38
def normalize_host(host: str) -> str:
"""Lowercase, strip port, IDNA-encode (PRD section 9.4.3)."""
if not host:
return ""
host = host.strip().lower()
if host.startswith("[") and "]" in host: # bracketed IPv6 literal
host = host[1 : host.index("]")]
elif host.count(":") == 1: # host:port (not IPv6)
host = host.split(":", 1)[0]
try:
host = host.encode("idna").decode("ascii")
except (UnicodeError, ValueError):
pass
return host

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🗄️ Data Integrity & Integration | 🟠 Major | ⚡ Quick win

Reject full URLs here instead of collapsing them to "http"/"https".

normalize_host("https://huggingface.co/foo") currently returns "https" because the host:port branch splits on the scheme colon. That makes pasted URLs overwrite the same persisted host key and guarantees the credential will never match the real host.

🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@app/model_downloader/credentials/store.py` around lines 25 - 38, The
normalize_host function is incorrectly treating pasted full URLs as host:port
values, causing scheme strings like https to be persisted as the host key.
Update normalize_host in store.py to detect full URLs and reject them or return
empty before the host:port splitting logic runs, so only real hosts are
normalized. Keep the existing IDNA and IPv6 handling for valid host inputs, but
ensure normalize_host does not collapse URL strings into scheme-only values.

Comment on lines +111 to +115
values = {
"host": host,
"secret": secret,
"secret_last4": secret[-4:] if len(secret) >= 4 else secret,
"auth_scheme": auth_scheme,

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🔒 Security & Privacy | 🟠 Major | ⚡ Quick win

Don't expose the full secret for short tokens.

For secrets with length <= 4, secret_last4 currently stores the entire secret. That breaks the write-only guarantee as soon as credentials are listed back through the API.

Suggested fix
-            "secret_last4": secret[-4:] if len(secret) >= 4 else secret,
+            "secret_last4": secret[-4:] if len(secret) > 4 else None,
📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

Suggested change
values = {
"host": host,
"secret": secret,
"secret_last4": secret[-4:] if len(secret) >= 4 else secret,
"auth_scheme": auth_scheme,
values = {
"host": host,
"secret": secret,
"secret_last4": secret[-4:] if len(secret) > 4 else None,
"auth_scheme": auth_scheme,
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@app/model_downloader/credentials/store.py` around lines 111 - 115, The
credentials payload in store.py leaks short secrets because secret_last4 falls
back to the entire secret when its length is 4 or less. Update the secret
handling in the store logic so that secret_last4 never contains the full token
for short values; instead, use a non-revealing placeholder or an empty value
while keeping the behavior for longer secrets unchanged. Make the fix in the
code that builds the values dict for credential storage so the API can safely
list credentials without exposing write-only secrets.

Comment on lines +69 to +70
# Explicit credential override; otherwise auto-resolved by host.
credential_id: Mapped[str | None] = mapped_column(String(36), nullable=True)

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🗄️ Data Integrity & Integration | 🟠 Major | 🏗️ Heavy lift

Protect credential_id with an actual DB relationship.

downloads.credential_id is persisted, but nothing ties it to host_credentials.id. If a user deletes a stored credential while a paused/queued download still references it, resume/retry will keep the dangling id and silently drop auth instead of failing at delete time.

🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@app/model_downloader/database/models.py` around lines 69 - 70, The
`credential_id` field on the download model is only a raw scalar today, so it
can dangle after a credential is removed. Update the model around
`credential_id` to use a real foreign key/relationship to `host_credentials.id`,
and make the delete behavior enforce integrity so referenced credentials cannot
be removed while still in use. Keep the change localized to the download model
class and its association to the host credential model.

Comment on lines +205 to +222
with create_session() as session:
row = (
session.execute(
select(HostCredential).where(HostCredential.host == host).limit(1)
)
.scalars()
.first()
)
if row is None:
row = HostCredential(**values)
row.created_at = now
row.updated_at = now
session.add(row)
else:
for key, value in values.items():
setattr(row, key, value)
row.updated_at = now
session.commit()

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🗄️ Data Integrity & Integration | 🟠 Major | 🏗️ Heavy lift

Make the host upsert atomic.

This is a read-then-write sequence keyed by host. Two concurrent requests for the same host can both miss the SELECT; one then either inserts a duplicate credential or blows up on the unique constraint. Please collapse this into a single DB-level upsert (or retry on unique-conflict) so concurrent writes converge cleanly.

🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@app/model_downloader/database/queries.py` around lines 205 - 222, The host
credential write path in the upsert logic is doing a separate read then
insert/update, which can race for the same host. Update the HostCredential flow
in the query helper to use a single atomic database-level upsert keyed by host,
or add a unique-conflict retry/merge path, so concurrent calls converge safely
instead of creating duplicates or failing on constraint errors.

Comment on lines +74 to +85
async def resolve(self, host, port=0, family=socket.AF_INET):
infos = await self._inner.resolve(host, port, family)
# localhost/127.0.0.1 are an explicit, opt-in allowlist feature.
if isinstance(host, str) and host.lower() in LOOPBACK_HOSTS:
return infos
safe = [info for info in infos if not is_blocked_ip(info["host"])]
if not safe:
raise OSError(
f"refusing to connect to {host!r}: resolves only to "
f"private/special-use addresses"
)
return safe

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🔒 Security & Privacy | 🔴 Critical | ⚡ Quick win

🧩 Analysis chain

🏁 Script executed:

#!/bin/bash
# Verify aiohttp connector behavior for IP literals.
python - <<'PY'
import inspect
import aiohttp.connector
print(inspect.getsource(aiohttp.connector.TCPConnector._resolve_host))
PY

Repository: Comfy-Org/ComfyUI

Length of output: 3666


🏁 Script executed:

#!/bin/bash
set -euo pipefail

# Inspect the SSRF helper and redirect handling around the referenced lines.
git ls-files app/model_downloader/security/ssrf.py
wc -l app/model_downloader/security/ssrf.py
cat -n app/model_downloader/security/ssrf.py | sed -n '1,220p'

printf '\n--- redirect-related references ---\n'
rg -n "check_redirect_hop|redirect|SSRFError|is_blocked_ip|LOOPBACK_HOSTS" app/model_downloader/security -n

Repository: Comfy-Org/ComfyUI

Length of output: 7267


🏁 Script executed:

#!/bin/bash
set -euo pipefail

# Inspect redirect callers and allowlist helpers.
wc -l app/model_downloader/security/allowlist.py
cat -n app/model_downloader/security/allowlist.py | sed -n '1,220p'

printf '\n--- check_redirect_hop callers ---\n'
rg -n "check_redirect_hop\(" -S app

printf '\n--- URL / redirect handling around model downloaders ---\n'
rg -n "redirect|Location|allowlist|SSRFError|ValidatingResolver" app/model_downloader -S

Repository: Comfy-Org/ComfyUI

Length of output: 9174


🏁 Script executed:

#!/bin/bash
set -euo pipefail

# Inspect the manual redirect-following loop that calls check_redirect_hop.
cat -n app/model_downloader/net/http.py | sed -n '1,140p'

Repository: Comfy-Org/ComfyUI

Length of output: 4675


Reject blocked IP-literal redirect targets in check_redirect_hop(). ValidatingResolver is bypassed for IP literals, so a redirect to http://127.0.0.1:... can still reach loopback/private addresses unless the hop validator rejects blocked IP hosts itself.

🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@app/model_downloader/security/ssrf.py` around lines 74 - 85, Reject blocked
IP-literal redirect targets directly in check_redirect_hop(), since
ValidatingResolver only filters DNS results and can be bypassed by literal hosts
like 127.0.0.1. Update the redirect-hop validation logic to detect IP hosts
before following the redirect and reuse the existing blocked-IP checks used by
ValidatingResolver/is_blocked_ip, while still allowing the explicit loopback
allowlist behavior where intended.

Comment on lines +44 to +58
try:
header = json.loads(f.read(header_len).decode("utf-8"))
except (UnicodeDecodeError, json.JSONDecodeError) as e:
raise StructuralError(f"safetensors header is not valid JSON: {e}") from e

data_len = 0
for name, entry in header.items():
if name == "__metadata__":
continue
if not isinstance(entry, dict) or "data_offsets" not in entry:
raise StructuralError(f"tensor {name!r} missing data_offsets")
offsets = entry["data_offsets"]
if not (isinstance(offsets, list) and len(offsets) == 2):
raise StructuralError(f"tensor {name!r} has malformed data_offsets")
data_len = max(data_len, int(offsets[1]))

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🩺 Stability & Availability | 🟠 Major | ⚡ Quick win

Normalize malformed safetensors headers into StructuralError.

A downloaded header can be valid JSON but not a dict, or contain non-integer / negative / reversed data_offsets; those currently leak AttributeError/ValueError or accept invalid offsets instead of failing the structural check cleanly.

Proposed fix
         try:
             header = json.loads(f.read(header_len).decode("utf-8"))
         except (UnicodeDecodeError, json.JSONDecodeError) as e:
             raise StructuralError(f"safetensors header is not valid JSON: {e}") from e
+        if not isinstance(header, dict):
+            raise StructuralError("safetensors header must be a JSON object")
 
     data_len = 0
     for name, entry in header.items():
         if name == "__metadata__":
             continue
@@
         offsets = entry["data_offsets"]
         if not (isinstance(offsets, list) and len(offsets) == 2):
             raise StructuralError(f"tensor {name!r} has malformed data_offsets")
-        data_len = max(data_len, int(offsets[1]))
+        try:
+            start, end = int(offsets[0]), int(offsets[1])
+        except (TypeError, ValueError) as e:
+            raise StructuralError(f"tensor {name!r} has non-integer data_offsets") from e
+        if start < 0 or end < start:
+            raise StructuralError(f"tensor {name!r} has invalid data_offsets")
+        data_len = max(data_len, end)
📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

Suggested change
try:
header = json.loads(f.read(header_len).decode("utf-8"))
except (UnicodeDecodeError, json.JSONDecodeError) as e:
raise StructuralError(f"safetensors header is not valid JSON: {e}") from e
data_len = 0
for name, entry in header.items():
if name == "__metadata__":
continue
if not isinstance(entry, dict) or "data_offsets" not in entry:
raise StructuralError(f"tensor {name!r} missing data_offsets")
offsets = entry["data_offsets"]
if not (isinstance(offsets, list) and len(offsets) == 2):
raise StructuralError(f"tensor {name!r} has malformed data_offsets")
data_len = max(data_len, int(offsets[1]))
try:
header = json.loads(f.read(header_len).decode("utf-8"))
except (UnicodeDecodeError, json.JSONDecodeError) as e:
raise StructuralError(f"safetensors header is not valid JSON: {e}") from e
if not isinstance(header, dict):
raise StructuralError("safetensors header must be a JSON object")
data_len = 0
for name, entry in header.items():
if name == "__metadata__":
continue
if not isinstance(entry, dict) or "data_offsets" not in entry:
raise StructuralError(f"tensor {name!r} missing data_offsets")
offsets = entry["data_offsets"]
if not (isinstance(offsets, list) and len(offsets) == 2):
raise StructuralError(f"tensor {name!r} has malformed data_offsets")
try:
start, end = int(offsets[0]), int(offsets[1])
except (TypeError, ValueError) as e:
raise StructuralError(f"tensor {name!r} has non-integer data_offsets") from e
if start < 0 or end < start:
raise StructuralError(f"tensor {name!r} has invalid data_offsets")
data_len = max(data_len, end)
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@app/model_downloader/verify/structural.py` around lines 44 - 58, Normalize
all malformed safetensors header cases in the structural check by raising
StructuralError from the verify logic in structural.py. In the header
parsing/validation path around the existing json.loads call and the loop over
header.items(), first ensure the decoded header is a dict before iterating, then
validate each entry’s data_offsets as a two-item list of integers with
non-negative, strictly increasing bounds. Convert any non-dict header,
non-integer offsets, negative values, or reversed ranges into a clear
StructuralError instead of letting AttributeError/ValueError escape or silently
accepting bad offsets.

Comment thread comfy/cli_args.py
Comment on lines +247 to +250
parser.add_argument("--download-segments", type=int, default=8, metavar="N", help="Number of parallel HTTP range segments per file for the model download manager (default: 8).")
parser.add_argument("--download-max-active", type=int, default=3, metavar="N", help="Maximum number of model downloads running concurrently (default: 3).")
parser.add_argument("--download-max-connections-per-host", type=int, default=16, metavar="N", help="Maximum simultaneous connections to a single host for the download manager (default: 16).")
parser.add_argument("--download-chunk-size", type=int, default=4 * 1024 * 1024, metavar="BYTES", help="Read chunk size in bytes for the download manager (default: 4 MiB).")

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🎯 Functional Correctness | 🟠 Major | ⚡ Quick win

Reject non-positive downloader limits at the CLI boundary.

argparse will accept 0 and negative values here, which makes obviously invalid configs like --download-segments 0 or --download-chunk-size 0 reach the downloader.

Suggested fix
-parser.add_argument("--download-segments", type=int, default=8, metavar="N", help="Number of parallel HTTP range segments per file for the model download manager (default: 8).")
-parser.add_argument("--download-max-active", type=int, default=3, metavar="N", help="Maximum number of model downloads running concurrently (default: 3).")
-parser.add_argument("--download-max-connections-per-host", type=int, default=16, metavar="N", help="Maximum simultaneous connections to a single host for the download manager (default: 16).")
-parser.add_argument("--download-chunk-size", type=int, default=4 * 1024 * 1024, metavar="BYTES", help="Read chunk size in bytes for the download manager (default: 4 MiB).")
+parser.add_argument("--download-segments", type=_positive_int, default=8, metavar="N", help="Number of parallel HTTP range segments per file for the model download manager (default: 8).")
+parser.add_argument("--download-max-active", type=_positive_int, default=3, metavar="N", help="Maximum number of model downloads running concurrently (default: 3).")
+parser.add_argument("--download-max-connections-per-host", type=_positive_int, default=16, metavar="N", help="Maximum simultaneous connections to a single host for the download manager (default: 16).")
+parser.add_argument("--download-chunk-size", type=_positive_int, default=4 * 1024 * 1024, metavar="BYTES", help="Read chunk size in bytes for the download manager (default: 4 MiB).")
def _positive_int(value: str) -> int:
    parsed = int(value)
    if parsed <= 0:
        raise argparse.ArgumentTypeError("must be > 0")
    return parsed
📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

Suggested change
parser.add_argument("--download-segments", type=int, default=8, metavar="N", help="Number of parallel HTTP range segments per file for the model download manager (default: 8).")
parser.add_argument("--download-max-active", type=int, default=3, metavar="N", help="Maximum number of model downloads running concurrently (default: 3).")
parser.add_argument("--download-max-connections-per-host", type=int, default=16, metavar="N", help="Maximum simultaneous connections to a single host for the download manager (default: 16).")
parser.add_argument("--download-chunk-size", type=int, default=4 * 1024 * 1024, metavar="BYTES", help="Read chunk size in bytes for the download manager (default: 4 MiB).")
parser.add_argument("--download-segments", type=_positive_int, default=8, metavar="N", help="Number of parallel HTTP range segments per file for the model download manager (default: 8).")
parser.add_argument("--download-max-active", type=_positive_int, default=3, metavar="N", help="Maximum number of model downloads running concurrently (default: 3).")
parser.add_argument("--download-max-connections-per-host", type=_positive_int, default=16, metavar="N", help="Maximum simultaneous connections to a single host for the download manager (default: 16).")
parser.add_argument("--download-chunk-size", type=_positive_int, default=4 * 1024 * 1024, metavar="BYTES", help="Read chunk size in bytes for the download manager (default: 4 MiB).")
def _positive_int(value: str) -> int:
parsed = int(value)
if parsed <= 0:
raise argparse.ArgumentTypeError("must be > 0")
return parsed
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@comfy/cli_args.py` around lines 247 - 250, Reject non-positive values at the
CLI boundary for the downloader limit flags in cli_args.py. Update the argparse
definitions for --download-segments, --download-max-active,
--download-max-connections-per-host, and --download-chunk-size to use a shared
positive-integer validator instead of plain int so values like 0 or negatives
are rejected before reaching the downloader. Add or reuse a helper such as
_positive_int near the parser setup and keep the existing defaults unchanged.

Comment thread openapi.yaml
Comment on lines +2528 to +2530
summary: List downloads
tags:
- download

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

📐 Maintainability & Code Quality | 🟠 Major | ⚡ Quick win

Define the new download tag globally.

These operations use tags: [download], but download is missing from the root tags: list, so the OpenAPI lint step is already failing on this PR.

Minimal fix
 tags:
+    - description: Model download lifecycle and credential management
+      name: download
     - description: Workflow execution and management
       name: workflow
🧰 Tools
🪛 GitHub Actions: OpenAPI Lint / 0_Run Spectral.txt

[error] 2530-2530: spectral lint (rule: operation-tag-defined): Operation tags must be defined in global tags. (paths./api/download.get.tags[0])

🪛 GitHub Actions: OpenAPI Lint / Run Spectral

[error] 2530-2530: spectral lint error operation-tag-defined: Operation tags must be defined in global tags. (paths./api/download.get.tags[0])

🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@openapi.yaml` around lines 2528 - 2530, The OpenAPI spec is using the
download tag on list-downloads operations, but the tag is not declared in the
root tags collection. Update the top-level tags definition in openapi.yaml to
include a global download tag entry, keeping it consistent with the other tag
declarations so the lint step passes.

Source: Pipeline failures

Comment thread server.py
Comment on lines +1193 to +1199
def _notify(download_id: str) -> None:
try:
view = DOWNLOAD_MANAGER.status_sync(download_id)
if view is not None:
self.send_sync("download_progress", view)
except Exception:
logging.debug("download progress notify failed", exc_info=True)

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🔒 Security & Privacy | 🟠 Major | ⚡ Quick win

Don't broadcast raw download URLs to every websocket client.

Line 1197 sends each status_sync() payload to all sockets (sid=None), and that view includes the download url. Any connected client can therefore see other users' source URLs and any signed query tokens embedded in them. Please redact the URL before notifying, or scope these events to the initiating session.

🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@server.py` around lines 1193 - 1199, The _notify helper inside the download
progress flow is broadcasting the full status_sync() view, including the raw
download url, to every websocket client via send_sync with no session scoping.
Update the download notification path to either redact/remove the url field from
the view before calling send_sync or limit delivery to the initiating session,
using the _notify, DOWNLOAD_MANAGER.status_sync, and self.send_sync logic to
locate the change.

@alexisrolland alexisrolland added the cursor-review Trigger multi-model Cursor code review label Jun 29, 2026
@alexisrolland alexisrolland added cursor-review Trigger multi-model Cursor code review and removed cursor-review Trigger multi-model Cursor code review labels Jun 29, 2026

@github-actions github-actions Bot left a comment

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🔍 Cursor Review — Consolidated panel

Triggered by @alexisrolland.

Found 10 finding(s).

Severity Count
🟠 High 5
🟡 Medium 5

Panel: 8/8 reviewers contributed findings.

async def resolve(self, host, port=0, family=socket.AF_INET):
infos = await self._inner.resolve(host, port, family)
# localhost/127.0.0.1 are an explicit, opt-in allowlist feature.
if isinstance(host, str) and host.lower() in LOOPBACK_HOSTS:

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🟠 HighValidatingResolver.resolve returns loopback resolutions unchecked for any name in LOOPBACK_HOSTS, and check_redirect_hop deliberately re-applies neither the host allowlist nor the extension check on redirect targets. Together, a directly-allowlisted http://127.0.0.1:<port>/x.safetensors (localhost is allowlisted for http by default with no port restriction) or an open redirect from an allowlisted origin to http://localhost/... lets any caller drive server-side GETs against arbitrary internal services (SSRF); an off-allowlist public redirect target can also substitute attacker-controlled bytes into a model path later deserialized via pickle. Constrain the loopback exemption (scheme/port) and/or re-validate redirect hosts. Raised by 4 of 8 reviewers (claude-opus-4-8-thinking-xhigh adversarial, gpt-5.3-codex-xhigh adversarial, gemini-3.1-pro adversarial, gemini-3.1-pro edge-case).

f"Model already exists on disk: {model_id}",
status=409,
)
if await self._has_live_download(model_id):

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🟠 High_has_live_download and insert_download are separate awaited steps with no uniqueness constraint on downloads.model_id, so two concurrent enqueues for the same model_id can both pass this check and create two jobs sharing one temp_path/dest_path that corrupt each other. resume() (manager.py:131) has no live-download check at all: enqueuing a new download while an older FAILED row for the same model_id exists and then resuming the old one admits both concurrently against the same file. Enforce single-live-per-model_id atomically (unique partial index or an in-process lock). Raised by 6 of 8 reviewers (gpt-5.3-codex-xhigh adversarial, claude-opus-4-8-thinking-xhigh adversarial, gemini-3.1-pro adversarial, gpt-5.3-codex-xhigh edge-case, claude-opus-4-8-thinking-xhigh edge-case, gemini-3.1-pro edge-case).


total = self.state.total_bytes
actual_size = os.path.getsize(self.spec.temp_path)
if total is not None and actual_size != total:

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🟠 High — For segmented transfers the file is preallocated to total_bytes, so this actual_size != total check always passes regardless of what was actually written. If any segment ends early (e.g. the server cleanly closes the connection mid-range), _run_segment returns without error and _run_segmented never verifies each segment reached seg.length, leaving zero-filled holes in a file accepted as complete (structural validation also passes since the size is right, unless expected_sha256 is set). Verify every segment wrote its full range before the atomic rename. Raised by 2 of 8 reviewers (gemini-3.1-pro adversarial, gemini-3.1-pro edge-case).

live = {
row.temp_path
for row in queries.list_downloads()
if row.status in (DownloadStatus.QUEUED, DownloadStatus.PAUSED)

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🟠 High_sweep_orphan_temp_files preserves .part files only for QUEUED/PAUSED rows and deletes them for FAILED, but FAILED is resumable (resume() re-queues it). After a restart a failed download's partial is deleted while its per-segment offsets remain in the DB, so resuming preallocates a fresh sparse file and skips every segment whose bytes_done == length, silently producing a model riddled with zero-filled holes (and the single-stream path leaves a leading hole). Add FAILED to the preserved set or clear persisted offsets when the temp file is removed. Raised by 2 of 8 reviewers (gemini-3.1-pro edge-case, gpt-5.3-codex-xhigh edge-case).

)

# Structural gate (cheap, no full read) then optional sha256 (full read).
await asyncio.to_thread(structural.validate, self.spec.temp_path)

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🟠 Highstructural.validate raises StructuralError and checksum.verify_sha256 raises ChecksumError, both plain Exceptions, so in run() they fall through to the generic except Exception handler (job.py:178) and are re-queued as retryable. A truncated/corrupt or checksum-mismatched file is therefore re-verified up to _MAX_ATTEMPTS times and finally reported as 'giving up after N attempts' instead of failing closed — directly contradicting the FatalError docstring, which lists checksum/structural failures as non-retryable. Make both subclass FatalError (or catch them explicitly before the generic handler). Raised by 2 of 8 reviewers (claude-opus-4-8-thinking-xhigh adversarial, claude-opus-4-8-thinking-xhigh edge-case).

Comment thread server.py
try:
view = DOWNLOAD_MANAGER.status_sync(download_id)
if view is not None:
self.send_sync("download_progress", view)

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🟡 Mediumdownload_progress is pushed via send_sync(..., sid=None), broadcasting each payload — which includes the verbatim url (also returned by GET /api/download and /api/download/{id} via _view, manager.py:199) — to every connected websocket client. Download URLs routinely carry secrets (Civitai ?token=, presigned CDN signatures), so one client's credentials leak to all other clients and into the INFO log/DB. Redact the query string with the existing redact_url helper before storing, returning, or broadcasting. Raised by 2 of 8 reviewers (gpt-5.3-codex-xhigh adversarial, claude-opus-4-8-thinking-xhigh adversarial).

raise RemoteChanged()
if resp.status not in (206,):
self._raise_for_status(resp.status)
async for chunk in resp.content.iter_chunked(args.download_chunk_size):

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🟡 Medium — The transfer loops write whatever the server sends with no bound on the requested range or total size: a non-conforming 206 returning more than the requested bytes=offset-end is pwrite-n at ever-increasing offsets, overrunning adjacent segments and past the preallocated size, and the single-stream unknown-length path (end = -1) streams until the connection closes. A malicious or compromised allowlisted/redirect host can thus corrupt the file or fill the disk (DoS) before the post-hoc size check runs. Cap each write to the planned length and enforce a maximum download size. Raised by 2 of 8 reviewers (claude-opus-4-8-thinking-xhigh adversarial, gpt-5.3-codex-xhigh adversarial).

) as (resp, _final):
if offset > 0 and resp.status == 200:
# Resume not honoured -> start over from the beginning.
offset = 0

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🟡 Medium — When a resume request (offset > 0) gets HTTP 200, _run_single resets offset/bytes_done to 0 but never truncates the existing .part (opened O_RDWR|O_CREAT, no O_TRUNC). If the restarted content is shorter than the prior partial, stale trailing bytes survive past the new end, yielding a too-large file that fails the _finalize size check (or trailing garbage when total_bytes is unknown) for a download that should have restarted cleanly. Truncate the file to 0 before restarting. Raised by 4 of 8 reviewers (gpt-5.3-codex-xhigh adversarial, gpt-5.3-codex-xhigh edge-case, claude-opus-4-8-thinking-xhigh edge-case, gemini-3.1-pro edge-case).

else:
# Single-stream: one logical segment; bytes_done tracked on the row.
row = queries.get_download(self.spec.download_id)
resume_from = row.bytes_done if row else 0

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🟡 Medium — In the single-stream branch resume_from = row.bytes_done, which is the SUM of all per-segment offsets, and it is used as a contiguous prefix length without validating the actual .part file. If a download previously ran segmented (non-contiguous preallocated data) but now resolves to one segment (server stops advertising Accept-Ranges, or --download-segments is lowered between runs), it resumes writing at the summed offset over a non-contiguous file and produces corrupt output. Validate the on-disk file length/layout before trusting persisted offsets. Raised by 3 of 8 reviewers (claude-opus-4-8-thinking-xhigh edge-case, gpt-5.3-codex-xhigh edge-case, gemini-3.1-pro edge-case).

now = time.monotonic()
if not force and now - self._last_persist < _PERSIST_INTERVAL:
if self._notify:
self._notify(self.spec.download_id)

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🟡 Medium_persist_progress calls self._notify on every chunk even inside the throttle branch, bypassing _PERSIST_INTERVAL; with a small --download-chunk-size or many concurrent segments this broadcasts to all websocket clients far more often than intended. Separately, the queries.update_download/update_segment_progress writes here (and in _set_status/_probe_and_plan) run synchronous SQLite directly on the event loop rather than via asyncio.to_thread as the module docstrings require, so they can stall the web server. Throttle the notify and offload the DB writes. Raised by 4 of 8 reviewers (gemini-3.1-pro adversarial, gpt-5.3-codex-xhigh adversarial, claude-opus-4-8-thinking-xhigh edge-case, gemini-3.1-pro edge-case).

@github-actions github-actions Bot left a comment

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🔍 Cursor Review — Consolidated panel

Triggered by @alexisrolland.

Found 10 finding(s).

Severity Count
🟠 High 2
🟡 Medium 8

Panel: 6/8 reviewers contributed findings.

Reviewers that did not contribute: gpt-5.3-codex-xhigh:adversarial (empty), gpt-5.3-codex-xhigh:edge-case (parse_error)

)

# Structural gate (cheap, no full read) then optional sha256 (full read).
await asyncio.to_thread(structural.validate, self.spec.temp_path)

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🟠 High_finalize calls structural.validate(self.spec.temp_path), but temp_path ends in the .comfy-download.part suffix, so validate's extension check never matches .safetensors/.sft and silently skips validation for every download — the structural integrity gate is dead code. Pass self.spec.dest_path (or its extension) to determine the format. The segmented integration test only passes because random bytes written to a .safetensors destination are never actually validated. Raised by 1 of 8 reviewers (gemini-3.1-pro edge-case).


total = self.state.total_bytes
actual_size = os.path.getsize(self.spec.temp_path)
if total is not None and actual_size != total:

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🟠 High — For segmented downloads the file is preallocated to total_bytes via os.ftruncate, so os.path.getsize(temp_path) always equals total and the actual_size != total check can never fire. A segment that ends short (truncated 206 / server closes early) thus leaves a zero-filled hole that passes verification and is cataloged. Verify self.state.bytes_done == total rather than the on-disk size. Raised by 1 of 8 reviewers (gemini-3.1-pro edge-case).

async def resolve(self, host, port=0, family=socket.AF_INET):
infos = await self._inner.resolve(host, port, family)
# localhost/127.0.0.1 are an explicit, opt-in allowlist feature.
if isinstance(host, str) and host.lower() in LOOPBACK_HOSTS:

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🟡 MediumValidatingResolver.resolve exempts any hop whose host is literally localhost/127.0.0.1/::1 from IP filtering, and check_redirect_hop deliberately does not re-check the host allowlist. An allowlisted origin (or a host added via --download-allowed-hosts) that 30x-redirects to http://localhost:<port>/... therefore reaches arbitrary loopback-bound internal services. Only exempt loopback when the initial user-supplied URL was itself a loopback host. Raised by 2 of 8 reviewers (claude-opus-4-8-thinking-xhigh adversarial, gemini-3.1-pro edge-case).

except ValueError:
return True # unparseable -> refuse
return (
ip.is_private

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🟡 Mediumis_blocked_ip relies on ipaddress properties that, on CPython before the 3.13 fix (backported to 3.12.4/3.11.9/3.10.14/3.9.19), do not classify IPv4-mapped IPv6 addresses such as ::ffff:169.254.169.254 or ::ffff:127.0.0.1 as private/loopback/link-local. On those still-common interpreters a redirect-target host whose AAAA record is an IPv4-mapped address bypasses the SSRF filter and can hit the cloud metadata endpoint. Resolve ip.ipv4_mapped and re-check before returning. Raised by 1 of 8 reviewers (gemini-3.1-pro adversarial).

self._raise_for_status(resp.status)
async for chunk in resp.content.iter_chunked(args.download_chunk_size):
self._check_control()
await self._writer.write_at(offset, chunk)

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🟡 Medium — The read loop in _run_segment writes every chunk from iter_chunked without bounding to seg.end. A misbehaving server/proxy that returns more bytes than the requested 206 range overruns into the adjacent segment's region of the preallocated file, silently corrupting data; the size check passes because the file length is unchanged. Cap each write so seg.bytes_done + len(chunk) <= seg.length. Raised by 3 of 8 reviewers (gemini-3.1-pro adversarial, claude-opus-4-8-thinking-xhigh edge-case, gemini-3.1-pro edge-case).

live = {
row.temp_path
for row in queries.list_downloads()
if row.status in (DownloadStatus.QUEUED, DownloadStatus.PAUSED)

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🟡 Medium_sweep_orphan_temp_files preserves .part files only for QUEUED/PAUSED rows, so on restart it deletes the partial of any retry-exhausted FAILED download — which resume() explicitly permits resuming. After resume, _probe_and_plan reuses the persisted bytes_done/segment offsets and writes into a freshly O_CREAT'd file at those offsets, leaving a zero-filled hole that passes the size-only gate (corruption). Preserve FAILED partials, or reset bytes_done/segments when the .part is absent. Raised by 2 of 8 reviewers (gemini-3.1-pro edge-case, kimi-k2.5 edge-case).

async def write_at(self, offset: int, data: bytes) -> None:
assert self._fd is not None, "writer not opened"
await asyncio.get_running_loop().run_in_executor(
_EXECUTOR, os.pwrite, self._fd, data, offset

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🟡 MediumFileWriter.write_at ignores the return value of os.pwrite, which can write fewer bytes than len(data) (e.g. on signal interruption or near ENOSPC), while the caller advances offset/bytes_done by the full chunk length. This leaves a gap in the file yet over-counts progress; for formats with no structural check and no expected_sha256 the corrupt file passes verification. Loop until all bytes are written. Raised by 2 of 8 reviewers (claude-opus-4-8-thinking-xhigh adversarial, gemini-3.1-pro edge-case).

self._notify(self.spec.download_id)
return
self._last_persist = now
queries.update_download(self.spec.download_id, bytes_done=self.state.bytes_done)

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🟡 Medium_persist_progress, _set_status, _probe_and_plan, and _reset_for_restart call the synchronous SQLite queries.* helpers directly on the event loop, contrary to the module contract that they be invoked via asyncio.to_thread. This blocks the ComfyUI loop on every progress persist during transfers and can contend ("database is locked") with the manager's concurrent to_thread DB writers. Raised by 2 of 8 reviewers (claude-opus-4-8-thinking-xhigh edge-case, gemini-3.1-pro edge-case).

f"Model already exists on disk: {model_id}",
status=409,
)
if await self._has_live_download(model_id):

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🟡 Medium_has_live_download (line 83) and insert_download (line 91) are separated by awaits with no DB uniqueness constraint on model_id, so two near-simultaneous enqueue calls for the same model_id can both pass the check. Both jobs then open and write the same deterministic temp_path/dest_path at overlapping offsets, corrupting the partial and racing on the final os.replace. Raised by 2 of 8 reviewers (claude-opus-4-8-thinking-xhigh adversarial, claude-opus-4-8-thinking-xhigh edge-case).

"allow_any_extension": allow_any_extension,
},
)
logging.info("[model_downloader] enqueued %s -> %s", url, model_id)

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🟡 Medium — The raw user-supplied url is logged in full here, persisted verbatim in downloads.url (line 95), and echoed back by the status/list/availability views — bypassing the redact_url helper used elsewhere precisely to strip query strings. A Civitai-style ?token= secret (or userinfo credentials) in the URL is therefore leaked to logs and API responses. Redact the query string before logging and storing. Raised by 1 of 8 reviewers (claude-opus-4-8-thinking-xhigh adversarial).

@JoeGaffney

Copy link
Copy Markdown

One question: could Hugging Face authentication also support HF_TOKEN as a first-class option?

For server-side and Docker deployments, it's already a common pattern to configure HF_TOKEN ahead of time via the environment, so users already have credentials available without needing to configure them again through the UI. In more controlled deployments, that's often preferable anyway, as you may not want end users managing or supplying their own credentials.

Supporting HF_TOKEN would avoid requiring stored credentials for those deployments, while the UI credential storage could remain an optional convenience for users who don't already have a token configured.

Also, there's another open PR (#14586) tackling a very similar problem with a different authentication approach. It might be worth aligning on a single authentication story before the implementations diverge.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

cursor-review Trigger multi-model Cursor code review

Projects

None yet

Development

Successfully merging this pull request may close these issues.

4 participants