Skip to content

feat: make asset hashing opt-in via --enable-asset-hashing (off by default)#14663

Open
mattmillerai wants to merge 3 commits into
masterfrom
matt/be-788-enable-asset-hashing
Open

feat: make asset hashing opt-in via --enable-asset-hashing (off by default)#14663
mattmillerai wants to merge 3 commits into
masterfrom
matt/be-788-enable-asset-hashing

Conversation

@mattmillerai

Copy link
Copy Markdown
Contributor

ELI-5

ComfyUI can fingerprint ("hash") every model and output file so it can later tell when two files are really the same. That hashing reads every byte of every file with blake3, which is slow the first time you start up with a big models/ folder. This PR makes that hashing optional and off by default when you run locally: nothing gets hashed unless you ask for it with a new --enable-asset-hashing flag. Asset registration still happens — only the expensive hashing step is gated.

What changed

  • Added --enable-asset-hashing to comfy/cli_args.py (action="store_true", default False). The help text explains the trade-off: hashing enables future asset-portability features (dedup, cross-machine model resolution) but adds startup + per-output cost on large models directories.
  • Plumbed the flag into the two call sites in main.py that previously hardcoded compute_hashes=True:
    • the startup scan (asset_seeder.start(...))
    • the post-job output enqueue (asset_seeder.enqueue_enrich(...))

Both now pass compute_hashes=args.enable_asset_hashing. No other behavior changes — this is purely the toggle. The compute_hashes parameter already existed throughout app/assets/seeder.py (defaults False); this PR only adds the user-facing switch and the right default.

Why

A comprehensive blake3 pass over a user's models/ directory can feel like a UX regression on slower machines. The decision was to ship registration without hashing locally and let users opt in. This ships only the mechanism, off by default; whether to ever flip the local default to on is a separate, later decision.

Verification

  • --enable-asset-hashing parses to True; absence parses to False (default) — confirmed against the argparse parser.
  • --help renders the new flag with its description.
  • ruff check passes on both changed files.
  • Note: the full Python test suite could not be executed in this environment (only Python 3.9 was available, but the repo requires 3.10+ for X | None annotations, and torch/sqlalchemy deps were absent). The change is mechanical (one store_true flag + swapping a hardcoded True for the flag at two sites) and there are no existing tests covering main.py's startup wiring or cli_args; seeder.py itself is untouched.

Judgment calls

  • Flag naming: the source ticket left open whether to call this --enable-asset-hashing or --enable-hashing. I used --enable-asset-hashing to stay consistent with the companion PR already in review that uses that exact name. If the team prefers --enable-hashing, both sides should be renamed together before merge.
  • --enable-assets left untouched: removing that flag is tracked separately; this PR is scoped strictly to the hashing toggle.

Merge dependency

The companion cloud-side launch-config wiring must be in flight before this merges — otherwise cloud silently stops hashing the moment this default takes effect (breaking dedup and cross-pod model resolution). Do not merge ahead of it.

…fault

Add a --enable-asset-hashing CLI flag (action=store_true, default False)
and plumb it into the two asset-seeder call sites in main.py that
previously hardcoded compute_hashes=True (the startup scan and the
post-job output enqueue). Local runs now skip blake3 hashing unless the
user opts in, avoiding the startup/per-output cost on large models
directories while keeping hashing available for asset-portability
features.
@mattmillerai mattmillerai added cursor-review Trigger multi-model Cursor code review agent-coded labels Jun 28, 2026
@mattmillerai mattmillerai marked this pull request as ready for review June 28, 2026 03:46
@alexisrolland alexisrolland self-assigned this Jun 28, 2026
@alexisrolland alexisrolland self-requested a review June 28, 2026 03:47
@coderabbitai

coderabbitai Bot commented Jun 28, 2026

Copy link
Copy Markdown

Review Change Stack

📝 Walkthrough

Walkthrough

A new CLI flag --enable-asset-hashing is added to comfy/cli_args.py as a store_true boolean option. In main.py, the two calls to asset_seeder (enqueue_enrich and start) are updated to pass compute_hashes=args.enable_asset_hashing instead of the previously hardcoded compute_hashes=True, making blake3 content hash computation opt-in and off by default.

🚥 Pre-merge checks | ✅ 4 | ❌ 1

❌ Failed checks (1 warning)

Check name Status Explanation Resolution
Docstring Coverage ⚠️ Warning Docstring coverage is 0.00% which is insufficient. The required threshold is 80.00%. Write docstrings for the functions missing them to satisfy the coverage threshold.
✅ Passed checks (4 passed)
Check name Status Explanation
Title check ✅ Passed The title clearly summarizes the main change: opt-in asset hashing via a new CLI flag.
Description check ✅ Passed The description matches the implemented changes and explains the new flag and wiring accurately.
Linked Issues check ✅ Passed Check skipped because no linked issues were found for this pull request.
Out of Scope Changes check ✅ Passed Check skipped because no linked issues were found for this pull request.

Comment @coderabbitai help to get the list of available commands.

@coderabbitai coderabbitai Bot left a comment

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 1

🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

Inline comments:
In `@main.py`:
- Line 406: The call to asset_seeder.enqueue_enrich is using
args.enable_asset_hashing for compute_hashes, but that parameter also controls
enrichment depth in app/assets/seeder.py. Update enqueue_enrich and its caller
so hashing can be toggled independently from metadata enrichment, preserving
ENRICHMENT_METADATA for startup/output assets even when hashing is off. Apply
the same fix at both enqueue_enrich call sites and use a dedicated argument or
separate flag for hash generation instead of overloading compute_hashes.
🪄 Autofix (Beta)

Fix all unresolved CodeRabbit comments on this PR:

  • Push a commit to this branch (recommended)
  • Create a new PR with the fixes

ℹ️ Review info
⚙️ Run configuration

Configuration used: Path: .coderabbit.yaml

Review profile: CHILL

Plan: Pro Plus

Run ID: 0b9fed89-8361-4ee9-b164-7f10736deb01

📥 Commits

Reviewing files that changed from the base of the PR and between a95e461 and a5ad395.

📒 Files selected for processing (2)
  • comfy/cli_args.py
  • main.py

Comment thread main.py
@mattmillerai mattmillerai added cursor-review Trigger multi-model Cursor code review and removed cursor-review Trigger multi-model Cursor code review labels Jun 29, 2026

@github-actions github-actions Bot left a comment

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🔍 Cursor Review — Consolidated panel

Triggered by @mattmillerai.

✅ No high-signal findings.

Panel: 8/8 reviewers contributed findings.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

agent-coded cursor-review Trigger multi-model Cursor code review

Projects

None yet

Development

Successfully merging this pull request may close these issues.

4 participants