🚧 WIP — DO NOT USE
This repository is under active development and is not ready for general use. APIs, behavior, and files may change without notice.
Text-to-Image ComfyUI node for BAAI Emu3.5.
Based on baaivision/Emu3.5.
- Auto-downloads Hugging Face repos into the ComfyUI models folder: `BAAI/Emu3.5` (default) or `BAAI/Emu3.5-Image`, plus `BAAI/Emu3.5-VisionTokenizer`.
- CUDA fp16 by default; default device `cuda:0`.
- Optional offload via a `device_map="auto"` toggle.
- Sequential batching with `base_seed + index`.
- Prompt-conditioned image-to-image node that reuses reference frames via Emu3.5's VQ tokenizer.
- Model: `BAAI/Emu3.5` by default (you can switch to `BAAI/Emu3.5-Image`).
- Device: `cuda:0` by default.
- Precision: fp16 by default (bf16 optional). If flash-attn isn't available on Windows, the node falls back to SDPA attention automatically.
- Offload: visible toggle that sets `device_map="auto"` for HF accelerate offload when VRAM is tight.
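The SDPA fallback could look like the sketch below, assuming the loader maps the `attn_backend` toggle onto the `attn_implementation` argument of `transformers`' `from_pretrained`; the function name and mapping are illustrative, not the node's exact code:

```python
# Sketch of attention-backend selection: `auto` tries flash-attn and falls
# back to SDPA when the package is missing (common on Windows).
import importlib.util


def pick_attn_backend(requested: str = "auto") -> str:
    has_flash = importlib.util.find_spec("flash_attn") is not None
    if requested == "flash_attn" and not has_flash:
        raise RuntimeError("flash-attn requested but not installed")
    if requested == "auto":
        return "flash_attention_2" if has_flash else "sdpa"
    # Explicit choices map to transformers' attn_implementation values.
    return {"flash_attn": "flash_attention_2", "sdpa": "sdpa"}[requested]
```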
```powershell
# Inside your ComfyUI environment
pip install -r .\requirements.txt                     # from the repo root (Emu3.5 requirements)
pip install -r .\comfyui_emu35_node\requirements.txt  # adds huggingface_hub
```

```powershell
# Copy this folder to your ComfyUI custom_nodes
# Replace the path with your ComfyUI install
Copy-Item -Recurse -Force .\comfyui_emu35_node "C:\path\to\ComfyUI\custom_nodes\comfyui_emu35_node"
```

- Copy or symlink `comfyui_emu35_node/` into your `%COMFYUI_ROOT%/custom_nodes/`.
- Launch ComfyUI. Look under the "emu3.5" category for:
  - "Emu3.5 Load (fp16)"
  - "Emu3.5 T2I (Batch)"
  - "Emu3.5 I2I (Batch)"
- On first run, the node downloads the selected model(s) into `%COMFYUI_ROOT%/models/Emu3.5/`.
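The first-run download could be implemented roughly as below with `huggingface_hub.snapshot_download` (a real API); the `ensure_model` helper and the exact directory layout check are assumptions:

```python
# Sketch: mirror a Hugging Face repo into the ComfyUI models folder on first
# use, skipping the download when the target directory already exists.
import os


def ensure_model(repo_id: str, models_root: str) -> str:
    target = os.path.join(models_root, "Emu3.5", repo_id.split("/")[-1])
    if not os.path.isdir(target):
        # Imported lazily so the check itself needs no network or extra deps.
        from huggingface_hub import snapshot_download
        snapshot_download(repo_id=repo_id, local_dir=target)
    return target
```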
You can import a ready-made workflow graph:

```powershell
# In ComfyUI, use "Load" and select this file from the repo
%CD%\comfyui_emu35_node\workflows\emu35_t2i_example.json
```

That workflow:

- Loads `BAAI/Emu3.5` on `cuda:0` in fp16; offload disabled by default
- Runs "Emu3.5 T2I (Batch)" with a sample prompt, batch=1, guidance=2.0, and default sampling knobs
- Saves the image using the standard `SaveImage` node
- Emu3.5 Load (fp16)
  - `model_repo`: string; default `BAAI/Emu3.5` (you can switch to `BAAI/Emu3.5-Image`).
  - `precision`: `fp16` (default) or `bf16`.
  - `device`: string; default `cuda:0`.
  - `offload`: boolean; when true, uses `device_map="auto"` offload.
  - `attn_backend`: `auto` (default), `flash_attn`, or `sdpa`. `auto` tries flash-attn and falls back to SDPA.
- Emu3.5 T2I (Batch)
  - `prompt`: text prompt (multiline supported).
  - `num_images`: batch size; runs sequentially.
  - `base_seed`: base seed; image i uses `base_seed + i` for reproducible variety.
  - `guidance`: classifier-free guidance scale (default 2.0; consider ~5.0 for `Emu3.5-Image`).
  - `unconditional_type`: `no_text` (default) or `no_text_img_cfg`.
  - `image_cfg_scale`: extra image CFG weight when `no_text_img_cfg` is used.
  - `max_new_tokens`: cap on generated tokens; default 32768.
  - Differential sampling knobs (defaults mirror the repo):
    - Text: `text_top_k`, `text_top_p`, `text_temperature`
    - Image: `image_top_k`, `image_top_p`, `image_temperature`
  - Not applicable: negative prompts, steps/samplers (this is an AR model, not diffusion).
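The split text/image knobs apply the same kind of top-k / top-p / temperature filter with different values depending on the token type. A generic nucleus-sampling filter (not the repo's exact code) looks like:

```python
# Generic top-k / top-p / temperature filter over raw logits: scale by
# temperature, softmax, keep the top_k most likely tokens, then keep the
# smallest prefix whose cumulative probability reaches top_p, and renormalize.
import math


def filter_logits(logits, top_k, top_p, temperature):
    scaled = [l / temperature for l in logits]
    m = max(scaled)                                   # numerically stable softmax
    exps = [math.exp(l - m) for l in scaled]
    total = sum(exps)
    probs = [(i, e / total) for i, e in enumerate(exps)]
    probs.sort(key=lambda x: -x[1])
    probs = probs[:top_k]                             # top-k cut
    kept, cum = [], 0.0
    for i, p in probs:                                # top-p (nucleus) cut
        kept.append((i, p))
        cum += p
        if cum >= top_p:
            break
    z = sum(p for _, p in kept)
    return [(i, p / z) for i, p in kept]              # renormalized candidates
```

The node would run this twice with different parameter sets: the `text_*` values while emitting text tokens and the `image_*` values while emitting image tokens.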
- Emu3.5 I2I (Batch)
  - `reference_image`: ComfyUI image input (the first batch frame is used).
  - `prompt`: guidance text applied alongside the reference frame.
  - `image_area`: target pixel area for VQ encoding (default 720×720); larger values preserve detail but cost VRAM.
  - Shares the remaining controls with the T2I node; defaults skew toward `no_text_img_cfg` guidance (5.0 CFG, 1.5 image CFG).
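A target-area control like `image_area` typically means "resize so width × height is roughly this many pixels, keeping aspect ratio". A plausible sketch (the node's exact rounding to the VQ tokenizer's grid may differ, and the multiple-of-16 snap is an assumption):

```python
# Fit an input image to a target pixel area while preserving aspect ratio,
# snapping each side to a multiple of the tokenizer's spatial grid.
import math


def fit_to_area(width: int, height: int, image_area: int = 720 * 720, multiple: int = 16):
    scale = math.sqrt(image_area / (width * height))
    new_w = max(multiple, round(width * scale / multiple) * multiple)
    new_h = max(multiple, round(height * scale / multiple) * multiple)
    return new_w, new_h
```

For example, a 1440×720 reference scales down to roughly half its area so the product lands near 720×720 = 518400 pixels.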
- `COMFYUI_MODELS_DIR`: override ComfyUI models root detection.
- `EMU35_MODEL_DIR`: full path to a local Emu3.5 model directory.
- `EMU35_VQ_DIR`: full path to a local Emu3.5-VisionTokenizer directory.
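A plausible resolution order for these overrides is sketched below (the node's actual lookup logic may differ): a full per-model override wins, then the models-root override, then the default location.

```python
# Sketch: resolve the Emu3.5 model directory from the environment overrides.
import os


def resolve_model_dir(repo_name: str = "Emu3.5") -> str:
    explicit = os.environ.get("EMU35_MODEL_DIR")
    if explicit:                      # full per-model override wins outright
        return explicit
    models_root = os.environ.get("COMFYUI_MODELS_DIR", os.path.join(".", "models"))
    return os.path.join(models_root, "Emu3.5", repo_name)
```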
- Negative prompt strings and "steps/sampler" are not applicable to this AR pipeline.
- T2I resolution is decided by the model; there are no explicit width/height inputs.
- VRAM use for 34B models is large; consider enabling offload (`device_map="auto"`) if you run out of memory.
- If `flash-attn` is unavailable on Windows, the node will fall back to SDPA attention.
- If running under ComfyUI, weights download into `%COMFYUI_ROOT%/models/Emu3.5/`:
  - `.../Emu3.5/Emu3.5` or `.../Emu3.5/Emu3.5-Image`
  - `.../Emu3.5/Emu3.5-VisionTokenizer`
- If not in ComfyUI, the node falls back to a local `./models/` folder in this repo.
- You can point to existing local folders using the environment overrides above.
- Missing dependency: if `huggingface_hub` is not installed, run the install commands in "Install deps".
- Flash-attn issues on Windows: select `attn_backend=sdpa` in the loader or keep `auto` (it falls back to SDPA automatically).
- Device not found: if the requested CUDA device index isn't available, the loader falls back to `cuda:0`. You can set `device` explicitly.
- OOM on large GPUs: enable `offload` (`device_map="auto"`), reduce `num_images` to 1, reduce `max_new_tokens`, or use the base `Emu3.5` model instead of `Emu3.5-Image`.
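The device-fallback behavior described above can be sketched as below; a `device_count` parameter stands in for `torch.cuda.device_count()` so the sketch is self-contained, and the exact fallback rules are an assumption:

```python
# Sketch: fall back to cuda:0 when the requested CUDA index is out of range,
# and to CPU when no CUDA devices are visible at all.
def resolve_device(requested: str, device_count: int) -> str:
    if not requested.startswith("cuda"):
        return requested              # e.g. explicit "cpu" is honored as-is
    if device_count == 0:
        return "cpu"                  # no CUDA at all
    idx = int(requested.split(":")[1]) if ":" in requested else 0
    return requested if idx < device_count else "cuda:0"
```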
- Image-to-Image support using the repo's VQ encode path.
- Secondary text outputs (e.g., `global_cot`/`image_cot`) as optional outputs.