diff --git a/docs/superpowers/plans/2026-05-10-blind-relation-gate-and-track1-champion-eval.md b/docs/superpowers/plans/2026-05-10-blind-relation-gate-and-track1-champion-eval.md
new file mode 100644
index 00000000..8d471b01
--- /dev/null
+++ b/docs/superpowers/plans/2026-05-10-blind-relation-gate-and-track1-champion-eval.md
@@ -0,0 +1,421 @@
+# Blind Relation Gate And Track1 Champion Eval Implementation Plan
+
+> **For agentic workers:** REQUIRED SUB-SKILL: Use superpowers:subagent-driven-development (recommended) or superpowers:executing-plans to implement this plan task-by-task. Steps use checkbox (`- [ ]`) syntax for tracking.
+
+**Goal:** Add a generic blind relation gate to Vulca so relation semantics are judged from the image before caption anchoring, then use it to harden AffectiveArt Track1 candidate evaluation.
+
+**Architecture:** Vulca keeps the reusable capability: a blind image-only relation read, a deterministic relation comparator, and content-fidelity score capping when the blind read contradicts required relations. The challenge repo remains a consumer: it runs champion/candidate audits and only accepts replacements that are clear wins under caption fidelity, artifact boundary, style, emotion, and blind relation checks.
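+
+As a minimal sketch of the intended flow (helper names come from the tasks below; payload values are hypothetical):
+
+```python
+from vulca.content_lock import (
+    apply_content_fidelity_gate,
+    build_blind_relation_gate,
+    extract_content_lock,
+)
+
+lock = extract_content_lock(
+    "Wartime illustration of mounted soldiers beside fleeing civilians, "
+    "burning village ruins, and aircraft overhead."
+)
+# In production the blind read comes from an image-only VLM call (Task 3);
+# here it is a hypothetical payload for a chase-reading image.
+blind_read = {
+    "primary_reading": "Mounted soldiers appear to chase fleeing civilians.",
+    "apparent_relations": ["mounted soldiers chasing civilians"],
+    "ambiguous_readings": [],
+}
+gate = build_blind_relation_gate(lock, blind_read)  # -> decision "reject"
+scored = {"scores": {"L1": 0.95}, "weighted_total": 0.95, "rationales": {}}
+gated = apply_content_fidelity_gate(scored, gate)  # weighted_total capped at 0.25
+```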
+
+**Tech Stack:** Python 3, pytest, LiteLLM/Gemini VLM scoring path, existing `vulca.content_lock` and `vulca.pipeline.nodes.evaluate` modules, existing Track1 audit scripts in `/Users/yhryzy/dev/emoart-130k`.
+
+---
+
+## File Structure
+
+- Modify: `src/vulca/content_lock.py`
+ - Add blind-relation prompt construction.
+ - Add deterministic blind-relation gate construction.
+ - Extend `apply_content_fidelity_gate` to cap high scores when blind relation decision is `reject` or `hold`.
+- Modify: `src/vulca/_vlm.py`
+ - Add a second image-only VLM call for content locks with required relations.
+ - Merge blind relation gate output into `content_fidelity_gate`.
+ - Keep primary scoring usable if the blind VLM call fails.
+- Modify: `tests/test_content_lock.py`
+ - Unit-test blind prompt non-anchoring, reject/hold/pass gate decisions, and score cap behavior.
+- Modify: `tests/test_evaluate.py` or `tests/test_vlm_prompt.py`
+ - Integration-test that `score_image`/`EvaluateNode` propagates blind gate metadata without requiring live network.
+- Read-only consumer: `/Users/yhryzy/dev/emoart-130k/scripts/track1_quality_review.py`
+ - Use existing heuristic/VLM review first. Do not mutate Track1 submission packages during this plan.
+
+## Task 1: Content-Lock Blind Relation Helpers
+
+**Files:**
+- Modify: `src/vulca/content_lock.py`
+- Test: `tests/test_content_lock.py`
+
+- [ ] **Step 1: Write failing tests**
+
+Add tests that express the API before implementation:
+
+```python
+from vulca.content_lock import (
+ build_blind_relation_gate,
+ build_blind_relation_read_prompt,
+)
+
+
+def test_blind_relation_prompt_does_not_anchor_on_caption_or_forbidden_reading():
+ lock = extract_content_lock(
+ "Wartime illustration of mounted soldiers beside fleeing civilians, "
+ "burning village ruins, and aircraft overhead."
+ )
+
+ prompt = build_blind_relation_read_prompt(lock)
+
+ assert "caption" not in prompt.lower()
+ assert "escort" not in prompt.lower()
+ assert "protect" not in prompt.lower()
+ assert "soldiers chasing civilians" not in prompt.lower()
+ assert "visible relationships" in prompt.lower()
+
+
+def test_blind_relation_gate_rejects_forbidden_primary_reading():
+ lock = extract_content_lock(
+ "Wartime illustration of mounted soldiers beside fleeing civilians, "
+ "burning village ruins, and aircraft overhead."
+ )
+
+ gate = build_blind_relation_gate(
+ lock,
+ {
+ "primary_reading": "Mounted soldiers appear to chase fleeing civilians.",
+ "apparent_relations": ["mounted soldiers chasing civilians"],
+ "ambiguous_readings": [],
+ },
+ )
+
+ assert gate["blind_relation_decision"] == "reject"
+ assert "soldiers chasing civilians" in gate["blind_forbidden_readings_present"]
+
+
+def test_blind_relation_gate_holds_ambiguous_relation_reading():
+ lock = extract_content_lock(
+ "Wartime illustration of mounted soldiers beside fleeing civilians, "
+ "burning village ruins, and aircraft overhead."
+ )
+
+ gate = build_blind_relation_gate(
+ lock,
+ {
+ "primary_reading": "The riders could be escorting or pursuing the civilians.",
+ "apparent_relations": ["riders behind fleeing civilians"],
+ "ambiguous_readings": ["escort or pursuit"],
+ },
+ )
+
+ assert gate["blind_relation_decision"] == "hold"
+ assert gate["blind_ambiguous_readings"] == ["escort or pursuit"]
+
+
+def test_blind_relation_gate_passes_clear_escort_reading():
+ lock = extract_content_lock(
+ "Wartime illustration of mounted soldiers beside fleeing civilians, "
+ "burning village ruins, and aircraft overhead."
+ )
+
+ gate = build_blind_relation_gate(
+ lock,
+ {
+ "primary_reading": "Mounted soldiers flank civilians and guide them away from burning ruins.",
+ "apparent_relations": ["mounted soldiers guiding civilians away from burning ruins"],
+ "ambiguous_readings": [],
+ },
+ )
+
+ assert gate["blind_relation_decision"] == "pass"
+```
+
+- [ ] **Step 2: Run tests and verify RED**
+
+Run:
+
+```bash
+PYTHONPATH=src pytest tests/test_content_lock.py -k "blind_relation" -q
+```
+
+Expected: fail because `build_blind_relation_gate` and `build_blind_relation_read_prompt` are not defined.
+
+- [ ] **Step 3: Implement minimal helpers**
+
+Add functions to `src/vulca/content_lock.py`:
+
+```python
+def build_blind_relation_read_prompt(lock: ContentLock | dict[str, Any]) -> str:
+ content_lock = content_lock_from_dict(lock) if isinstance(lock, dict) else lock
+ if not content_lock.required_relations:
+ return ""
+ return "\n".join(
+ [
+ "BLIND IMAGE RELATION READ:",
+ "Describe only what is visible in the image. Do not use any external caption, prompt, sample id, filename, or expected story.",
+ "Focus on visible relationships among people, animals, vehicles, objects, threats, movement direction, gaze, weapons, gestures, and protection cues.",
+ "Return exactly one JSON object with these fields:",
+ '"visible_entities": [short strings],',
+ '"primary_reading": "one sentence describing the most natural visible relationship reading",',
+ '"apparent_relations": [short subject-relation-object strings visible in the image],',
+ '"threat_cues": [short strings],',
+ '"protective_cues": [short strings],',
+ '"ambiguous_readings": [short strings for plausible alternate readings, or empty list],',
+ '"confidence": number from 0.0 to 1.0.',
+ ]
+ )
+
+
+def build_blind_relation_gate(
+ lock: ContentLock | dict[str, Any],
+ blind_read: dict[str, Any] | None,
+) -> dict[str, Any]:
+ content_lock = content_lock_from_dict(lock) if isinstance(lock, dict) else lock
+ if not content_lock.required_relations:
+ return {"blind_relation_decision": "not_applicable"}
+ if not blind_read:
+ return {
+ "blind_relation_decision": "unavailable",
+ "blind_relation_reason": "blind relation read unavailable",
+ }
+ if blind_read.get("_error"):
+ return {
+ "blind_relation_decision": "unavailable",
+ "blind_relation_reason": str(blind_read.get("_error")),
+ }
+ primary = str(blind_read.get("primary_reading") or "")
+ apparent = _as_string_list(blind_read.get("apparent_relations"))
+ ambiguous = _as_string_list(blind_read.get("ambiguous_readings"))
+ joined = " ".join([primary, *apparent]).lower()
+ forbidden_present = [
+ reading
+ for reading in content_lock.forbidden_readings
+ if _relation_reading_matches(reading, joined)
+ ]
+ decision = "pass"
+ reason = "blind read did not contradict required relations"
+ if forbidden_present:
+ decision = "reject"
+ reason = "blind read matches forbidden relation reading"
+ elif ambiguous:
+ decision = "hold"
+ reason = "blind read is ambiguous for required relations"
+ return {
+ "blind_relation_decision": decision,
+ "blind_relation_reason": reason,
+ "blind_primary_reading": primary,
+ "blind_apparent_relations": apparent,
+ "blind_ambiguous_readings": ambiguous,
+ "blind_forbidden_readings_present": forbidden_present,
+ }
+```
+
+Also add `_relation_reading_matches(reading: str, joined: str) -> bool` with conservative matching for `chasing`, `attacking`, `threatened`.
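+
+A minimal sketch of that matcher, consistent with the full implementation later in this diff (patterns may be tightened during implementation):
+
+```python
+import re
+
+
+def _relation_reading_matches(reading: str, joined: str) -> bool:
+    """Conservatively match one forbidden reading against blind-read text."""
+    normalized = reading.lower()
+    has_soldiers = any(term in joined for term in ("soldier", "rider", "mounted"))
+    has_civilians = any(term in joined for term in ("civilian", "people", "refugee"))
+    if "chasing" in normalized:
+        return has_soldiers and has_civilians and bool(re.search(r"\bchas\w*|\bpursu\w*", joined))
+    if "attacking" in normalized:
+        return has_soldiers and has_civilians and bool(re.search(r"\battack\w*|\bassault\w*", joined))
+    if "threatened" in normalized:
+        return has_soldiers and has_civilians and bool(re.search(r"\bthreat\w*|\bmenac\w*", joined))
+    return normalized in joined
+```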
+
+- [ ] **Step 4: Run tests and verify GREEN**
+
+Run:
+
+```bash
+PYTHONPATH=src pytest tests/test_content_lock.py -k "blind_relation" -q
+```
+
+Expected: all selected tests pass.
+
+## Task 2: Score Cap For Blind Reject/Hold
+
+**Files:**
+- Modify: `src/vulca/content_lock.py`
+- Test: `tests/test_content_lock.py`
+
+- [ ] **Step 1: Write failing test**
+
+Add:
+
+```python
+def test_blind_relation_reject_caps_high_score():
+ result = {
+ "scores": {"L1": 0.95, "L2": 0.92, "L3": 1.0, "L4": 1.0, "L5": 0.94},
+ "weighted_total": 0.965,
+ "rationales": {},
+ }
+ gate = {
+ "blind_relation_decision": "reject",
+ "blind_forbidden_readings_present": ["soldiers chasing civilians"],
+ "blind_primary_reading": "Mounted soldiers appear to chase civilians.",
+ }
+
+ gated = apply_content_fidelity_gate(result, gate)
+
+ assert gated["weighted_total"] == 0.25
+ assert "Blind relation gate rejected image" in gated["rationales"]["content_fidelity"]
+ assert "content_fidelity_failed" in gated["risk_flags"]
+```
+
+- [ ] **Step 2: Run test and verify RED**
+
+Run:
+
+```bash
+PYTHONPATH=src pytest tests/test_content_lock.py::test_blind_relation_reject_caps_high_score -q
+```
+
+Expected: fail because current `apply_content_fidelity_gate` ignores `blind_relation_decision`.
+
+- [ ] **Step 3: Implement score cap**
+
+In `apply_content_fidelity_gate`, read:
+
+```python
+blind_relation_decision = str(gate.get("blind_relation_decision") or "")
+blind_relation_failed = blind_relation_decision in {"reject", "hold"}
+```
+
+Include `blind_relation_failed` in the cap condition and add rationale:
+
+```python
+if blind_relation_failed:
+ rationale_parts.append(
+ f"Blind relation gate rejected image: {gate.get('blind_relation_reason') or blind_relation_decision}"
+ )
+```
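+
+The cap condition itself extends the existing guard, along these lines (other existing failure signals abbreviated):
+
+```python
+content_known_missing = bool(
+    missing_subjects
+    or relation_semantics_failed
+    or blind_relation_failed
+    # ... plus the other failure signals already checked here ...
+)
+if not content_known_missing:
+    return result
+```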
+
+- [ ] **Step 4: Run focused tests**
+
+Run:
+
+```bash
+PYTHONPATH=src pytest tests/test_content_lock.py -k "blind_relation or relation_semantics" -q
+```
+
+Expected: selected tests pass.
+
+## Task 3: VLM Blind Read Integration
+
+**Files:**
+- Modify: `src/vulca/_vlm.py`
+- Test: `tests/test_vlm_prompt.py` or `tests/test_evaluate.py`
+
+- [ ] **Step 1: Write failing integration test**
+
+Patch `litellm.acompletion` with two responses: the normal caption-conditioned score and the blind relation read. Assert the returned `content_fidelity_gate` includes `blind_relation_decision="reject"` when the blind read says pursuit.
+
+```python
+@pytest.mark.asyncio
+async def test_score_image_adds_blind_relation_gate_for_required_relations(monkeypatch):
+ from vulca._vlm import score_image
+ from vulca.content_lock import extract_content_lock
+
+ lock = extract_content_lock(
+ "Wartime illustration of mounted soldiers beside fleeing civilians, "
+ "burning village ruins, and aircraft overhead."
+ )
+
+ normal_response = _completion_response(
+ '{"L1":0.9,"L2":0.9,"L3":0.9,"L4":0.9,"L5":0.9,'
+ '"L1_rationale":"ok","L2_rationale":"ok","L3_rationale":"ok","L4_rationale":"ok","L5_rationale":"ok",'
+ '"missing_required_subjects":[],"missing_required_text_elements":[],"missing_required_surface":[],'
+ '"missing_required_style_attributes":[],"apparent_relations":["caption-conditioned escort"],'
+ '"relation_semantics_failed":false,"forbidden_readings_present":[],'
+ '"forbidden_visual_artifacts":[],"unwanted_visible_text":false,"output_is_artwork_itself":true,'
+ '"risk_flags":[]}'
+ )
+ blind_response = _completion_response(
+ '{"visible_entities":["mounted soldiers","civilians"],'
+ '"primary_reading":"Mounted soldiers appear to chase fleeing civilians.",'
+ '"apparent_relations":["mounted soldiers chasing civilians"],'
+ '"threat_cues":[],"protective_cues":[],"ambiguous_readings":[],"confidence":0.82}'
+ )
+ calls = [normal_response, blind_response]
+
+ async def fake_completion(**kwargs):
+ return calls.pop(0)
+
+ monkeypatch.setattr("litellm.acompletion", fake_completion)
+
+ result = await score_image(
+ img_b64="iVBORw0KGgo=",
+ mime="image/png",
+ subject="track1_0747",
+ tradition="default",
+ api_key="fake-key",
+ content_lock=lock.to_dict(),
+ )
+
+ gate = result["content_fidelity_gate"]
+ assert gate["blind_relation_decision"] == "reject"
+ assert gate["blind_forbidden_readings_present"] == ["soldiers chasing civilians"]
+```
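+
+The test assumes a small `_completion_response` helper that wraps raw text in a LiteLLM-style response object. A minimal sketch (the suite may already define an equivalent):
+
+```python
+from types import SimpleNamespace
+
+
+def _completion_response(text: str) -> SimpleNamespace:
+    """Mimic the pieces of a litellm completion that score_image reads."""
+    message = SimpleNamespace(content=text)
+    choice = SimpleNamespace(message=message, finish_reason="stop")
+    return SimpleNamespace(choices=[choice])
+```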
+
+- [ ] **Step 2: Run test and verify RED**
+
+Run:
+
+```bash
+PYTHONPATH=src pytest tests/test_vlm_prompt.py -k "blind_relation_gate" -q
+```
+
+Expected: fail because `score_image` does not make a blind relation read yet.
+
+- [ ] **Step 3: Implement integration**
+
+In `src/vulca/_vlm.py`, after parsing the normal VLM scoring JSON and before returning data:
+
+```python
+blind_relation_gate = None
+if resolved_content_lock is not None and resolved_content_lock.required_relations:
+ from vulca.content_lock import build_blind_relation_gate
+
+ blind_read = await _blind_relation_read(
+ img_b64=img_b64,
+ mime=mime,
+ api_key=api_key,
+ model=model,
+ api_base=api_base,
+ )
+ blind_relation_gate = build_blind_relation_gate(resolved_content_lock, blind_read)
+```
+
+Merge `blind_relation_gate` into `content_fidelity_gate` when present. Add `_blind_relation_read(...)` that calls LiteLLM with the image and `build_blind_relation_read_prompt`, then parses JSON with `parse_llm_json`. On exception, return `{"_error": str(exc)}` and let `build_blind_relation_gate` decide `unavailable`.
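+
+The failure path degrades softly rather than blocking primary scoring; for a content lock with required relations (as in the tests above), the Task 1 helper should yield:
+
+```python
+# Hypothetical failure payload returned by _blind_relation_read on exception.
+gate = build_blind_relation_gate(lock, {"_error": "VLM timeout"})
+assert gate["blind_relation_decision"] == "unavailable"
+```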
+
+- [ ] **Step 4: Run focused tests**
+
+Run:
+
+```bash
+PYTHONPATH=src pytest tests/test_content_lock.py tests/test_vlm_prompt.py -k "blind_relation or relation_semantics or content_fidelity" -q
+```
+
+Expected: selected tests pass.
+
+## Task 4: Challenge Evaluation Pass
+
+**Files:**
+- Read: `/Users/yhryzy/dev/emoart-130k/submissions/track1_submission.json`
+- Read: `/Users/yhryzy/dev/emoart-130k/submissions/track1_candidate_v2/submission.json`
+- Read/write reports under `/Users/yhryzy/dev/emoart-130k/experiments/track1_champion_quality_review_*_20260510/`
+
+- [ ] **Step 1: Run heuristic full-package scan**
+
+Run:
+
+```bash
+python3 scripts/track1_quality_review.py --image-dir submissions/track1/images --out-dir experiments/track1_champion_quality_review_current_20260510 --heuristic-only
+python3 scripts/track1_quality_review.py --image-dir submissions/track1_candidate_v2/images --out-dir experiments/track1_champion_quality_review_candidate_v2_20260510 --heuristic-only
+```
+
+Expected: each command writes `heuristic_risk_rank.json`, `quality_review_report.json`, and `quality_review_report.md`.
+
+- [ ] **Step 2: Run live VLM review on top-risk samples**
+
+Use the Gemini key from Keychain without printing it:
+
+```bash
+GEMINI_API_KEY="$(security find-generic-password -s affectiveart-gemini-api-key -a gemini -w)" python3 scripts/track1_quality_review.py --image-dir submissions/track1/images --out-dir experiments/track1_champion_quality_review_current_20260510 --model gemini-3-flash-preview --limit 40
+GEMINI_API_KEY="$(security find-generic-password -s affectiveart-gemini-api-key -a gemini -w)" python3 scripts/track1_quality_review.py --image-dir submissions/track1_candidate_v2/images --out-dir experiments/track1_champion_quality_review_candidate_v2_20260510 --model gemini-3-flash-preview --limit 40
+```
+
+Expected: each report summarizes high-priority replacement risks. If the API quota is exhausted, the run should report the exact retry delay.
+
+- [ ] **Step 3: Run 0747 blind relation live dogfood**
+
+After Vulca blind gate is implemented, evaluate:
+
+```bash
+PYTHONPATH=/Users/yhryzy/dev/vulca/.worktrees/caption-fidelity-content-lock-v1/src:$PWD \
+GEMINI_API_KEY="$(security find-generic-password -s affectiveart-gemini-api-key -a gemini -w)" \
+VULCA_VLM_MODEL=gemini/gemini-3-flash-preview \
+python3 scripts/track1_challenger_130k_vulca.py \
+ --sample-id track1_0747 \
+ --out-dir experiments/track1_130k_compiler_gate_v1/blind_relation_gate_0747_dogfood \
+ --force
+```
+
+Expected: the generated create JSON contains a content fidelity gate whose blind relation result does not allow a pursuit/chase image to be accepted.
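+
+Illustratively, the gate fragment inside the create JSON should look like this for a chase-reading image (field values hypothetical; names match this plan's helpers):
+
+```python
+expected_gate_fragment = {
+    "content_fidelity_gate": {
+        "blind_relation_decision": "reject",
+        "blind_relation_reason": "blind read matches forbidden relation reading",
+        "blind_forbidden_readings_present": ["soldiers chasing civilians"],
+    },
+    "risk_flags": ["content_fidelity_failed"],
+    "weighted_total": 0.25,  # capped by apply_content_fidelity_gate
+}
+```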
+
+## Self-Review Checklist
+
+- Spec coverage: covers generic Vulca gate, 0747 false-negative root cause, and challenge-side champion evaluation.
+- Placeholder scan: no TBD/TODO placeholders remain.
+- Type consistency: helper names are stable across tests and implementation snippets: `build_blind_relation_read_prompt`, `build_blind_relation_gate`, `blind_relation_decision`.
+- Submission safety: the plan does not mutate `/Users/yhryzy/dev/emoart-130k/submissions/track1_submission.json`, `/Users/yhryzy/dev/emoart-130k/submissions/track1_submission.zip`, or `/Users/yhryzy/dev/emoart-130k/submissions/track1/images`.
diff --git a/src/vulca/_parse.py b/src/vulca/_parse.py
index 740d5495..578dd7b0 100644
--- a/src/vulca/_parse.py
+++ b/src/vulca/_parse.py
@@ -28,6 +28,10 @@ def parse_llm_json(text: str) -> dict:
# Fix trailing commas before } or ]
text = re.sub(r",\s*([}\]])", r"\1", text)
+ # Fix a rare LLM typo where a key starts with two double quotes:
+ # { "L5": 0.7, ""missing_required_subjects": [] }
+ text = re.sub(r'([,{]\s*)""([A-Za-z_][A-Za-z0-9_]*"\s*:)', r'\1"\2', text)
+
# Fix single quotes → double quotes (careful with apostrophes in text)
# Only replace quotes that look like JSON keys/values
text = re.sub(r"(?<=[\[{,:])\s*'([^']*?)'\s*(?=[,}\]:])", r' "\1"', text)
diff --git a/src/vulca/_vlm.py b/src/vulca/_vlm.py
index 99c90966..a79a2b25 100644
--- a/src/vulca/_vlm.py
+++ b/src/vulca/_vlm.py
@@ -18,6 +18,7 @@
# Token budget: start low, escalate on truncation
_DEFAULT_MAX_TOKENS = 3072
_ESCALATED_MAX_TOKENS = 8192
+_CONTENT_LOCK_MAX_TOKENS = 16384
_MAX_ESCALATION_ATTEMPTS = 1
# Local (Ollama) models consistently emit >3072 tokens for the L1-L5 JSON
@@ -221,32 +222,64 @@ def _build_dynamic_suffix(
return "\n".join(p for p in parts if p)
def _extract_scoring(text: str) -> str:
- """Extract content inside the **last** ... block.
+ """Extract parseable scoring JSON from a two-phase VLM response.
- Implements the two-phase scratchpad protocol: the model writes free-form
- observations in <observation> tags (discarded), then structured JSON in
- <scoring> tags (parsed). Falls back to full text for backward compatibility
- with responses that do not use the tag protocol.
-
- Uses rfind for the last ``</scoring>`` to avoid mis-matching when earlier
- text (e.g. observation or JSON values) accidentally contains ``</scoring>``.
+ Prefer the last valid <scoring> block. If the model omits scoring tags,
+ strip scratchpad observation blocks and return the first balanced JSON
+ object. Falling back this way avoids Gemini scratchpad braces poisoning the
+ generic JSON parser.
"""
- close_tag = ""
- close_idx = text.rfind(close_tag)
- if close_idx == -1:
- return text
- prefix = text[:close_idx]
- open_tag = ""
- # Search backwards for the that yields content starting with '{'
- search_end = len(prefix)
- while True:
- open_idx = prefix.rfind(open_tag, 0, search_end)
- if open_idx == -1:
- return text
- candidate = prefix[open_idx + len(open_tag):].strip()
+ scoring_blocks = list(
+ re.finditer(
+ r"\s*(.*?)\s*",
+ text,
+ flags=re.IGNORECASE | re.DOTALL,
+ )
+ )
+ for match in reversed(scoring_blocks):
+ candidate = match.group(1).strip()
if candidate.startswith("{"):
return candidate
- search_end = open_idx
+
+ without_observation = re.sub(
+ r".*?",
+ "",
+ text,
+ flags=re.IGNORECASE | re.DOTALL,
+ )
+ json_candidate = _first_balanced_json(without_observation)
+ if json_candidate:
+ return json_candidate
+ return text.strip()
+
+
+def _first_balanced_json(text: str) -> str:
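+ """Return the first balanced top-level JSON object in text, or an empty string."""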
+ start = text.find("{")
+ if start < 0:
+ return ""
+
+ depth = 0
+ in_string = False
+ escaped = False
+ for index, char in enumerate(text[start:], start=start):
+ if in_string:
+ if escaped:
+ escaped = False
+ elif char == "\\":
+ escaped = True
+ elif char == '"':
+ in_string = False
+ continue
+
+ if char == '"':
+ in_string = True
+ elif char == "{":
+ depth += 1
+ elif char == "}":
+ depth -= 1
+ if depth == 0:
+ return text[start : index + 1].strip()
+ return ""
def _build_extra_dimensions_prompt(extras: list[dict]) -> str:
@@ -544,6 +577,7 @@ async def score_image(
*,
mode: str = "strict",
model: str = "",
+ content_lock: dict | None = None,
) -> dict:
"""Call Gemini Vision to score an image on L1-L5.
@@ -572,9 +606,21 @@ async def score_image(
{"type": "image_url", "image_url": {"url": f"data:{mime};base64,{img_b64}"}},
]
if subject:
- user_parts.append({"type": "text", "text": f"Subject/context: {subject}"})
+ user_text = f"Subject/context: {subject}"
else:
- user_parts.append({"type": "text", "text": "Evaluate this artwork."})
+ user_text = "Evaluate this artwork."
+ resolved_content_lock = None
+ if content_lock:
+ from vulca.content_lock import (
+ build_content_fidelity_prompt,
+ content_lock_from_dict,
+ )
+
+ resolved_content_lock = content_lock_from_dict(content_lock)
+ fidelity_prompt = build_content_fidelity_prompt(resolved_content_lock)
+ if fidelity_prompt:
+ user_text = f"{user_text}\n\n{fidelity_prompt}"
+ user_parts.append({"type": "text", "text": user_text})
try:
messages = [
@@ -591,10 +637,19 @@ async def score_image(
# Adaptive token budget: cloud models start small (cost-conscious),
# local models start at the escalated budget since tokens are free
- # and Gemma-class models regularly exceed 3072.
- max_tokens = _LOCAL_DEFAULT_MAX_TOKENS if is_local else _DEFAULT_MAX_TOKENS
+ # and Gemma-class models regularly exceed 3072. Content-lock scoring
+ # asks for additional gate fields, so allow one larger final attempt
+ # when the model truncates twice.
+ token_budgets = (
+ [_LOCAL_DEFAULT_MAX_TOKENS]
+ if is_local
+ else [_DEFAULT_MAX_TOKENS, _ESCALATED_MAX_TOKENS]
+ )
+ if content_lock and _CONTENT_LOCK_MAX_TOKENS not in token_budgets:
+ token_budgets.append(_CONTENT_LOCK_MAX_TOKENS)
+ max_tokens = token_budgets[0]
resp = None
- for attempt in range(_MAX_ESCALATION_ATTEMPTS + 1):
+ for attempt, max_tokens in enumerate(token_budgets):
# Local models (Ollama) need longer timeout for first load
timeout = 300 if model.startswith("ollama") else 55
call_kwargs = dict(
@@ -609,14 +664,13 @@ async def score_image(
call_kwargs["api_base"] = api_base
resp = await litellm.acompletion(**call_kwargs)
finish_reason = getattr(resp.choices[0], "finish_reason", "stop")
- if finish_reason == "length" and attempt < _MAX_ESCALATION_ATTEMPTS:
+ if finish_reason == "length" and attempt < len(token_budgets) - 1:
logger.info(
"VLM response truncated (finish_reason=length) at %d tokens; "
"escalating to %d tokens",
max_tokens,
- _ESCALATED_MAX_TOKENS,
+ token_budgets[attempt + 1],
)
- max_tokens = _ESCALATED_MAX_TOKENS
else:
break
@@ -651,6 +705,29 @@ async def score_image(
logger.debug("VLM debug dump failed: %s", _dump_exc)
parsed_json = parse_llm_json(scoring_text)
+ content_fidelity_gate = None
+ if resolved_content_lock is not None:
+ from vulca.content_lock import (
+ build_blind_relation_gate,
+ build_content_fidelity_gate,
+ )
+
+ content_fidelity_gate = build_content_fidelity_gate(
+ resolved_content_lock,
+ parsed_json,
+ )
+ if resolved_content_lock.required_relations:
+ blind_read = await _blind_relation_read(
+ img_b64=img_b64,
+ mime=mime,
+ api_key=api_key,
+ model=model,
+ api_base=api_base,
+ content_lock=resolved_content_lock,
+ )
+ content_fidelity_gate.update(
+ build_blind_relation_gate(resolved_content_lock, blind_read)
+ )
# Use _parse_vlm_response to extract and validate all fields (including extras)
parsed = _parse_vlm_response(parsed_json, extra_keys=extra_keys)
@@ -673,6 +750,8 @@ async def score_image(
data[f"{level}_reference_technique"] = ref_techniques.get(level, "")
# Include risk_flags so _engine.py can read it from the flat dict
data["risk_flags"] = parsed["risk_flags"]
+ if content_fidelity_gate is not None:
+ data["content_fidelity_gate"] = content_fidelity_gate
# Store extra_keys and names in data so _engine.py can split core vs extra
data["_extra_keys"] = extra_keys
data["_extra_names"] = {e["key"]: e["name"] for e in extra_dims[:3]}
@@ -690,3 +769,52 @@ async def score_image(
fallback[f"{level}_observations"] = ""
fallback[f"{level}_reference_technique"] = ""
return fallback
+
+
+async def _blind_relation_read(
+ *,
+ img_b64: str,
+ mime: str,
+ api_key: str,
+ model: str,
+ api_base: str | None,
+ content_lock,
+) -> dict:
+ """Run an image-only relation read without caption or intended-relation anchors."""
+ try:
+ from vulca._parse import parse_llm_json
+ from vulca.content_lock import build_blind_relation_read_prompt
+
+ prompt = build_blind_relation_read_prompt(content_lock)
+ if not prompt:
+ return {}
+
+ user_parts = [
+ {"type": "image_url", "image_url": {"url": f"data:{mime};base64,{img_b64}"}},
+ {"type": "text", "text": prompt},
+ ]
+ call_kwargs = dict(
+ model=model,
+ messages=[
+ {
+ "role": "system",
+ "content": (
+ "You are a strict image-only visual relationship reader. "
+ "Use only visible evidence in the image."
+ ),
+ },
+ {"role": "user", "content": user_parts},
+ ],
+ max_tokens=2048,
+ temperature=0.0,
+ api_key=api_key,
+ timeout=300 if model.startswith("ollama") else 55,
+ )
+ if api_base:
+ call_kwargs["api_base"] = api_base
+ resp = await litellm.acompletion(**call_kwargs)
+ text = resp.choices[0].message.content.strip()
+ return parse_llm_json(text)
+ except Exception as exc:
+ logger.warning("Blind relation read failed: %s", exc)
+ return {"_error": str(exc)}
diff --git a/src/vulca/cli.py b/src/vulca/cli.py
index 54812d6f..0862820a 100644
--- a/src/vulca/cli.py
+++ b/src/vulca/cli.py
@@ -107,6 +107,16 @@ def main(argv: list[str] | None = None) -> None:
create_p.add_argument("--ref-type", default="full", choices=["style", "composition", "full"],
help="Reference type: style, composition, or full")
create_p.add_argument("--colors", default="", help="Hex color palette (comma-separated, e.g. '#C87F4A,#5F8A50')")
+ create_p.add_argument(
+ "--content-lock",
+ action="store_true",
+ help="Treat explicit subjects and visible attributes in the intent as non-negotiable constraints",
+ )
+ create_p.add_argument(
+ "--output-is-artwork-itself",
+ action="store_true",
+ help="Require the output to be the artwork surface itself, not a gallery/photo/mockup display",
+ )
create_p.add_argument("--output", "-o", default="", help="Save generated image to this path (default: ./vulca-.png)")
# traditions command
@@ -843,6 +853,33 @@ def _cmd_create(args: argparse.Namespace) -> None:
node_params: dict[str, dict] = {}
if weights:
node_params["evaluate"] = {"custom_weights": weights}
+ if getattr(args, "content_lock", False) or getattr(args, "output_is_artwork_itself", False):
+ from vulca.content_lock import ContentLock, extract_content_lock
+
+ artifact_boundary = (
+ getattr(args, "output_is_artwork_itself", False)
+ or getattr(args, "content_lock", False)
+ )
+ if getattr(args, "content_lock", False):
+ lock = extract_content_lock(
+ args.intent,
+ output_is_artwork_itself=artifact_boundary,
+ )
+ else:
+ lock = ContentLock(
+ original_intent=" ".join(args.intent.strip().split()),
+ output_is_artwork_itself=artifact_boundary,
+ )
+ if lock.has_requirements or lock.output_is_artwork_itself:
+ lock_data = lock.to_dict()
+ node_params["generate"] = {
+ **node_params.get("generate", {}),
+ "content_lock": lock_data,
+ }
+ node_params["evaluate"] = {
+ **node_params.get("evaluate", {}),
+ "content_lock": lock_data,
+ }
pipeline_input = PipelineInput(
subject=args.subject or args.intent,
@@ -892,6 +929,8 @@ def _cmd_create(args: argparse.Namespace) -> None:
reference=getattr(args, "reference", "") or "",
ref_type=getattr(args, "ref_type", "full") or "full",
colors=getattr(args, "colors", "") or "",
+ content_lock=getattr(args, "content_lock", False),
+ output_is_artwork_itself=getattr(args, "output_is_artwork_itself", False),
)
except Exception as e:
print(f"Error: {e}", file=sys.stderr)
diff --git a/src/vulca/content_lock.py b/src/vulca/content_lock.py
new file mode 100644
index 00000000..ddbb0477
--- /dev/null
+++ b/src/vulca/content_lock.py
@@ -0,0 +1,818 @@
+"""Content-lock helpers for caption-faithful generation and evaluation."""
+
+from __future__ import annotations
+
+import re
+from dataclasses import dataclass, field
+from typing import Any
+
+
+@dataclass(frozen=True)
+class ContentLock:
+ """Explicit user-requested content that should survive style optimization."""
+
+ original_intent: str
+ required_subjects: list[str] = field(default_factory=list)
+ required_text_elements: list[str] = field(default_factory=list)
+ required_surface: list[str] = field(default_factory=list)
+ required_style_attributes: list[str] = field(default_factory=list)
+ required_mood: list[str] = field(default_factory=list)
+ required_relations: list[dict[str, str]] = field(default_factory=list)
+ composition_intent: str = ""
+ forbidden_readings: list[str] = field(default_factory=list)
+ output_is_artwork_itself: bool = False
+
+ @property
+ def has_requirements(self) -> bool:
+ return any(
+ (
+ self.required_subjects,
+ self.required_text_elements,
+ self.required_surface,
+ self.required_style_attributes,
+ self.required_mood,
+ self.required_relations,
+ self.composition_intent,
+ self.forbidden_readings,
+ )
+ )
+
+ def to_dict(self) -> dict[str, object]:
+ return {
+ "original_intent": self.original_intent,
+ "required_subjects": list(self.required_subjects),
+ "required_text_elements": list(self.required_text_elements),
+ "required_surface": list(self.required_surface),
+ "required_style_attributes": list(self.required_style_attributes),
+ "required_mood": list(self.required_mood),
+ "required_relations": [dict(relation) for relation in self.required_relations],
+ "composition_intent": self.composition_intent,
+ "forbidden_readings": list(self.forbidden_readings),
+ "output_is_artwork_itself": self.output_is_artwork_itself,
+ }
+
+
+def content_lock_from_dict(data: dict[str, Any] | ContentLock) -> ContentLock:
+ if isinstance(data, ContentLock):
+ return data
+ allowed = {
+ "original_intent",
+ "required_subjects",
+ "required_text_elements",
+ "required_surface",
+ "required_style_attributes",
+ "required_mood",
+ "required_relations",
+ "composition_intent",
+ "forbidden_readings",
+ "output_is_artwork_itself",
+ }
+ cleaned = {key: value for key, value in data.items() if key in allowed}
+ return ContentLock(
+ original_intent=str(cleaned.get("original_intent") or ""),
+ required_subjects=_as_string_list(cleaned.get("required_subjects")),
+ required_text_elements=_as_string_list(cleaned.get("required_text_elements")),
+ required_surface=_as_string_list(cleaned.get("required_surface")),
+ required_style_attributes=_as_string_list(cleaned.get("required_style_attributes")),
+ required_mood=_as_string_list(cleaned.get("required_mood")),
+ required_relations=_as_relation_list(cleaned.get("required_relations")),
+ composition_intent=str(cleaned.get("composition_intent") or ""),
+ forbidden_readings=_as_string_list(cleaned.get("forbidden_readings")),
+ output_is_artwork_itself=bool(cleaned.get("output_is_artwork_itself")),
+ )
+
+
+def extract_content_lock(
+ intent: str,
+ *,
+ output_is_artwork_itself: bool = True,
+) -> ContentLock:
+ """Extract explicit visual requirements from a short caption-like intent.
+
+ This is intentionally conservative: it locks concrete, named objects and
+ visible text/material requirements, but leaves broad style words to the
+ tradition guidance unless they are clearly phrased as explicit attributes.
+ """
+ text = " ".join(intent.strip().split())
+ lower = text.lower()
+
+ subjects = _extract_subjects(text)
+ subjects = _replace_known_subjects(subjects, lower)
+ subjects.extend(_extract_keyword_subjects(lower))
+ (
+ relation_subjects,
+ required_relations,
+ composition_intent,
+ forbidden_readings,
+ ) = _extract_relation_semantics(lower)
+ subjects.extend(relation_subjects)
+
+ text_elements: list[str] = []
+ if re.search(r"\bcircular calligraphy panel\b", lower):
+ text_elements.append("circular calligraphy panel")
+ elif re.search(r"\bvertical chinese calligraphy\b", lower):
+ text_elements.append("vertical Chinese calligraphy")
+ elif re.search(r"\bcalligraphy along the side\b", lower):
+ text_elements.append("calligraphy along the side")
+ elif re.search(r"\bcalligraphy\b", lower):
+ text_elements.append("calligraphy")
+ if re.search(r"\bred seals?\b", lower):
+ text_elements.append("red seals")
+
+ surface: list[str] = []
+ if re.search(r"\baged paper\b", lower):
+ surface.append("aged paper")
+ if re.search(r"\bgraph paper\b", lower):
+ surface.append("graph paper")
+ if re.search(r"\bpale beige silk ground\b", lower):
+ surface.append("pale beige silk ground")
+ if re.search(r"\bornate pale patterned border\b", lower):
+ surface.append("ornate pale patterned border")
+
+ style_attributes: list[str] = []
+ if re.search(r"\bgongbi vertical hanging scroll\b", lower):
+ style_attributes.append("Gongbi vertical hanging scroll")
+ elif re.search(r"\bvertical hanging scroll\b", lower):
+ style_attributes.append("vertical hanging scroll")
+ if re.search(r"\bgongbi album leaf\b", lower):
+ style_attributes.append("Gongbi album leaf")
+ if re.search(r"\brectangular frame\b", lower):
+ style_attributes.append("rectangular frame")
+ if re.search(r"\bmonochrome pencil style\b", lower):
+ style_attributes.append("monochrome pencil style")
+ if re.search(r"\bdelicate linework\b", lower):
+ style_attributes.append("delicate linework")
+ if re.search(r"\bmuted brown tones\b", lower):
+ style_attributes.append("muted brown tones")
+ if re.search(r"\bsparse brushwork\b", lower):
+ style_attributes.append("sparse brushwork")
+
+ mood: list[str] = []
+ if re.search(r"\bcalm scholarly composition\b", lower):
+ mood.append("calm scholarly composition")
+
+ return ContentLock(
+ original_intent=text,
+ required_subjects=subjects,
+ required_text_elements=_dedupe(text_elements),
+ required_surface=_dedupe(surface),
+ required_style_attributes=_dedupe(style_attributes),
+ required_mood=_dedupe(mood),
+ required_relations=required_relations,
+ composition_intent=composition_intent,
+ forbidden_readings=forbidden_readings,
+ output_is_artwork_itself=output_is_artwork_itself,
+ )
+
+
+def build_content_lock_prompt(lock: ContentLock) -> str:
+ """Build generation instructions that make explicit content non-negotiable."""
+ if not lock.has_requirements and not lock.output_is_artwork_itself:
+ return ""
+
+ lines: list[str] = []
+ if lock.output_is_artwork_itself:
+ lines.extend(_build_artifact_boundary_lines(lock.original_intent))
+
+ if lock.has_requirements:
+ if lines:
+ lines.append("")
+ lines.extend(
+ [
+ "NON-NEGOTIABLE CONTENT REQUIREMENTS:",
+ (
+ "The following requirements come from the user's explicit request "
+ "and must be satisfied before style optimization."
+ ),
+ ]
+ )
+ if lock.required_subjects:
+ lines.append(f"- Required subjects: {', '.join(lock.required_subjects)}.")
+ if lock.required_text_elements:
+ lines.append(
+ f"- Required text/seal elements: {', '.join(lock.required_text_elements)}."
+ )
+ if lock.required_surface:
+ lines.append(f"- Required surface/material: {', '.join(lock.required_surface)}.")
+ if lock.required_style_attributes:
+ lines.append(
+ f"- Required style attributes: {', '.join(lock.required_style_attributes)}."
+ )
+ if lock.required_mood:
+ lines.append(f"- Required mood/composition: {', '.join(lock.required_mood)}.")
+ if lock.required_relations:
+ lines.append("RELATION SEMANTICS REQUIREMENTS:")
+ for relation in lock.required_relations:
+ lines.append(f"- {_format_required_relation(relation)}.")
+ if lock.composition_intent:
+ lines.append(f"COMPOSITION INTENT: {lock.composition_intent}.")
+ if lock.forbidden_readings:
+ lines.append(f"FORBIDDEN RELATION READINGS: {', '.join(lock.forbidden_readings)}.")
+ if lock.has_requirements:
+ lines.append(
+ "Do not replace these subjects with mountains, generic landscapes, "
+ "or unrelated tradition prototypes."
+ )
+ lines.append(
+ "Do not render sample IDs, filenames, watermarks, large labels, gallery "
+ "walls, exhibition labels, framed museum installations, or photographed "
+ "artwork mockups unless the user explicitly requested them."
+ )
+ lines.append(
+ "If any required subject is absent, the image is a failed response even "
+ "if the cultural style is strong."
+ )
+ return "\n".join(lines)
+
+
+def build_content_fidelity_prompt(lock: ContentLock) -> str:
+ """Build VLM scoring instructions for explicit content presence checks."""
+ if not lock.has_requirements and not lock.output_is_artwork_itself:
+ return ""
+
+ lines = [
+ "CONTENT FIDELITY CHECK:",
+ (
+ "Before final scoring, verify whether the artwork visibly contains "
+ "the user's non-negotiable content requirements."
+ ),
+ ]
+ if lock.required_subjects:
+ lines.append(f"- Required subjects: {', '.join(lock.required_subjects)}")
+ if lock.required_text_elements:
+ lines.append(
+ f"- Required text/seal elements: {', '.join(lock.required_text_elements)}"
+ )
+ if lock.required_surface:
+ lines.append(f"- Required surface/material: {', '.join(lock.required_surface)}")
+ if lock.required_style_attributes:
+ lines.append(
+ f"- Required style attributes: {', '.join(lock.required_style_attributes)}"
+ )
+ if lock.required_relations:
+ lines.append(
+ "- Required relations: "
+ f"{'; '.join(_format_required_relation(relation) for relation in lock.required_relations)}"
+ )
+ if lock.composition_intent:
+ lines.append(f"- Required composition intent: {lock.composition_intent}")
+ if lock.forbidden_readings:
+ lines.append(f"- Forbidden relation readings: {', '.join(lock.forbidden_readings)}")
+ if lock.output_is_artwork_itself:
+ lines.append(
+ (
+ "- Required artifact boundary: output_is_artwork_itself must be true; "
+ "the image must be the requested artwork surface, not a photo, "
+ "gallery scene, installation, catalog/mockup, or framed display."
+ )
+ )
+ lines.extend(
+ [
+ (
+ "Also check for forbidden visual artifacts: visible sample IDs, "
+ "filenames, watermarks, large labels, gallery walls, exhibition "
+ "labels, framed museum installations, and photographed artwork mockups."
+ ),
+ "Add these exact fields to the JSON inside :",
+ '"missing_required_subjects": [strings copied exactly from the required subjects list],',
+ '"missing_required_text_elements": [strings copied exactly from the required text/seal list],',
+ '"missing_required_surface": [strings copied exactly from the required surface/material list].',
+ '"missing_required_style_attributes": [strings copied exactly from the required style attributes list],',
+ '"apparent_relations": [short strings describing visible subject-relation-object readings],',
+ '"relation_semantics_failed": true or false,',
+ '"forbidden_readings_present": [strings copied from forbidden relation readings, or close visual readings],',
+ '"forbidden_visual_artifacts": [visible forbidden artifacts, or an empty list].',
+ '"unwanted_visible_text": true or false,',
+ '"output_is_artwork_itself": true or false.',
+ "Use an empty list when every item in a category is visible or no forbidden artifact is present.",
+ ]
+ )
+ return "\n".join(lines)
+
+
+def build_blind_relation_read_prompt(lock: ContentLock | dict[str, Any]) -> str:
+ """Build an image-only relation-reading prompt without caption anchors."""
+ content_lock = content_lock_from_dict(lock) if isinstance(lock, dict) else lock
+ if not content_lock.required_relations:
+ return ""
+
+ return "\n".join(
+ [
+ "BLIND IMAGE RELATION READ:",
+ (
+ "Describe only what is visible in the image. Do not use any "
+ "external prompt, sample id, filename, or expected story."
+ ),
+ (
+ "Focus on visible relationships among people, animals, vehicles, "
+ "objects, threats, movement direction, gaze, weapons, gestures, "
+ "and safety cues."
+ ),
+ "Return exactly one JSON object with these fields:",
+ '"visible_entities": [short strings],',
+ (
+ '"primary_reading": "one sentence describing the most natural '
+ 'visible relationship reading",'
+ ),
+ (
+ '"apparent_relations": [short subject-relation-object strings '
+ 'visible in the image],'
+ ),
+ '"threat_cues": [short strings],',
+ '"safety_cues": [short strings],',
+ (
+ '"ambiguous_readings": [short strings for plausible alternate '
+ 'readings, or empty list],'
+ ),
+ '"confidence": number from 0.0 to 1.0.',
+ ]
+ )
+
+
+def build_blind_relation_gate(
+ lock: ContentLock | dict[str, Any],
+ blind_read: dict[str, Any] | None,
+) -> dict[str, Any]:
+ """Compare image-only relation reading against required relations."""
+ content_lock = content_lock_from_dict(lock) if isinstance(lock, dict) else lock
+ if not content_lock.required_relations:
+ return {"blind_relation_decision": "not_applicable"}
+ if not blind_read:
+ return {
+ "blind_relation_decision": "unavailable",
+ "blind_relation_reason": "blind relation read unavailable",
+ }
+ if blind_read.get("_error"):
+ return {
+ "blind_relation_decision": "unavailable",
+ "blind_relation_reason": str(blind_read.get("_error")),
+ }
+
+ primary = str(blind_read.get("primary_reading") or "")
+ apparent = _as_string_list(blind_read.get("apparent_relations"))
+ ambiguous = _as_string_list(blind_read.get("ambiguous_readings"))
+ joined = " ".join([primary, *apparent]).lower()
+ forbidden_present = [
+ reading
+ for reading in content_lock.forbidden_readings
+ if _relation_reading_matches(reading, joined)
+ ]
+
+ decision = "pass"
+ reason = "blind read did not contradict required relations"
+ has_high_confidence_forbidden = any(
+ reading != "soldiers chasing civilians" for reading in forbidden_present
+ )
+ if forbidden_present and has_high_confidence_forbidden:
+ decision = "reject"
+ reason = "blind read matches forbidden relation reading"
+ elif ambiguous:
+ decision = "hold"
+ reason = "blind read is ambiguous for required relations"
+ elif forbidden_present:
+ decision = "reject"
+ reason = "blind read matches forbidden relation reading"
+
+ return {
+ "blind_relation_decision": decision,
+ "blind_relation_reason": reason,
+ "blind_primary_reading": primary,
+ "blind_apparent_relations": apparent,
+ "blind_ambiguous_readings": ambiguous,
+ "blind_forbidden_readings_present": forbidden_present,
+ }
+
+
+def build_content_fidelity_gate(
+ lock: ContentLock | dict[str, Any],
+ scoring_data: dict[str, Any],
+) -> dict[str, Any]:
+ """Create deterministic gate data from VLM missing-item fields."""
+ content_lock = content_lock_from_dict(lock) if isinstance(lock, dict) else lock
+ apparent_relations = _as_string_list(scoring_data.get("apparent_relations"))
+ forbidden_artifacts = _as_string_list(
+ scoring_data.get("forbidden_visual_artifacts")
+ )
+ inferred_artifacts = _infer_artifacts_from_readings(content_lock, apparent_relations)
+ unwanted_visible_text = _as_optional_bool(scoring_data.get("unwanted_visible_text"))
+ if "unrequested visible text labels" in inferred_artifacts:
+ unwanted_visible_text = True
+ return {
+ "required_subjects": list(content_lock.required_subjects),
+ "missing_required_subjects": _as_string_list(
+ scoring_data.get("missing_required_subjects")
+ ),
+ "required_text_elements": list(content_lock.required_text_elements),
+ "missing_required_text_elements": _as_string_list(
+ scoring_data.get("missing_required_text_elements")
+ ),
+ "required_surface": list(content_lock.required_surface),
+ "missing_required_surface": _as_string_list(
+ scoring_data.get("missing_required_surface")
+ ),
+ "required_style_attributes": list(content_lock.required_style_attributes),
+ "missing_required_style_attributes": _as_string_list(
+ scoring_data.get("missing_required_style_attributes")
+ ),
+ "required_relations": [
+ dict(relation) for relation in content_lock.required_relations
+ ],
+ "apparent_relations": apparent_relations,
+ "relation_semantics_failed": _as_optional_bool(
+ scoring_data.get("relation_semantics_failed")
+ ),
+ "forbidden_readings": list(content_lock.forbidden_readings),
+ "forbidden_readings_present": _as_string_list(
+ scoring_data.get("forbidden_readings_present")
+ ),
+ "forbidden_visual_artifacts": _dedupe([*forbidden_artifacts, *inferred_artifacts]),
+ "required_output_is_artwork_itself": content_lock.output_is_artwork_itself,
+ "output_is_artwork_itself": _as_optional_bool(
+ scoring_data.get("output_is_artwork_itself")
+ ),
+ "unwanted_visible_text": unwanted_visible_text,
+ }
+
+
+def apply_content_fidelity_gate(result: dict[str, Any], gate: dict[str, Any]) -> dict[str, Any]:
+ """Cap high scores when required caption content is known missing."""
+ missing_subjects = _as_string_list(gate.get("missing_required_subjects"))
+ missing_text = _as_string_list(gate.get("missing_required_text_elements"))
+ missing_surface = _as_string_list(gate.get("missing_required_surface"))
+ missing_style = _as_string_list(gate.get("missing_required_style_attributes"))
+ relation_semantics_failed = (
+ _as_optional_bool(gate.get("relation_semantics_failed")) is True
+ )
+ forbidden_readings_present = _as_string_list(
+ gate.get("forbidden_readings_present")
+ )
+ forbidden_artifacts = _as_string_list(gate.get("forbidden_visual_artifacts"))
+ required_artwork_itself = bool(gate.get("required_output_is_artwork_itself"))
+ output_is_artwork_itself = gate.get("output_is_artwork_itself")
+ unwanted_visible_text = gate.get("unwanted_visible_text")
+ artifact_boundary_failed = (
+ required_artwork_itself and output_is_artwork_itself is False
+ )
+ unwanted_text_failed = unwanted_visible_text is True
+ blind_relation_decision = str(gate.get("blind_relation_decision") or "")
+ blind_relation_failed = blind_relation_decision in {"reject", "hold"}
+
+ if not (
+ missing_subjects
+ or missing_text
+ or missing_surface
+ or missing_style
+ or relation_semantics_failed
+ or forbidden_readings_present
+ or forbidden_artifacts
+ or artifact_boundary_failed
+ or unwanted_text_failed
+ or blind_relation_failed
+ ):
+ return result
+
+ updated = dict(result)
+ scores = dict(updated.get("scores") or {})
+ for key in ("L1", "L3", "L4", "L5"):
+ scores[key] = min(float(scores.get(key, 0.0)), 0.25)
+ updated["scores"] = scores
+ updated["weighted_total"] = min(float(updated.get("weighted_total", 0.0)), 0.25)
+
+ rationale_parts: list[str] = []
+ if missing_subjects:
+ rationale_parts.append(f"Missing required subjects: {', '.join(missing_subjects)}")
+ if missing_text:
+ rationale_parts.append(
+ f"Missing required text elements: {', '.join(missing_text)}"
+ )
+ if missing_surface:
+ rationale_parts.append(
+ f"Missing required surface/material: {', '.join(missing_surface)}"
+ )
+ if missing_style:
+ rationale_parts.append(
+ f"Missing required style attributes: {', '.join(missing_style)}"
+ )
+ if relation_semantics_failed:
+ rationale_parts.append("Relation semantics failed")
+ if forbidden_readings_present:
+ rationale_parts.append(
+ f"Forbidden relation readings: {', '.join(forbidden_readings_present)}"
+ )
+ if forbidden_artifacts:
+ rationale_parts.append(
+ f"Forbidden visual artifacts: {', '.join(forbidden_artifacts)}"
+ )
+ if artifact_boundary_failed:
+ rationale_parts.append("Output is not the artwork itself")
+ if unwanted_text_failed:
+ rationale_parts.append("Unwanted visible text")
+ if blind_relation_failed:
+ rationale_parts.append(
+ "Blind relation gate rejected image: "
+ f"{gate.get('blind_relation_reason') or blind_relation_decision}"
+ )
+ rationales = dict(updated.get("rationales") or {})
+ rationales["content_fidelity"] = "; ".join(rationale_parts)
+ updated["rationales"] = rationales
+
+ risk_flags = list(updated.get("risk_flags") or [])
+ if "content_fidelity_failed" not in risk_flags:
+ risk_flags.append("content_fidelity_failed")
+ updated["risk_flags"] = risk_flags
+ updated["content_fidelity_gate"] = dict(gate)
+ return updated
+
+
+def _extract_subjects(text: str) -> list[str]:
+ match = re.search(
+ r"\b(?:of|showing|featuring|depicting)\s+(.+?)(?:\s+"
+ r"(?:beside|with|on|under|against|in|at|near|over|around)\b|,\s+with\b|$)",
+ text,
+ flags=re.IGNORECASE,
+ )
+ if not match:
+ return []
+
+ segment = match.group(1)
+ pieces = re.split(r",\s*(?:and\s+)?|\s+and\s+", segment)
+ subjects = []
+ for piece in pieces:
+ normalized = _clean_subject(piece)
+ if normalized:
+ subjects.append(normalized)
+ return _dedupe(subjects)
+
+
+def _infer_artifacts_from_readings(
+ lock: ContentLock,
+ apparent_relations: list[str],
+) -> list[str]:
+ """Infer obvious artifact-boundary failures from VLM free-text readings."""
+ joined = " ".join(apparent_relations).lower()
+ artifacts: list[str] = []
+ if re.search(
+ r"\b(meme|memes|social|icon|icons|qr|chat|ui|interface|screen|app|"
+ r"notification|overlay|collage)\b",
+ joined,
+ ):
+ artifacts.append("modern UI/collage artifacts")
+ if _has_unrequested_text_label_reading(lock, joined):
+ artifacts.append("unrequested visible text labels")
+ return artifacts
+
+
+def _has_unrequested_text_label_reading(lock: ContentLock, joined: str) -> bool:
+ if not re.search(
+ r"\b(english|label|labels|text|caption|captions|metadata|acquisition|"
+ r"condition report|concept|concepts|speech bubble)\b",
+ joined,
+ ):
+ return False
+ if re.search(
+ r"\b(english|metadata|acquisition|condition report|concept|concepts|"
+ r"speech bubble|sample id|filename)\b",
+ joined,
+ ):
+ return True
+
+ allowed_text_terms = [
+ *lock.required_text_elements,
+ "cyrillic" if "cyrillic" in lock.original_intent.lower() else "",
+ "calligraphy" if "calligraphy" in lock.original_intent.lower() else "",
+ "lettering" if "lettering" in lock.original_intent.lower() else "",
+ ]
+ normalized_allowed = [term.lower() for term in allowed_text_terms if term]
+ if normalized_allowed and any(term in joined for term in normalized_allowed):
+ return False
+ return True
+
+
+def _replace_known_subjects(subjects: list[str], lower: str) -> list[str]:
+ known: list[str] = []
+ if re.search(r"\bbamboo\b", lower):
+ known.append("bamboo")
+ if re.search(r"\borchid grasses?\b", lower):
+ known.append("orchid grasses")
+ elif re.search(r"\borchids?\b", lower):
+ known.append("orchids")
+ if known:
+ remaining = [
+ subject
+ for subject in subjects
+ if "bamboo" not in subject and "orchid" not in subject
+ ]
+ return _dedupe([*known, *remaining])
+ return subjects
+
+
+def _build_artifact_boundary_lines(intent: str) -> list[str]:
+ lower = intent.lower()
+ lines = [
+ "ARTIFACT BOUNDARY REQUIREMENT:",
+ (
+ "The output image must be the artwork itself, not a photograph or "
+ "display of the artwork."
+ ),
+ (
+ "Fill the entire canvas with the requested poster/scroll/album/artwork "
+ "surface."
+ ),
+ (
+ "Do not include gallery walls, museum displays, framed mockups, "
+ "installation views, catalog layouts, UI screens, QR codes, filename "
+ "labels, sample IDs, watermarks, or unrequested readable text."
+ ),
+ "Do not show the artwork as an object in a room.",
+ ]
+ if re.search(r"\bposter\b|\bpropaganda poster\b", lower):
+ lines.append(
+ "Render a flat, front-facing propaganda poster artwork that fills the canvas."
+ )
+ lines.append("Do not render a poster hanging on a wall or photographed in a room.")
+ if re.search(r"\bscroll\b|\balbum leaf\b|\balbum-leaf\b", lower):
+ lines.append("Render the scroll/album-leaf artwork as the primary image surface.")
+ lines.append(
+ "Do not render a gallery wall, catalog spread, side-by-side detail mockup, "
+ "or framed display."
+ )
+ return lines
+
+
+def _extract_keyword_subjects(lower: str) -> list[str]:
+ subjects: list[str] = []
+ for pattern, label in (
+ (r"\blotus blossoms?\b", "lotus blossoms"),
+ (r"\bslender stems?\b", "slender stems"),
+ (r"\bsmall leaves\b", "small leaves"),
+ (r"\bdense tree-like network\b", "dense tree-like network"),
+ (r"\bsmall heart\b", "small heart"),
+ (r"\bgeometric marks\b", "geometric marks"),
+ (r"\bsparse branches\b", "sparse branches"),
+ (r"\bworkers?\b", "workers"),
+ (r"\bred banners?\b", "red banners"),
+ ):
+ if re.search(pattern, lower):
+ subjects.append(label)
+ if re.search(r"\bhand-drawn branching lines\b", lower):
+ subjects.append("hand-drawn branching lines")
+ elif re.search(r"\bbranching lines\b", lower):
+ subjects.append("branching lines")
+ if re.search(r"\bfinely detailed bird\b", lower):
+ subjects.append("finely detailed bird")
+ elif re.search(r"\bsmall bird\b", lower):
+ subjects.append("small bird")
+ elif re.search(r"\bbird\b", lower):
+ subjects.append("bird")
+ return _dedupe(subjects)
+
+
+def _extract_relation_semantics(
+ lower: str,
+) -> tuple[list[str], list[dict[str, str]], str, list[str]]:
+ """Extract conservative subject-relation-object locks from narrative captions."""
+ has_mounted_soldiers = bool(
+ re.search(r"\bmounted(?:\s+[a-z-]+){0,3}\s+soldiers?\b", lower)
+ )
+ has_fleeing_civilians = bool(
+ re.search(r"\b(?:fleeing|evacuating|displaced)\s+civilians?\b", lower)
+ or re.search(r"\bcivilians?\s+(?:as\s+they\s+)?(?:flee|evacuate)\b", lower)
+ or re.search(r"\bcivilians?\s+(?:fleeing|evacuating|displaced)\b", lower)
+ )
+ has_burning_village_ruins = bool(
+ re.search(r"\bburning village ruins?\b|\bburning villages?\b", lower)
+ )
+ has_aircraft_overhead = bool(
+ re.search(r"\baircraft overhead\b|\baircraft\b|\bplanes? overhead\b", lower)
+ )
+
+ if not (has_mounted_soldiers and has_fleeing_civilians and has_burning_village_ruins):
+ return [], [], "", []
+
+ subjects = [
+ "mounted soldiers",
+ "fleeing civilians",
+ "burning village ruins",
+ ]
+ relations = [
+ {
+ "subject": "mounted soldiers",
+ "relation": "escort/protect",
+ "object": "fleeing civilians",
+ },
+ {
+ "subject": "fleeing civilians",
+ "relation": "evacuate_from",
+ "object": "burning village ruins",
+ },
+ ]
+ composition_intent = (
+ "mounted soldiers must read as escort/protect figures for fleeing "
+ "civilians while the civilians evacuate from burning village ruins"
+ )
+ if has_aircraft_overhead:
+ subjects.append("aircraft overhead")
+ relations.append(
+ {
+ "subject": "aircraft overhead",
+ "relation": "overhead_threat_or_wartime_context",
+ "object": "scene",
+ }
+ )
+ composition_intent += (
+ " and the aircraft overhead reads as wartime threat/context"
+ )
+
+ forbidden_readings = [
+ "soldiers chasing civilians",
+ "soldiers attacking civilians",
+ "civilians threatened by soldiers",
+ ]
+ return subjects, relations, composition_intent, forbidden_readings
+
+
+def _format_required_relation(relation: dict[str, str]) -> str:
+ subject = relation.get("subject", "").strip()
+ predicate = relation.get("relation", "").strip()
+ obj = relation.get("object", "").strip()
+ if subject and predicate and obj:
+ return f"{subject} must read as {predicate} {obj}"
+ if subject and predicate:
+ return f"{subject} must read as {predicate}"
+ return subject or predicate or obj
+
+
+def _clean_subject(value: str) -> str:
+ value = value.strip(" .,:;")
+ value = re.sub(r"^(?:a|an|the)\s+", "", value, flags=re.IGNORECASE)
+ value = re.sub(r"\b(?:delicate|detailed|high-quality)\s+", "", value, flags=re.IGNORECASE)
+ return " ".join(value.split())
+
+
+def _as_string_list(value: Any) -> list[str]:
+ if not value:
+ return []
+ if isinstance(value, str):
+ return [value]
+ if isinstance(value, (list, tuple)):
+ return [str(item) for item in value if str(item).strip()]
+ return []
+
+
+def _as_relation_list(value: Any) -> list[dict[str, str]]:
+ if not isinstance(value, (list, tuple)):
+ return []
+ relations: list[dict[str, str]] = []
+ for item in value:
+ if not isinstance(item, dict):
+ continue
+ subject = str(item.get("subject") or "").strip()
+ relation = str(item.get("relation") or "").strip()
+ obj = str(item.get("object") or "").strip()
+ if not (subject and relation and obj):
+ continue
+ relations.append({"subject": subject, "relation": relation, "object": obj})
+ return relations
+
+
+def _as_optional_bool(value: Any) -> bool | None:
+ if value is None:
+ return None
+ if isinstance(value, bool):
+ return value
+ if isinstance(value, str):
+ normalized = value.strip().lower()
+ if normalized in {"true", "yes", "1"}:
+ return True
+ if normalized in {"false", "no", "0"}:
+ return False
+ return None
+
+
+def _relation_reading_matches(reading: str, joined: str) -> bool:
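+ """Conservatively match a forbidden relation reading against blind-read text."""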
+ normalized = reading.lower()
+ has_soldiers = "soldier" in joined or "rider" in joined or "mounted" in joined
+ has_civilians = "civilian" in joined or "people" in joined or "refugee" in joined
+ if "chasing" in normalized:
+ return has_soldiers and has_civilians and re.search(r"\bchas\w*|\bpursu\w*", joined) is not None
+ if "attacking" in normalized:
+ return has_soldiers and has_civilians and re.search(r"\battack\w*|\bassault\w*|\bshoot\w*", joined) is not None
+ if "threatened" in normalized:
+ return has_soldiers and has_civilians and re.search(
+ r"\bthreat\w*|\bmenac\w*|\bbrandish\w*|\bdrawn\s+swords?\b|"
+ r"\bcharge\w*\b|\bweapon\w*",
+ joined,
+ ) is not None
+ return normalized in joined
+
+
+def _dedupe(values: list[str]) -> list[str]:
+ seen: set[str] = set()
+ result: list[str] = []
+ for value in values:
+ key = value.lower()
+ if key in seen:
+ continue
+ seen.add(key)
+ result.append(value)
+ return result
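
Before the consumer-side diffs below, a minimal sketch of how the forbidden-reading matcher above behaves. It assumes `joined` is the lowercased concatenation of the blind read's `primary_reading` and `apparent_relations` (the same shape the gate tests use); the snippet duplicates the relevant checks for illustration rather than importing the private helper.

```python
import re

# Assumed input shape: lowercased join of primary_reading + apparent_relations.
joined = (
    "soldiers on horseback charge forward with drawn swords past "
    "fleeing civilians. soldiers brandish swords. civilians flee from fire."
).lower()

has_soldiers = "soldier" in joined or "rider" in joined or "mounted" in joined
has_civilians = "civilian" in joined or "people" in joined or "refugee" in joined
threat = re.search(r"\bthreat\w*|\bbrandish\w*|\bdrawn\s+swords?\b", joined)

# "civilians threatened by soldiers" matches, so the blind decision is reject.
assert has_soldiers and has_civilians and threat is not None
```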
diff --git a/src/vulca/create.py b/src/vulca/create.py
index 2dcf447d..b09429ed 100644
--- a/src/vulca/create.py
+++ b/src/vulca/create.py
@@ -25,6 +25,8 @@ async def acreate(
reference: str = "",
ref_type: str = "full",
colors: str = "",
+ content_lock: bool = False,
+ output_is_artwork_itself: bool = False,
) -> CreateResult:
"""Create artwork via local pipeline or remote API (async).
@@ -55,6 +57,12 @@ async def acreate(
reference:
Reference image path or base64. Also serves as sketch input --
providers treat both identically as ``reference_image_b64``.
+ content_lock:
+ Treat explicit subjects and visible attributes in the intent as
+ non-negotiable generation and evaluation requirements.
+ output_is_artwork_itself:
+ Treat the requested artwork as the full output surface. Generation and
+ evaluation should reject gallery/display/mockup artifacts.
Returns
-------
@@ -80,6 +88,9 @@ async def acreate(
reference=reference,
ref_type=ref_type,
colors=colors,
+ api_key=api_key,
+ content_lock=content_lock,
+ output_is_artwork_itself=output_is_artwork_itself,
)
return await _create_remote(
intent,
@@ -88,6 +99,8 @@ async def acreate(
provider=provider,
base_url=base_url,
api_key=api_key,
+ content_lock=content_lock,
+ output_is_artwork_itself=output_is_artwork_itself,
)
@@ -104,6 +117,9 @@ async def _create_local(
reference: str = "",
ref_type: str = "full",
colors: str = "",
+ api_key: str = "",
+ content_lock: bool = False,
+ output_is_artwork_itself: bool = False,
) -> CreateResult:
"""Run the slim pipeline engine locally."""
from vulca._image import resolve_image_input
@@ -117,6 +133,31 @@ async def _create_local(
if weights:
node_params["evaluate"] = {"custom_weights": weights}
+ if content_lock or output_is_artwork_itself:
+ from vulca.content_lock import ContentLock, extract_content_lock
+
+ artifact_boundary = output_is_artwork_itself or content_lock
+ if content_lock:
+ lock = extract_content_lock(
+ intent,
+ output_is_artwork_itself=artifact_boundary,
+ )
+ else:
+ lock = ContentLock(
+ original_intent=" ".join(intent.strip().split()),
+ output_is_artwork_itself=artifact_boundary,
+ )
+ if lock.has_requirements or lock.output_is_artwork_itself:
+ lock_data = lock.to_dict()
+ node_params["generate"] = {
+ **node_params.get("generate", {}),
+ "content_lock": lock_data,
+ }
+ node_params["evaluate"] = {
+ **node_params.get("evaluate", {}),
+ "content_lock": lock_data,
+ }
+
# Inject reference/colors into generate node params
gen_params: dict[str, Any] = {}
if reference:
@@ -132,6 +173,7 @@ async def _create_local(
intent=intent,
tradition=tradition or "default",
provider=provider,
+ api_key=api_key,
node_params=node_params,
image_provider=image_provider,
eval_mode=eval_mode,
@@ -163,6 +205,7 @@ async def _create_local(
elif event.payload.get("image_b64"):
best_image_b64 = event.payload["image_b64"]
+ output_dict = output.to_dict()
return CreateResult(
session_id=output.session_id,
mode="create",
@@ -178,12 +221,16 @@ async def _create_local(
rounds=[r.to_dict() for r in output.rounds],
summary=output.summary,
recommendations=output.recommendations,
+ risk_flags=output.risk_flags,
+ content_fidelity_gate=output.content_fidelity_gate,
+ evaluation_source=output.evaluation_source,
+ evaluation_error=output.evaluation_error,
suggestions=suggestions,
deviation_types=deviation_types,
eval_mode=eval_mode,
latency_ms=output.total_latency_ms,
cost_usd=output.total_cost_usd,
- raw=output.to_dict(),
+ raw=output_dict,
)
@@ -195,6 +242,8 @@ async def _create_remote(
provider: str = "nb2",
base_url: str = "",
api_key: str = "",
+ content_lock: bool = False,
+ output_is_artwork_itself: bool = False,
) -> CreateResult:
"""Call remote VULCA API for creation."""
import httpx
@@ -209,6 +258,10 @@ async def _create_remote(
"provider": provider,
"stream": False,
}
+ if content_lock:
+ body["content_lock"] = True
+ if output_is_artwork_itself or content_lock:
+ body["output_is_artwork_itself"] = True
headers = {"Content-Type": "application/json"}
if key:
@@ -235,6 +288,10 @@ async def _create_remote(
rounds=data.get("rounds") or [],
summary=data.get("summary") or "",
recommendations=data.get("recommendations") or [],
+ risk_flags=data.get("risk_flags") or [],
+ content_fidelity_gate=data.get("content_fidelity_gate") or {},
+ evaluation_source=data.get("evaluation_source") or "",
+ evaluation_error=data.get("evaluation_error") or "",
latency_ms=data.get("latency_ms", 0),
cost_usd=data.get("cost_usd", 0.0),
raw=data,
@@ -257,6 +314,8 @@ def create(
reference: str = "",
ref_type: str = "full",
colors: str = "",
+ content_lock: bool = False,
+ output_is_artwork_itself: bool = False,
) -> CreateResult:
"""Create artwork (synchronous wrapper).
@@ -282,6 +341,8 @@ def create(
reference=reference,
ref_type=ref_type,
colors=colors,
+ content_lock=content_lock,
+ output_is_artwork_itself=output_is_artwork_itself,
)
if loop and loop.is_running():
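
A consumer-side sketch of the new flags (mock provider, local mode, mirroring `test_create_accepts_content_lock_argument` later in this diff); the example output comments are illustrative, not guaranteed:

```python
from vulca.create import create

result = create(
    "Socialist Realism propaganda poster with workers and red banners.",
    provider="mock",
    mode="local",
    content_lock=True,
    output_is_artwork_itself=True,
)

# Audit surface introduced by this diff; values depend on the evaluation path.
print(result.risk_flags)             # e.g. ["content_fidelity_failed"]
print(result.content_fidelity_gate)  # gate dict, possibly empty
print(result.evaluation_source, result.evaluation_error)
```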
diff --git a/src/vulca/pipeline/engine.py b/src/vulca/pipeline/engine.py
index bfb85f72..90476637 100644
--- a/src/vulca/pipeline/engine.py
+++ b/src/vulca/pipeline/engine.py
@@ -574,6 +574,10 @@ async def _run_one(name: str, _nodes: dict = node_instances, _ctx: object = ctx)
total_rounds=len(rounds),
total_latency_ms=total_ms,
total_cost_usd=ctx.cost_usd,
+ risk_flags=ctx.get("risk_flags", []),
+ content_fidelity_gate=ctx.get("content_fidelity_gate", {}) or {},
+ evaluation_source=ctx.get("evaluation_source", ""),
+ evaluation_error=ctx.get("evaluation_error", ""),
summary=summary,
original_intent=pipeline_input.intent or pipeline_input.subject,
original_provider=pipeline_input.provider,
diff --git a/src/vulca/pipeline/nodes/evaluate.py b/src/vulca/pipeline/nodes/evaluate.py
index ff6d69e0..5c96f021 100644
--- a/src/vulca/pipeline/nodes/evaluate.py
+++ b/src/vulca/pipeline/nodes/evaluate.py
@@ -35,14 +35,17 @@ async def run(self, ctx: NodeContext) -> dict[str, Any]:
if not img_b64:
logger.warning("EvaluateNode: no image_b64 in context, using mock scores")
result = self._mock_scores(ctx)
- return self._merge_algo_scores(result, algo_scores, algo_covered_dims, weights)
+ merged = self._merge_algo_scores(result, algo_scores, algo_covered_dims, weights)
+ return self._apply_content_fidelity_gate(ctx, merged)
if VLM_SCORING not in provider_capabilities(ctx.provider) or not ctx.api_key:
result = self._mock_scores(ctx)
- return self._merge_algo_scores(result, algo_scores, algo_covered_dims, weights)
+ merged = self._merge_algo_scores(result, algo_scores, algo_covered_dims, weights)
+ return self._apply_content_fidelity_gate(ctx, merged)
result = await self._vlm_scores(ctx, img_b64, img_mime)
- return self._merge_algo_scores(result, algo_scores, algo_covered_dims, weights)
+ merged = self._merge_algo_scores(result, algo_scores, algo_covered_dims, weights)
+ return self._apply_content_fidelity_gate(ctx, merged)
@staticmethod
def _detect_algo_coverage(
@@ -151,6 +154,23 @@ def _get_weights(ctx: NodeContext) -> dict[str, float]:
from vulca.cultural import get_weights
return get_weights(ctx.tradition)
+ @staticmethod
+ def _apply_content_fidelity_gate(
+ ctx: NodeContext,
+ result: dict[str, Any],
+ ) -> dict[str, Any]:
+ node_params = ctx.get("node_params") or {}
+ eval_params = node_params.get("evaluate") or {}
+ gate = result.get("content_fidelity_gate") or eval_params.get(
+ "content_fidelity_gate"
+ )
+ if not gate:
+ return result
+
+ from vulca.content_lock import apply_content_fidelity_gate
+
+ return apply_content_fidelity_gate(result, gate)
+
@staticmethod
def _apply_locked_dimensions(
new_scores: dict[str, float],
@@ -206,6 +226,8 @@ async def _vlm_scores(
from vulca._vlm import score_image
eval_mode = ctx.get("eval_mode", "strict")
+ node_params = ctx.get("node_params") or {}
+ eval_params = node_params.get("evaluate") or {}
data = await score_image(
img_b64=img_b64,
@@ -214,19 +236,22 @@ async def _vlm_scores(
tradition=ctx.tradition,
api_key=ctx.api_key,
mode=eval_mode,
+ content_lock=eval_params.get("content_lock"),
)
# If VLM failed (quota/network error), fall back to mock scores
if data.get("error"):
logger.warning("VLM scoring failed, falling back to mock: %s", data["error"])
- return EvaluateNode._mock_scores(ctx)
+ fallback = EvaluateNode._mock_scores(ctx)
+ fallback["evaluation_source"] = "mock_fallback"
+ fallback["evaluation_error"] = str(data["error"])
+ return fallback
scores = {f"L{i}": data.get(f"L{i}", 0.0) for i in range(1, 6)}
rationales = {
f"L{i}_rationale": data.get(f"L{i}_rationale", "") for i in range(1, 6)
}
- node_params = ctx.get("node_params") or {}
locked_vlm: list[str] = (node_params.get("evaluate") or {}).get("locked_dimensions", [])
previous_vlm: dict[str, float] = ctx.get("scores") or {}
if locked_vlm and previous_vlm:
@@ -238,5 +263,9 @@ async def _vlm_scores(
return {
"scores": scores,
"rationales": rationales,
+ "risk_flags": data.get("risk_flags", []),
"weighted_total": round(weighted_total, 4),
+ "content_fidelity_gate": data.get("content_fidelity_gate"),
+ "evaluation_source": "vlm",
+ "evaluation_error": "",
}
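
The gate precedence in `_apply_content_fidelity_gate` is: a gate emitted by the scoring result wins, otherwise the content-lock gate passed through `node_params["evaluate"]` applies, and with no gate the merged result passes through unchanged. A sketch of the capping behavior, reusing data from the gate tests below:

```python
from vulca.content_lock import apply_content_fidelity_gate

result = {
    "scores": {"L1": 0.95, "L2": 0.92, "L3": 1.0, "L4": 1.0, "L5": 0.94},
    "weighted_total": 0.965,
    "rationales": {},
}
gate = {
    "required_relations": [
        {
            "subject": "mounted soldiers",
            "relation": "escort/protect",
            "object": "fleeing civilians",
        }
    ],
    "relation_semantics_failed": True,
    "forbidden_readings_present": ["soldiers chasing civilians"],
}

gated = apply_content_fidelity_gate(result, gate)
assert gated["weighted_total"] == 0.25  # capped on relation failure
assert "content_fidelity_failed" in gated["risk_flags"]
```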
diff --git a/src/vulca/pipeline/nodes/generate.py b/src/vulca/pipeline/nodes/generate.py
index cec9eb5c..89aa92a4 100644
--- a/src/vulca/pipeline/nodes/generate.py
+++ b/src/vulca/pipeline/nodes/generate.py
@@ -6,6 +6,7 @@
import base64
import hashlib
import logging
+import re
import time
from typing import Any
@@ -52,6 +53,10 @@
}
+def _looks_like_sample_id(value: str) -> bool:
+ return bool(re.fullmatch(r"[a-z][a-z0-9]*[_-]\d{2,}", value.strip(), re.IGNORECASE))
+
+
class GenerateNode(PipelineNode):
"""Generate an image from a text prompt via the Provider Registry.
@@ -91,6 +96,24 @@ async def _provider_generate(
from vulca.providers import get_image_provider
prompt = ctx.get("prompt") or ctx.subject or ctx.intent
+ node_params = ctx.get("node_params") or {}
+ gen_params = node_params.get("generate") or {}
+
+ content_lock_data = gen_params.get("content_lock")
+ if content_lock_data:
+ from vulca.content_lock import (
+ build_content_lock_prompt,
+ content_lock_from_dict,
+ )
+
+ content_lock = content_lock_from_dict(content_lock_data)
+ lock_prompt = build_content_lock_prompt(content_lock)
+ if lock_prompt:
+ original_prompt = ctx.intent or prompt
+ prompt = (
+ f"{lock_prompt}\n\n"
+ f"USER INTENT TO PRESERVE VERBATIM:\n{original_prompt}"
+ )
# Build extra kwargs with cultural guidance + improvement instructions
extra_kwargs: dict[str, Any] = {}
@@ -106,8 +129,6 @@ async def _provider_generate(
# Resolve reference image (top-level or node_params)
ref_b64 = ctx.get("reference_image_b64") or ""
- node_params = ctx.get("node_params") or {}
- gen_params = node_params.get("generate") or {}
if not ref_b64:
ref_b64 = gen_params.get("reference_image_b64", "")
@@ -140,11 +161,15 @@ async def _provider_generate(
provider_name, api_key=ctx.api_key
)
+ subject_for_provider = ctx.subject or ""
+ if content_lock_data and _looks_like_sample_id(subject_for_provider):
+ subject_for_provider = ""
+
result = await asyncio.wait_for(
provider_instance.generate(
prompt,
tradition=ctx.tradition,
- subject=ctx.subject or "",
+ subject=subject_for_provider,
reference_image_b64=ref_b64,
**extra_kwargs,
),
@@ -236,7 +261,10 @@ def _mock_generate(ctx: NodeContext) -> dict[str, Any]:
tradition = ctx.tradition or "default"
bg = GenerateNode._TRADITION_COLORS.get(tradition, "#5F8A50")
tradition_display = tradition.replace("_", " ").title()
- subject_display = (ctx.subject or "Untitled")[:50]
+ subject_value = ctx.subject or "Untitled"
+ if _looks_like_sample_id(subject_value):
+ subject_value = "Untitled"
+ subject_display = subject_value[:50]
# Escape XML special characters
for old, new in [("&", "&amp;"), ("<", "&lt;"), (">", "&gt;"), ('"', "&quot;")]:
subject_display = subject_display.replace(old, new)
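
The sample-id heuristic above is deliberately narrow: a leading alphanumeric token, one `_` or `-` separator, then at least two digits. A standalone check (regex duplicated for illustration):

```python
import re

def looks_like_sample_id(value: str) -> bool:
    # Same pattern as _looks_like_sample_id in generate.py above.
    return bool(re.fullmatch(r"[a-z][a-z0-9]*[_-]\d{2,}", value.strip(), re.IGNORECASE))

assert looks_like_sample_id("track1_0301")
assert looks_like_sample_id("TRACK1-0301")      # case-insensitive
assert not looks_like_sample_id("mounted soldiers")
assert not looks_like_sample_id("track1_3")     # needs two or more digits
```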
diff --git a/src/vulca/pipeline/types.py b/src/vulca/pipeline/types.py
index 70ff59f5..946f537d 100644
--- a/src/vulca/pipeline/types.py
+++ b/src/vulca/pipeline/types.py
@@ -140,6 +140,9 @@ class PipelineOutput:
total_cost_usd: float = 0.0
risk_flags: list[str] = field(default_factory=list)
recommendations: list[str] = field(default_factory=list)
+ content_fidelity_gate: dict[str, Any] = field(default_factory=dict)
+ evaluation_source: str = ""
+ evaluation_error: str = ""
interrupted_at: str = ""
summary: str = ""
# Preserved for HITL resume — original user inputs
@@ -165,6 +168,9 @@ def to_dict(self) -> dict[str, Any]:
"total_cost_usd": self.total_cost_usd,
"risk_flags": self.risk_flags,
"recommendations": self.recommendations,
+ "content_fidelity_gate": self.content_fidelity_gate,
+ "evaluation_source": self.evaluation_source,
+ "evaluation_error": self.evaluation_error,
"interrupted_at": self.interrupted_at,
"summary": self.summary,
"original_intent": self.original_intent,
diff --git a/src/vulca/providers/gemini.py b/src/vulca/providers/gemini.py
index e9206854..82c6db76 100644
--- a/src/vulca/providers/gemini.py
+++ b/src/vulca/providers/gemini.py
@@ -81,6 +81,15 @@ def _build_visible_mask_reference(mask_bytes: bytes) -> bytes:
return buf.getvalue()
+def _iter_image_parts(response: object):
+ candidates = getattr(response, "candidates", None) or []
+ for candidate in candidates:
+ content = getattr(candidate, "content", None)
+ parts = getattr(content, "parts", None) or []
+ for part in parts:
+ yield part
+
+
class GeminiImageProvider:
"""Image generation via Google Gemini API.
@@ -217,19 +226,20 @@ async def _call() -> object:
retryable_check=_is_retryable,
)
- if response.candidates:
- for part in response.candidates[0].content.parts:
- if part.inline_data and part.inline_data.mime_type.startswith("image/"):
- img_b64 = base64.b64encode(part.inline_data.data).decode()
- return ImageResult(
- image_b64=img_b64,
- mime=part.inline_data.mime_type,
- metadata={
- "model": self.model,
- "image_size": image_size,
- "aspect_ratio": aspect_ratio,
- },
- )
+ for part in _iter_image_parts(response):
+ inline_data = getattr(part, "inline_data", None)
+ mime_type = getattr(inline_data, "mime_type", "")
+ if inline_data and mime_type.startswith("image/"):
+ img_b64 = base64.b64encode(inline_data.data).decode()
+ return ImageResult(
+ image_b64=img_b64,
+ mime=mime_type,
+ metadata={
+ "model": self.model,
+ "image_size": image_size,
+ "aspect_ratio": aspect_ratio,
+ },
+ )
# No image data — classify the failure before surfacing the error so
# users get an actionable remediation hint instead of a generic
@@ -375,20 +385,21 @@ async def _call() -> object:
retryable_check=_is_retryable,
)
- if response.candidates:
- for part in response.candidates[0].content.parts:
- if part.inline_data and part.inline_data.mime_type.startswith("image/"):
- img_b64 = base64.b64encode(part.inline_data.data).decode()
- return ImageResult(
- image_b64=img_b64,
- mime=part.inline_data.mime_type,
- metadata={
- "model": self.model,
- "mode": "gemini_mask_adapter",
- "image_size": image_size,
- "aspect_ratio": aspect_ratio,
- },
- )
+ for part in _iter_image_parts(response):
+ inline_data = getattr(part, "inline_data", None)
+ mime_type = getattr(inline_data, "mime_type", "")
+ if inline_data and mime_type.startswith("image/"):
+ img_b64 = base64.b64encode(inline_data.data).decode()
+ return ImageResult(
+ image_b64=img_b64,
+ mime=mime_type,
+ metadata={
+ "model": self.model,
+ "mode": "gemini_mask_adapter",
+ "image_size": image_size,
+ "aspect_ratio": aspect_ratio,
+ },
+ )
prompt_feedback = getattr(response, "prompt_feedback", None)
block_reason = getattr(prompt_feedback, "block_reason", None) if prompt_feedback else None
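
`_iter_image_parts` replaces attribute chains that raise when Gemini returns candidates without `content` or `parts`. A standalone sketch of the same defensive traversal, mirroring the `parts=None` regression test later in this diff:

```python
from types import SimpleNamespace

def iter_image_parts(response):
    # Every hop tolerates a missing or None attribute.
    for candidate in getattr(response, "candidates", None) or []:
        content = getattr(candidate, "content", None)
        for part in getattr(content, "parts", None) or []:
            yield part

empty = SimpleNamespace(
    candidates=[SimpleNamespace(content=SimpleNamespace(parts=None))]
)
assert list(iter_image_parts(empty)) == []  # no AttributeError, just no parts
```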
diff --git a/src/vulca/types.py b/src/vulca/types.py
index 4a237a0f..606f69c2 100644
--- a/src/vulca/types.py
+++ b/src/vulca/types.py
@@ -162,6 +162,18 @@ class CreateResult:
recommendations: list[str] = field(default_factory=list)
"""Actionable recommendations."""
+ risk_flags: list[str] = field(default_factory=list)
+ """Risk and gate flags from evaluation, e.g. content_fidelity_failed."""
+
+ content_fidelity_gate: dict = field(default_factory=dict)
+ """Content-lock/artifact-boundary audit fields used by the final score gate."""
+
+ evaluation_source: str = ""
+ """Scoring source for the final candidate, e.g. vlm, mock, or mock_fallback."""
+
+ evaluation_error: str = ""
+ """Non-empty when scoring fell back after a VLM or parser error."""
+
suggestions: dict[str, str] = field(default_factory=dict)
"""Per-dimension actionable suggestions (L1→suggestion text)."""
diff --git a/tests/test_cli_create_output.py b/tests/test_cli_create_output.py
index 6a334cea..3597fc1c 100644
--- a/tests/test_cli_create_output.py
+++ b/tests/test_cli_create_output.py
@@ -5,6 +5,7 @@
import subprocess
import sys
from pathlib import Path
+from unittest.mock import patch
import pytest
@@ -78,3 +79,41 @@ def test_create_image_is_valid_png(self, tmp_path):
# PNG magic bytes: 89 50 4E 47
data = out_file.read_bytes()
assert data[:4] == b'\x89PNG' or len(data) > 0, "File should be valid PNG or non-empty image"
+
+ def test_create_cli_accepts_content_lock_flag(self, capsys):
+ """create --content-lock should pass content_lock=True to the API."""
+ from vulca.cli import main
+ from vulca.types import CreateResult
+
+ with patch("vulca.create", return_value=CreateResult(session_id="s1")) as mock_create:
+ main([
+ "create",
+ "Ink and wash painting of bamboo beside calligraphy.",
+ "--content-lock",
+ "--provider",
+ "mock",
+ "--json",
+ ])
+
+ captured = capsys.readouterr()
+ assert '"session_id": "s1"' in captured.out
+ assert mock_create.call_args.kwargs["content_lock"] is True
+
+ def test_create_cli_accepts_output_is_artwork_itself_flag(self, capsys):
+ """create --output-is-artwork-itself should pass the artifact-boundary flag."""
+ from vulca.cli import main
+ from vulca.types import CreateResult
+
+ with patch("vulca.create", return_value=CreateResult(session_id="s1")) as mock_create:
+ main([
+ "create",
+ "Socialist Realism propaganda poster with workers.",
+ "--output-is-artwork-itself",
+ "--provider",
+ "mock",
+ "--json",
+ ])
+
+ captured = capsys.readouterr()
+ assert '"session_id": "s1"' in captured.out
+ assert mock_create.call_args.kwargs["output_is_artwork_itself"] is True
diff --git a/tests/test_content_lock.py b/tests/test_content_lock.py
new file mode 100644
index 00000000..e510b0ef
--- /dev/null
+++ b/tests/test_content_lock.py
@@ -0,0 +1,624 @@
+from __future__ import annotations
+
+from vulca.content_lock import (
+ ContentLock,
+ apply_content_fidelity_gate,
+ build_blind_relation_gate,
+ build_blind_relation_read_prompt,
+ build_content_fidelity_gate,
+ build_content_fidelity_prompt,
+ build_content_lock_prompt,
+ extract_content_lock,
+)
+
+
+def test_extracts_required_subjects_and_attributes_from_caption():
+ lock = extract_content_lock(
+ "Ink and wash painting of delicate bamboo and orchid grasses beside "
+ "vertical Chinese calligraphy and red seals on aged paper, with sparse "
+ "brushwork and a calm scholarly composition."
+ )
+
+ assert lock.required_subjects == ["bamboo", "orchid grasses"]
+ assert lock.required_text_elements == ["vertical Chinese calligraphy", "red seals"]
+ assert lock.required_surface == ["aged paper"]
+ assert "sparse brushwork" in lock.required_style_attributes
+ assert "calm scholarly composition" in lock.required_mood
+
+
+def test_extracts_generic_required_subjects_from_caption():
+ lock = extract_content_lock(
+ "Editorial illustration of a silver astronaut, cracked moon rover, and "
+ "orange emergency flare under a black sky."
+ )
+
+ assert lock.required_subjects == [
+ "silver astronaut",
+ "cracked moon rover",
+ "orange emergency flare",
+ ]
+
+
+def test_extracts_declarative_graph_paper_branching_caption():
+ lock = extract_content_lock(
+ "Abstract hand-drawn branching lines fill a rectangular frame on graph "
+ "paper, forming a dense tree-like network with a small heart and "
+ "geometric marks in monochrome pencil style."
+ )
+
+ assert "hand-drawn branching lines" in lock.required_subjects
+ assert "dense tree-like network" in lock.required_subjects
+ assert "small heart" in lock.required_subjects
+ assert "geometric marks" in lock.required_subjects
+ assert lock.required_surface == ["graph paper"]
+ assert "rectangular frame" in lock.required_style_attributes
+ assert "monochrome pencil style" in lock.required_style_attributes
+
+
+def test_extracts_gongbi_album_leaf_subjects_and_format():
+ lock = extract_content_lock(
+ "Gongbi album leaf with a small bird perched beside sparse branches, "
+ "a circular calligraphy panel, and an ornate pale patterned border."
+ )
+
+ assert "small bird" in lock.required_subjects
+ assert "sparse branches" in lock.required_subjects
+ assert "circular calligraphy panel" in lock.required_text_elements
+ assert "ornate pale patterned border" in lock.required_surface
+ assert "Gongbi album leaf" in lock.required_style_attributes
+
+
+def test_extracts_relation_semantics_for_escort_evacuation_caption():
+ lock = extract_content_lock(
+ "Wartime illustration of mounted soldiers beside fleeing civilians, "
+ "burning village ruins, and aircraft overhead."
+ )
+
+ assert "mounted soldiers" in lock.required_subjects
+ assert "fleeing civilians" in lock.required_subjects
+ assert "burning village ruins" in lock.required_subjects
+ assert "aircraft overhead" in lock.required_subjects
+ assert lock.required_relations == [
+ {
+ "subject": "mounted soldiers",
+ "relation": "escort/protect",
+ "object": "fleeing civilians",
+ },
+ {
+ "subject": "fleeing civilians",
+ "relation": "evacuate_from",
+ "object": "burning village ruins",
+ },
+ {
+ "subject": "aircraft overhead",
+ "relation": "overhead_threat_or_wartime_context",
+ "object": "scene",
+ },
+ ]
+ assert "soldiers chasing civilians" in lock.forbidden_readings
+ assert "escort/protect" in lock.composition_intent
+
+
+def test_extracts_relation_semantics_with_modifier_between_mounted_and_soldiers():
+ lock = extract_content_lock(
+ "A Socialist Realism poster with mounted Soviet soldiers escorting and "
+ "protecting civilians as they flee burning village ruins, aircraft overhead."
+ )
+
+ assert "mounted soldiers" in lock.required_subjects
+ assert "fleeing civilians" in lock.required_subjects
+ assert "aircraft overhead" in lock.required_subjects
+ assert lock.required_relations[0] == {
+ "subject": "mounted soldiers",
+ "relation": "escort/protect",
+ "object": "fleeing civilians",
+ }
+ assert "soldiers chasing civilians" in lock.forbidden_readings
+
+
+def test_content_lock_prompt_makes_subjects_non_negotiable():
+ lock = extract_content_lock(
+ "Ink and wash painting of delicate bamboo and orchid grasses beside "
+ "vertical Chinese calligraphy and red seals on aged paper."
+ )
+
+ prompt = build_content_lock_prompt(lock)
+
+ assert "NON-NEGOTIABLE CONTENT REQUIREMENTS" in prompt
+ assert "bamboo" in prompt
+ assert "orchid grasses" in prompt
+ assert "vertical Chinese calligraphy" in prompt
+ assert "red seals" in prompt
+ assert (
+ "Do not replace these subjects with mountains, generic landscapes, "
+ "or unrelated tradition prototypes."
+ ) in prompt
+
+
+def test_content_lock_prompt_bans_visible_ids_and_gallery_artifacts():
+ lock = extract_content_lock(
+ "Abstract hand-drawn branching lines fill a rectangular frame on graph "
+ "paper in monochrome pencil style."
+ )
+
+ prompt = build_content_lock_prompt(lock)
+
+ assert "sample IDs" in prompt
+ assert "gallery" in prompt.lower()
+ assert "large labels" in prompt
+
+
+def test_content_lock_prompt_makes_relations_non_negotiable():
+ lock = extract_content_lock(
+ "Wartime illustration of mounted soldiers beside fleeing civilians, "
+ "burning village ruins, and aircraft overhead."
+ )
+
+ prompt = build_content_lock_prompt(lock)
+
+ assert "RELATION SEMANTICS REQUIREMENTS" in prompt
+ assert "mounted soldiers must read as escort/protect fleeing civilians" in prompt
+ assert "fleeing civilians must read as evacuate_from burning village ruins" in prompt
+ assert "COMPOSITION INTENT" in prompt
+ assert "FORBIDDEN RELATION READINGS" in prompt
+ assert "soldiers chasing civilians" in prompt
+
+
+def test_blind_relation_prompt_does_not_anchor_on_caption_or_forbidden_reading():
+ lock = extract_content_lock(
+ "Wartime illustration of mounted soldiers beside fleeing civilians, "
+ "burning village ruins, and aircraft overhead."
+ )
+
+ prompt = build_blind_relation_read_prompt(lock)
+
+ assert "caption" not in prompt.lower()
+ assert "escort" not in prompt.lower()
+ assert "protect" not in prompt.lower()
+ assert "soldiers chasing civilians" not in prompt.lower()
+ assert "visible relationships" in prompt.lower()
+
+
+def test_blind_relation_gate_rejects_forbidden_primary_reading():
+ lock = extract_content_lock(
+ "Wartime illustration of mounted soldiers beside fleeing civilians, "
+ "burning village ruins, and aircraft overhead."
+ )
+
+ gate = build_blind_relation_gate(
+ lock,
+ {
+ "primary_reading": "Mounted soldiers appear to chase fleeing civilians.",
+ "apparent_relations": ["mounted soldiers chasing civilians"],
+ "ambiguous_readings": [],
+ },
+ )
+
+ assert gate["blind_relation_decision"] == "reject"
+ assert "soldiers chasing civilians" in gate["blind_forbidden_readings_present"]
+
+
+def test_blind_relation_gate_rejects_weapon_threat_to_civilians():
+ lock = extract_content_lock(
+ "Wartime illustration of mounted soldiers beside fleeing civilians, "
+ "burning village ruins, and aircraft overhead."
+ )
+
+ gate = build_blind_relation_gate(
+ lock,
+ {
+ "primary_reading": (
+ "Soldiers on horseback charge forward with drawn swords past "
+ "fleeing civilians."
+ ),
+ "apparent_relations": [
+ "soldiers brandish swords",
+ "civilians flee from fire",
+ ],
+ "ambiguous_readings": [],
+ },
+ )
+
+ assert gate["blind_relation_decision"] == "reject"
+ assert "civilians threatened by soldiers" in gate[
+ "blind_forbidden_readings_present"
+ ]
+
+
+def test_blind_relation_gate_rejects_weapon_threat_even_with_ambiguity():
+ lock = extract_content_lock(
+ "Wartime illustration of mounted soldiers beside fleeing civilians, "
+ "burning village ruins, and aircraft overhead."
+ )
+
+ gate = build_blind_relation_gate(
+ lock,
+ {
+ "primary_reading": (
+ "Soldiers on horseback charge forward with drawn swords past "
+ "fleeing civilians."
+ ),
+ "apparent_relations": [
+ "soldiers brandish swords",
+ "civilians flee from fire",
+ ],
+ "ambiguous_readings": [
+ "soldiers could be arriving to defend or passing through"
+ ],
+ },
+ )
+
+ assert gate["blind_relation_decision"] == "reject"
+ assert "civilians threatened by soldiers" in gate[
+ "blind_forbidden_readings_present"
+ ]
+
+
+def test_blind_relation_gate_holds_ambiguous_relation_reading():
+ lock = extract_content_lock(
+ "Wartime illustration of mounted soldiers beside fleeing civilians, "
+ "burning village ruins, and aircraft overhead."
+ )
+
+ gate = build_blind_relation_gate(
+ lock,
+ {
+ "primary_reading": "The riders could be escorting or pursuing the civilians.",
+ "apparent_relations": ["riders behind fleeing civilians"],
+ "ambiguous_readings": ["escort or pursuit"],
+ },
+ )
+
+ assert gate["blind_relation_decision"] == "hold"
+ assert gate["blind_ambiguous_readings"] == ["escort or pursuit"]
+
+
+def test_blind_relation_gate_passes_clear_escort_reading():
+ lock = extract_content_lock(
+ "Wartime illustration of mounted soldiers beside fleeing civilians, "
+ "burning village ruins, and aircraft overhead."
+ )
+
+ gate = build_blind_relation_gate(
+ lock,
+ {
+ "primary_reading": (
+ "Mounted soldiers flank civilians and guide them away from burning ruins."
+ ),
+ "apparent_relations": [
+ "mounted soldiers guiding civilians away from burning ruins"
+ ],
+ "ambiguous_readings": [],
+ },
+ )
+
+ assert gate["blind_relation_decision"] == "pass"
+
+
+def test_artifact_boundary_prompt_for_poster_requires_flat_artwork_surface():
+ lock = ContentLock(
+ original_intent="Socialist Realism propaganda poster with workers and red banners.",
+ output_is_artwork_itself=True,
+ )
+
+ prompt = build_content_lock_prompt(lock)
+
+ assert "ARTIFACT BOUNDARY REQUIREMENT" in prompt
+ assert "artwork itself" in prompt
+ assert "flat, front-facing propaganda poster artwork" in prompt
+ assert "poster hanging on a wall" in prompt
+
+
+def test_artifact_boundary_prompt_for_scroll_rejects_catalog_displays():
+ lock = ContentLock(
+ original_intent="A Gongbi vertical hanging scroll with lotus blossoms.",
+ output_is_artwork_itself=True,
+ )
+
+ prompt = build_content_lock_prompt(lock)
+
+ assert "scroll/album-leaf artwork as the primary image surface" in prompt
+ assert "catalog spread" in prompt
+ assert "framed display" in prompt
+
+
+def test_content_fidelity_prompt_requests_missing_elements():
+ lock = extract_content_lock(
+ "Ink and wash painting of bamboo beside vertical Chinese calligraphy."
+ )
+
+ prompt = build_content_fidelity_prompt(lock)
+
+ assert "CONTENT FIDELITY CHECK" in prompt
+ assert "missing_required_subjects" in prompt
+ assert "missing_required_text_elements" in prompt
+ assert "bamboo" in prompt
+ assert "vertical Chinese calligraphy" in prompt
+ assert "forbidden_visual_artifacts" in prompt
+ assert "output_is_artwork_itself" in prompt
+ assert "unwanted_visible_text" in prompt
+
+
+def test_content_fidelity_prompt_requests_relation_semantics_fields():
+ lock = extract_content_lock(
+ "Wartime illustration of mounted soldiers beside fleeing civilians, "
+ "burning village ruins, and aircraft overhead."
+ )
+
+ prompt = build_content_fidelity_prompt(lock)
+
+ assert "Required relations" in prompt
+ assert "Forbidden relation readings" in prompt
+ assert "apparent_relations" in prompt
+ assert "relation_semantics_failed" in prompt
+ assert "forbidden_readings_present" in prompt
+
+
+def test_content_fidelity_prompt_requests_missing_style_attributes():
+ lock = extract_content_lock(
+ "Abstract hand-drawn branching lines fill a rectangular frame on graph "
+ "paper in monochrome pencil style."
+ )
+
+ prompt = build_content_fidelity_prompt(lock)
+
+ assert "missing_required_style_attributes" in prompt
+ assert "rectangular frame" in prompt
+ assert "monochrome pencil style" in prompt
+
+
+def test_missing_required_subject_caps_high_score():
+ result = {
+ "scores": {"L1": 0.95, "L2": 0.92, "L3": 1.0, "L4": 1.0, "L5": 0.94},
+ "weighted_total": 0.965,
+ "rationales": {},
+ }
+ gate = {
+ "required_subjects": ["bamboo", "orchid grasses"],
+ "missing_required_subjects": ["bamboo", "orchid grasses"],
+ }
+
+ gated = apply_content_fidelity_gate(result, gate)
+
+ assert gated["weighted_total"] == 0.25
+ assert gated["scores"]["L3"] <= 0.25
+ assert "content_fidelity_failed" in gated["risk_flags"]
+ assert (
+ "Missing required subjects: bamboo, orchid grasses"
+ in gated["rationales"]["content_fidelity"]
+ )
+
+
+def test_missing_required_text_element_caps_high_score():
+ result = {
+ "scores": {"L1": 0.95, "L2": 0.92, "L3": 1.0, "L4": 1.0, "L5": 0.94},
+ "weighted_total": 0.965,
+ "rationales": {},
+ }
+ gate = {
+ "required_text_elements": ["vertical Chinese calligraphy"],
+ "missing_required_text_elements": ["vertical Chinese calligraphy"],
+ }
+
+ gated = apply_content_fidelity_gate(result, gate)
+
+ assert gated["weighted_total"] == 0.25
+ assert "content_fidelity_failed" in gated["risk_flags"]
+ assert "Missing required text elements" in gated["rationales"]["content_fidelity"]
+
+
+def test_missing_required_style_attribute_caps_high_score():
+ result = {
+ "scores": {"L1": 0.95, "L2": 0.92, "L3": 1.0, "L4": 1.0, "L5": 0.94},
+ "weighted_total": 0.965,
+ "rationales": {},
+ }
+ gate = {
+ "required_style_attributes": ["monochrome pencil style"],
+ "missing_required_style_attributes": ["monochrome pencil style"],
+ }
+
+ gated = apply_content_fidelity_gate(result, gate)
+
+ assert gated["weighted_total"] == 0.25
+ assert "content_fidelity_failed" in gated["risk_flags"]
+ assert "Missing required style attributes" in gated["rationales"]["content_fidelity"]
+
+
+def test_forbidden_visual_artifact_caps_high_score():
+ result = {
+ "scores": {"L1": 0.95, "L2": 0.92, "L3": 1.0, "L4": 1.0, "L5": 0.94},
+ "weighted_total": 0.965,
+ "rationales": {},
+ }
+ gate = {
+ "forbidden_visual_artifacts": ["visible sample ID", "gallery photo mockup"],
+ }
+
+ gated = apply_content_fidelity_gate(result, gate)
+
+ assert gated["weighted_total"] == 0.25
+ assert "content_fidelity_failed" in gated["risk_flags"]
+ assert "Forbidden visual artifacts" in gated["rationales"]["content_fidelity"]
+
+
+def test_artifact_boundary_violation_caps_high_score():
+ result = {
+ "scores": {"L1": 0.95, "L2": 0.92, "L3": 1.0, "L4": 1.0, "L5": 0.94},
+ "weighted_total": 0.965,
+ "rationales": {},
+ }
+ gate = {
+ "required_output_is_artwork_itself": True,
+ "output_is_artwork_itself": False,
+ "unwanted_visible_text": True,
+ }
+
+ gated = apply_content_fidelity_gate(result, gate)
+
+ assert gated["weighted_total"] == 0.25
+ assert "content_fidelity_failed" in gated["risk_flags"]
+ assert "Output is not the artwork itself" in gated["rationales"]["content_fidelity"]
+ assert "Unwanted visible text" in gated["rationales"]["content_fidelity"]
+
+
+def test_content_fidelity_gate_reads_artifact_boundary_fields():
+ lock = ContentLock(
+ original_intent="Graph-paper branching pencil drawing.",
+ output_is_artwork_itself=True,
+ )
+
+ gate = build_content_fidelity_gate(
+ lock,
+ {
+ "forbidden_visual_artifacts": ["gallery wall"],
+ "unwanted_visible_text": True,
+ "output_is_artwork_itself": False,
+ },
+ )
+
+ assert gate["required_output_is_artwork_itself"] is True
+ assert gate["output_is_artwork_itself"] is False
+ assert gate["unwanted_visible_text"] is True
+ assert gate["forbidden_visual_artifacts"] == ["gallery wall"]
+
+
+def test_content_fidelity_gate_infers_modern_ui_text_artifacts():
+ lock = extract_content_lock(
+ "A Socialist Realism propaganda poster of a triumphant armored rider "
+ "on horseback beside Soviet soldiers with rifles, fallen enemies in "
+ "the foreground, and a fortified city rising in the background under "
+ "bold Cyrillic lettering."
+ )
+
+ gate = build_content_fidelity_gate(
+ lock,
+ {
+ "apparent_relations": [
+ "rider-leads-soldiers",
+ "memes-overlay-history",
+ "text-labels-concepts",
+ "social icons frame the city",
+ ],
+ "forbidden_visual_artifacts": [],
+ "unwanted_visible_text": False,
+ "output_is_artwork_itself": True,
+ },
+ )
+
+ assert gate["unwanted_visible_text"] is True
+ assert "modern UI/collage artifacts" in gate["forbidden_visual_artifacts"]
+ assert "unrequested visible text labels" in gate["forbidden_visual_artifacts"]
+
+
+def test_content_fidelity_gate_rejects_english_labels_even_with_allowed_cyrillic():
+ lock = extract_content_lock(
+ "A Socialist Realism propaganda poster of a triumphant armored rider "
+ "under bold Cyrillic lettering."
+ )
+
+ gate = build_content_fidelity_gate(
+ lock,
+ {
+ "apparent_relations": [
+ "bold Cyrillic headline spans the poster",
+ "English text labels explain concepts in the city",
+ ],
+ "forbidden_visual_artifacts": [],
+ "unwanted_visible_text": False,
+ "output_is_artwork_itself": True,
+ },
+ )
+
+ assert gate["unwanted_visible_text"] is True
+ assert "unrequested visible text labels" in gate["forbidden_visual_artifacts"]
+
+
+def test_content_fidelity_gate_reads_relation_semantics_fields():
+ lock = extract_content_lock(
+ "Wartime illustration of mounted soldiers beside fleeing civilians, "
+ "burning village ruins, and aircraft overhead."
+ )
+
+ gate = build_content_fidelity_gate(
+ lock,
+ {
+ "apparent_relations": ["mounted soldiers appear to chase civilians"],
+ "relation_semantics_failed": True,
+ "forbidden_readings_present": ["soldiers chasing civilians"],
+ },
+ )
+
+ assert gate["required_relations"] == lock.required_relations
+ assert gate["apparent_relations"] == ["mounted soldiers appear to chase civilians"]
+ assert gate["relation_semantics_failed"] is True
+ assert gate["forbidden_readings"] == lock.forbidden_readings
+ assert gate["forbidden_readings_present"] == ["soldiers chasing civilians"]
+
+
+def test_relation_semantics_failure_caps_high_score():
+ result = {
+ "scores": {"L1": 0.95, "L2": 0.92, "L3": 1.0, "L4": 1.0, "L5": 0.94},
+ "weighted_total": 0.965,
+ "rationales": {},
+ }
+ gate = {
+ "required_relations": [
+ {
+ "subject": "mounted soldiers",
+ "relation": "escort/protect",
+ "object": "fleeing civilians",
+ }
+ ],
+ "relation_semantics_failed": True,
+ "forbidden_readings_present": ["soldiers chasing civilians"],
+ }
+
+ gated = apply_content_fidelity_gate(result, gate)
+
+ assert gated["weighted_total"] == 0.25
+ assert gated["scores"]["L4"] <= 0.25
+ assert "content_fidelity_failed" in gated["risk_flags"]
+ assert "Relation semantics failed" in gated["rationales"]["content_fidelity"]
+ assert "Forbidden relation readings: soldiers chasing civilians" in gated["rationales"]["content_fidelity"]
+
+
+def test_blind_relation_reject_caps_high_score():
+ result = {
+ "scores": {"L1": 0.95, "L2": 0.92, "L3": 1.0, "L4": 1.0, "L5": 0.94},
+ "weighted_total": 0.965,
+ "rationales": {},
+ }
+ gate = {
+ "blind_relation_decision": "reject",
+ "blind_relation_reason": "blind read matches forbidden relation reading",
+ "blind_forbidden_readings_present": ["soldiers chasing civilians"],
+ "blind_primary_reading": "Mounted soldiers appear to chase civilians.",
+ }
+
+ gated = apply_content_fidelity_gate(result, gate)
+
+ assert gated["weighted_total"] == 0.25
+ assert (
+ "Blind relation gate rejected image"
+ in gated["rationales"]["content_fidelity"]
+ )
+ assert "content_fidelity_failed" in gated["risk_flags"]
+
+
+def test_present_required_subjects_do_not_cap_score():
+ result = {
+ "scores": {"L1": 0.95, "L2": 0.92, "L3": 1.0, "L4": 1.0, "L5": 0.94},
+ "weighted_total": 0.965,
+ "rationales": {},
+ }
+ gate = {
+ "required_subjects": ["bamboo", "orchid grasses"],
+ "missing_required_subjects": [],
+ }
+
+ gated = apply_content_fidelity_gate(result, gate)
+
+ assert gated["weighted_total"] == 0.965
+ assert gated["scores"]["L3"] == 1.0
+ assert gated.get("risk_flags", []) == []
diff --git a/tests/test_create_hitl.py b/tests/test_create_hitl.py
index 86f9e7f1..7b885bec 100644
--- a/tests/test_create_hitl.py
+++ b/tests/test_create_hitl.py
@@ -2,9 +2,13 @@
from __future__ import annotations
+import asyncio
+from unittest.mock import AsyncMock, patch
+
import pytest
-from vulca.create import acreate, create
+from vulca.create import _create_local, acreate, create
+from vulca.pipeline.types import PipelineOutput
from vulca.types import CreateResult
@@ -36,6 +40,53 @@ def test_hitl_sync(self):
assert result.status == "waiting_human"
assert result.interrupted_at == "decide"
+ def test_create_accepts_content_lock_argument(self):
+ result = create(
+ "Ink and wash painting of bamboo beside calligraphy.",
+ provider="mock",
+ mode="local",
+ content_lock=True,
+ )
+
+ assert result.status == "completed"
+
+ def test_create_accepts_output_is_artwork_itself_argument(self):
+ result = create(
+ "Socialist Realism propaganda poster with workers.",
+ provider="mock",
+ mode="local",
+ output_is_artwork_itself=True,
+ )
+
+ assert result.status == "completed"
+
+ def test_create_local_exposes_content_fidelity_audit_fields(self):
+ output = PipelineOutput(
+ session_id="s1",
+ status="completed",
+ final_scores={"L1": 0.25},
+ weighted_total=0.25,
+ risk_flags=["content_fidelity_failed"],
+ content_fidelity_gate={
+ "forbidden_visual_artifacts": ["visible sample ID"],
+ "unwanted_visible_text": True,
+ "output_is_artwork_itself": False,
+ },
+ evaluation_source="mock_fallback",
+ evaluation_error="Could not parse JSON from LLM output",
+ )
+
+ with patch("vulca.pipeline.engine.execute", new=AsyncMock(return_value=output)):
+ result = asyncio.run(_create_local("test artwork", provider="mock"))
+
+ assert result.risk_flags == ["content_fidelity_failed"]
+ assert result.content_fidelity_gate["forbidden_visual_artifacts"] == [
+ "visible sample ID"
+ ]
+ assert result.evaluation_source == "mock_fallback"
+ assert result.evaluation_error == "Could not parse JSON from LLM output"
+ assert result.raw["content_fidelity_gate"] == result.content_fidelity_gate
+
class TestCreateWeights:
"""Custom weights change the weighted_total."""
diff --git a/tests/test_evaluate.py b/tests/test_evaluate.py
index d3e5fbcb..c2efb803 100644
--- a/tests/test_evaluate.py
+++ b/tests/test_evaluate.py
@@ -405,3 +405,142 @@ def test_eval_result_with_skills_to_dict(self):
)
d = asdict(result)
assert d["skills"]["brand"]["score"] == 0.8
+
+
+def test_evaluate_node_applies_vlm_content_fidelity_gate():
+ from vulca.content_lock import extract_content_lock
+ from vulca.pipeline.node import NodeContext
+ from vulca.pipeline.nodes import EvaluateNode
+
+ lock = extract_content_lock(
+ "Ink and wash painting of bamboo beside vertical Chinese calligraphy."
+ )
+ ctx = NodeContext(
+ subject="track1_0002",
+ intent=lock.original_intent,
+ tradition="chinese_xieyi",
+ provider="gemini",
+ api_key="fake-key",
+ )
+ ctx.set("image_b64", "iVBORw0KGgo=")
+ ctx.set("node_params", {"evaluate": {"content_lock": lock.to_dict()}})
+
+ scored = {
+ "L1": 0.95,
+ "L2": 0.92,
+ "L3": 1.0,
+ "L4": 1.0,
+ "L5": 0.94,
+ "L1_rationale": "Strong image.",
+ "L2_rationale": "Strong technique.",
+ "L3_rationale": "Strong style.",
+ "L4_rationale": "Respectful.",
+ "L5_rationale": "Poetic.",
+ "content_fidelity_gate": {
+ "required_subjects": ["bamboo"],
+ "missing_required_subjects": ["bamboo"],
+ "required_text_elements": ["vertical Chinese calligraphy"],
+ "missing_required_text_elements": [],
+ },
+ }
+
+ with patch("vulca._vlm.score_image", new=AsyncMock(return_value=scored)) as mock_score:
+ result = asyncio.run(EvaluateNode().run(ctx))
+
+ assert mock_score.await_args.kwargs["content_lock"] == lock.to_dict()
+ assert result["weighted_total"] == 0.25
+ assert result["scores"]["L3"] == 0.25
+ assert "content_fidelity_failed" in result["risk_flags"]
+
+
+def test_evaluate_node_applies_vlm_relation_semantics_gate():
+ from vulca.content_lock import extract_content_lock
+ from vulca.pipeline.node import NodeContext
+ from vulca.pipeline.nodes import EvaluateNode
+
+ lock = extract_content_lock(
+ "Wartime illustration of mounted soldiers beside fleeing civilians, "
+ "burning village ruins, and aircraft overhead."
+ )
+ ctx = NodeContext(
+ subject="track1_0747",
+ intent=lock.original_intent,
+ tradition="default",
+ provider="gemini",
+ api_key="fake-key",
+ )
+ ctx.set("image_b64", "iVBORw0KGgo=")
+ ctx.set("node_params", {"evaluate": {"content_lock": lock.to_dict()}})
+
+ scored = {
+ "L1": 0.95,
+ "L2": 0.92,
+ "L3": 1.0,
+ "L4": 1.0,
+ "L5": 0.94,
+ "L1_rationale": "Strong image.",
+ "L2_rationale": "Strong technique.",
+ "L3_rationale": "Strong style.",
+ "L4_rationale": "Respectful.",
+ "L5_rationale": "Poetic.",
+ "content_fidelity_gate": {
+ "required_relations": lock.required_relations,
+ "apparent_relations": ["mounted soldiers appear to chase civilians"],
+ "relation_semantics_failed": True,
+ "forbidden_readings_present": ["soldiers chasing civilians"],
+ },
+ }
+
+ with patch("vulca._vlm.score_image", new=AsyncMock(return_value=scored)):
+ result = asyncio.run(EvaluateNode().run(ctx))
+
+ assert result["weighted_total"] == 0.25
+ assert result["scores"]["L4"] == 0.25
+ assert "content_fidelity_failed" in result["risk_flags"]
+ assert "Relation semantics failed" in result["rationales"]["content_fidelity"]
+
+
+def test_evaluate_node_marks_vlm_parse_fallback_explicitly():
+ from vulca.pipeline.node import NodeContext
+ from vulca.pipeline.nodes import EvaluateNode
+
+ ctx = NodeContext(
+ subject="track1_0064",
+ intent="A Gongbi vertical hanging scroll with lotus blossoms.",
+ tradition="chinese_gongbi",
+ provider="gemini",
+ api_key="fake-key",
+ )
+ ctx.set("image_b64", "iVBORw0KGgo=")
+
+ scored = {
+ "error": "Could not parse JSON from LLM output",
+ "L1": 0.0,
+ "L2": 0.0,
+ "L3": 0.0,
+ "L4": 0.0,
+ "L5": 0.0,
+ }
+
+ with patch("vulca._vlm.score_image", new=AsyncMock(return_value=scored)):
+ result = asyncio.run(EvaluateNode().run(ctx))
+
+ assert result["evaluation_source"] == "mock_fallback"
+ assert result["evaluation_error"] == "Could not parse JSON from LLM output"
+
+
+def test_extract_scoring_falls_back_to_first_json_after_scratchpad():
+ from vulca._parse import parse_llm_json
+ from vulca._vlm import _extract_scoring
+
+ raw = """**Phase 1 - Scratchpad**
+
+The model ignored the requested subject. This note includes braces {not json}.
+
+{"L1": 0.2, "L2": 0.2, "L3": 0.1, "L4": 0.1, "L5": 0.2}
+"""
+
+ scoring = _extract_scoring(raw)
+ parsed = parse_llm_json(scoring)
+
+ assert parsed["L3"] == 0.1
diff --git a/tests/test_gemini_image_size.py b/tests/test_gemini_image_size.py
index ec187fa4..8112efdd 100644
--- a/tests/test_gemini_image_size.py
+++ b/tests/test_gemini_image_size.py
@@ -5,6 +5,7 @@
import sys
import types as py_types
+import pytest
from PIL import Image
from vulca.providers.gemini import (
@@ -115,6 +116,43 @@ def test_declares_masked_edit_adapter_capabilities(self):
assert caps.requires_mask_for_edits is True
assert caps.supports_unmasked_edits is False
+ def test_generate_missing_candidate_parts_reports_no_image_data(self, monkeypatch):
+ class FakeImageConfig:
+ def __init__(self, **kwargs):
+ self.kwargs = kwargs
+
+ class FakeGenerateContentConfig:
+ def __init__(self, **kwargs):
+ self.kwargs = kwargs
+
+ class FakeModels:
+ def generate_content(self, *, model, contents, config):
+ return py_types.SimpleNamespace(
+ candidates=[
+ py_types.SimpleNamespace(
+ content=py_types.SimpleNamespace(parts=None)
+ )
+ ],
+ prompt_feedback=None,
+ )
+
+ class FakeClient:
+ def __init__(self, api_key):
+ self.models = FakeModels()
+
+ fake_types = py_types.SimpleNamespace(
+ ImageConfig=FakeImageConfig,
+ GenerateContentConfig=FakeGenerateContentConfig,
+ )
+ fake_genai = py_types.SimpleNamespace(Client=FakeClient, types=fake_types)
+ fake_google = py_types.SimpleNamespace(genai=fake_genai)
+ monkeypatch.setitem(sys.modules, "google", fake_google)
+ monkeypatch.setitem(sys.modules, "google.genai", fake_genai)
+ monkeypatch.setitem(sys.modules, "google.genai.types", fake_types)
+
+ with pytest.raises(RuntimeError, match="Gemini returned no image data"):
+ asyncio.run(GeminiImageProvider(api_key="gemini-key").generate("test"))
+
def test_inpaint_with_mask_sends_source_and_mask_parts(
self,
tmp_path,
diff --git a/tests/test_package.py b/tests/test_package.py
index 4c100274..026d4b70 100644
--- a/tests/test_package.py
+++ b/tests/test_package.py
@@ -423,6 +423,16 @@ def test_parse_llm_json_trailing_comma():
assert result == {"a": 1, "b": 2}
+def test_parse_llm_json_repairs_extra_quote_before_key():
+ from vulca._parse import parse_llm_json
+
+ text = '{"L5": 0.75, ""missing_required_subjects": [], "risk_flags": []}'
+ result = parse_llm_json(text)
+
+ assert result["L5"] == 0.75
+ assert result["missing_required_subjects"] == []
+
+
def test_parse_llm_json_invalid():
from vulca._parse import parse_llm_json
diff --git a/tests/test_pipeline_engine.py b/tests/test_pipeline_engine.py
index 17f3401b..60b114be 100644
--- a/tests/test_pipeline_engine.py
+++ b/tests/test_pipeline_engine.py
@@ -2,6 +2,9 @@
from __future__ import annotations
+import asyncio
+import base64
+
import pytest
from vulca.pipeline.node import NodeContext, PipelineNode
@@ -100,6 +103,174 @@ async def test_mock_different_rounds(self):
r2 = await node.run(ctx2)
assert r1["candidate_id"] != r2["candidate_id"]
+ def test_mock_generate_suppresses_sample_id_text(self):
+ node = GenerateNode()
+ ctx = NodeContext(subject="track1_0301", intent="draw branching lines")
+
+ result = node._mock_generate(ctx)
+ svg = base64.b64decode(result["image_b64"]).decode()
+
+ assert "track1_0301" not in svg
+
+ def test_generate_node_puts_content_lock_before_cultural_guidance(self):
+ from vulca.content_lock import extract_content_lock
+ from vulca.providers.base import ImageResult
+
+ class CapturingProvider:
+ def __init__(self):
+ self.prompts = []
+
+ async def generate(self, prompt, **kwargs):
+ self.prompts.append(prompt)
+ self.kwargs = kwargs
+ return ImageResult(
+ image_b64="iVBORw0KGgo=",
+ mime="image/png",
+ metadata={"candidate_id": "captured"},
+ )
+
+ intent = (
+ "Ink and wash painting of delicate bamboo and orchid grasses beside "
+ "vertical Chinese calligraphy and red seals on aged paper."
+ )
+ provider = CapturingProvider()
+ lock = extract_content_lock(intent)
+ output = asyncio.run(
+ execute(
+ FAST,
+ PipelineInput(
+ subject="track1_0002",
+ intent=intent,
+ tradition="chinese_xieyi",
+ provider="gemini",
+ image_provider=provider,
+ max_rounds=1,
+ node_params={"generate": {"content_lock": lock.to_dict()}},
+ ),
+ )
+ )
+
+ assert output.status == "completed"
+ prompt = provider.prompts[0]
+ assert prompt.index("NON-NEGOTIABLE CONTENT REQUIREMENTS") < prompt.index(intent)
+ assert "Do not replace these subjects with mountains" in prompt
+
+ def test_generate_node_does_not_send_sample_id_as_provider_subject_with_content_lock(self):
+ from vulca.content_lock import extract_content_lock
+ from vulca.providers.base import ImageResult
+
+ class CapturingProvider:
+ async def generate(self, prompt, **kwargs):
+ self.prompt = prompt
+ self.kwargs = kwargs
+ return ImageResult(
+ image_b64="iVBORw0KGgo=",
+ mime="image/png",
+ metadata={"candidate_id": "captured"},
+ )
+
+ intent = (
+ "Abstract hand-drawn branching lines fill a rectangular frame on graph "
+ "paper in monochrome pencil style."
+ )
+ provider = CapturingProvider()
+ lock = extract_content_lock(intent)
+ output = asyncio.run(
+ execute(
+ FAST,
+ PipelineInput(
+ subject="track1_0301",
+ intent=intent,
+ tradition="default",
+ provider="gemini",
+ image_provider=provider,
+ max_rounds=1,
+ node_params={"generate": {"content_lock": lock.to_dict()}},
+ ),
+ )
+ )
+
+ assert output.status == "completed"
+ assert provider.kwargs["subject"] == ""
+ assert "track1_0301" not in provider.prompt
+
+ def test_generate_node_puts_artifact_boundary_before_content_requirements(self):
+ from vulca.content_lock import extract_content_lock
+ from vulca.providers.base import ImageResult
+
+ class CapturingProvider:
+ async def generate(self, prompt, **kwargs):
+ self.prompt = prompt
+ return ImageResult(
+ image_b64="iVBORw0KGgo=",
+ mime="image/png",
+ metadata={"candidate_id": "captured"},
+ )
+
+ intent = "Socialist Realism propaganda poster with workers and red banners."
+ provider = CapturingProvider()
+ lock = extract_content_lock(intent)
+ output = asyncio.run(
+ execute(
+ FAST,
+ PipelineInput(
+ subject="track1_0151",
+ intent=intent,
+ tradition="default",
+ provider="gemini",
+ image_provider=provider,
+ max_rounds=1,
+ node_params={"generate": {"content_lock": lock.to_dict()}},
+ ),
+ )
+ )
+
+ assert output.status == "completed"
+ assert provider.prompt.index("ARTIFACT BOUNDARY REQUIREMENT") < provider.prompt.index(
+ "NON-NEGOTIABLE CONTENT REQUIREMENTS"
+ )
+ assert "flat, front-facing propaganda poster artwork" in provider.prompt
+
+ def test_generate_node_puts_relation_semantics_before_user_intent(self):
+ from vulca.content_lock import extract_content_lock
+ from vulca.providers.base import ImageResult
+
+ class CapturingProvider:
+ async def generate(self, prompt, **kwargs):
+ self.prompt = prompt
+ return ImageResult(
+ image_b64="iVBORw0KGgo=",
+ mime="image/png",
+ metadata={"candidate_id": "captured"},
+ )
+
+ intent = (
+ "Wartime illustration of mounted soldiers beside fleeing civilians, "
+ "burning village ruins, and aircraft overhead."
+ )
+ provider = CapturingProvider()
+ lock = extract_content_lock(intent)
+ output = asyncio.run(
+ execute(
+ FAST,
+ PipelineInput(
+ subject="track1_0747",
+ intent=intent,
+ tradition="default",
+ provider="gemini",
+ image_provider=provider,
+ max_rounds=1,
+ node_params={"generate": {"content_lock": lock.to_dict()}},
+ ),
+ )
+ )
+
+ assert output.status == "completed"
+ assert provider.prompt.index("RELATION SEMANTICS REQUIREMENTS") < provider.prompt.index(
+ "USER INTENT TO PRESERVE VERBATIM"
+ )
+ assert "soldiers chasing civilians" in provider.prompt
+
# ── EvaluateNode ────────────────────────────────────────────────────
diff --git a/tests/test_vlm_prompt.py b/tests/test_vlm_prompt.py
index d7b0e9f7..b655b53b 100644
--- a/tests/test_vlm_prompt.py
+++ b/tests/test_vlm_prompt.py
@@ -4,6 +4,7 @@
import pytest
from vulca._vlm import (
+ _CONTENT_LOCK_MAX_TOKENS,
_DEFAULT_MAX_TOKENS,
_ESCALATED_MAX_TOKENS,
_STATIC_SCORING_PREFIX,
@@ -236,3 +237,90 @@ def test_score_image_no_double_escalation(self):
# _MAX_ESCALATION_ATTEMPTS=1 means at most 2 calls total
assert mock_acompletion.call_count == 2
+
+ def test_score_image_content_lock_gets_final_large_budget_on_second_truncation(self):
+ truncated = _make_mock_response("length", _VALID_SCORING_JSON)
+ full_resp = _make_mock_response("stop", _VALID_SCORING_JSON)
+ mock_acompletion = AsyncMock(side_effect=[truncated, truncated, full_resp])
+
+ with patch("litellm.acompletion", mock_acompletion):
+ result = asyncio.run(
+ score_image(
+ img_b64="aGVsbG8=",
+ mime="image/png",
+ subject="test artwork",
+ tradition="chinese_xieyi",
+ api_key="test-key",
+ content_lock={"original_intent": "test artwork"},
+ )
+ )
+
+ assert mock_acompletion.call_count == 3
+ assert [call.kwargs["max_tokens"] for call in mock_acompletion.call_args_list] == [
+ _DEFAULT_MAX_TOKENS,
+ _ESCALATED_MAX_TOKENS,
+ _CONTENT_LOCK_MAX_TOKENS,
+ ]
+ assert result.get("L1") == pytest.approx(0.8)
+
+ def test_score_image_adds_blind_relation_gate_for_required_relations(self):
+ from vulca.content_lock import extract_content_lock
+
+ lock = extract_content_lock(
+ "Wartime illustration of mounted soldiers beside fleeing civilians, "
+ "burning village ruins, and aircraft overhead."
+ )
+ scoring_json = (
+ '{"L1": 0.9, "L1_rationale": "ok", "L1_suggestion": "try", '
+ '"L1_deviation_type": "traditional", "L1_observations": "", "L1_reference_technique": "", '
+ '"L2": 0.9, "L2_rationale": "ok", "L2_suggestion": "try", '
+ '"L2_deviation_type": "traditional", "L2_observations": "", "L2_reference_technique": "", '
+ '"L3": 0.9, "L3_rationale": "ok", "L3_suggestion": "try", '
+ '"L3_deviation_type": "traditional", "L3_observations": "", "L3_reference_technique": "", '
+ '"L4": 0.9, "L4_rationale": "ok", "L4_suggestion": "try", '
+ '"L4_deviation_type": "traditional", "L4_observations": "", "L4_reference_technique": "", '
+ '"L5": 0.9, "L5_rationale": "ok", "L5_suggestion": "try", '
+ '"L5_deviation_type": "traditional", "L5_observations": "", "L5_reference_technique": "", '
+ '"missing_required_subjects": [], '
+ '"missing_required_text_elements": [], '
+ '"missing_required_surface": [], '
+ '"missing_required_style_attributes": [], '
+ '"apparent_relations": ["caption-conditioned escort"], '
+ '"relation_semantics_failed": false, '
+ '"forbidden_readings_present": [], '
+ '"forbidden_visual_artifacts": [], '
+ '"unwanted_visible_text": false, '
+ '"output_is_artwork_itself": true, '
+ '"risk_flags": []}'
+ )
+ blind_json = (
+ '{"visible_entities": ["mounted soldiers", "civilians"], '
+ '"primary_reading": "Mounted soldiers appear to chase fleeing civilians.", '
+ '"apparent_relations": ["mounted soldiers chasing civilians"], '
+ '"threat_cues": [], '
+ '"safety_cues": [], '
+ '"ambiguous_readings": [], '
+ '"confidence": 0.82}'
+ )
+ normal_resp = _make_mock_response("stop", scoring_json)
+ blind_resp = _make_mock_response("stop", blind_json)
+ mock_acompletion = AsyncMock(side_effect=[normal_resp, blind_resp])
+
+ with patch("litellm.acompletion", mock_acompletion):
+ result = asyncio.run(
+ score_image(
+ img_b64="aGVsbG8=",
+ mime="image/png",
+ subject="track1_0747",
+ tradition="default",
+ api_key="test-key",
+ content_lock=lock.to_dict(),
+ )
+ )
+
+ assert mock_acompletion.call_count == 2
+ gate = result["content_fidelity_gate"]
+ assert gate["blind_relation_decision"] == "reject"
+ assert gate["blind_forbidden_readings_present"] == [
+ "soldiers chasing civilians"
+ ]