feat(imagegen): support reference image for image-to-image (gpt-image-2) #14

Merged
1bcMax merged 2 commits into main from feat/imagegen-reference-image on Apr 25, 2026

1bcMax commented Apr 25, 2026

Closes #12 (initial scope — OpenAI gpt-image-1/2). Gemini Nano Banana Pro and Grok Imagine Image Pro intentionally deferred until the gateway can route them; this PR is just the client side for the OpenAI edit endpoint that already works.

Summary

  • Adds `image_url` to the `ImageGen` tool schema, mirroring `VideoGen`. Accepts http(s) URLs, data URIs, or local file paths; local paths are read, base64-encoded, capped at 4 MB.
  • When `image_url` is set, the tool routes to `/v1/images/image2image` instead of `/v1/images/generations` and gates the model to the edit-capable set (`openai/gpt-image-1`, `openai/gpt-image-2`). Default for ref-image mode is `openai/gpt-image-2`.
  • Skips the AskUser proposal flow in ref-image mode — the media router doesn't yet know which models do image-to-image, and its suggestions would frequently be unusable.
  • Decodes `data:` URIs locally on response (gpt-image-2's edit endpoint returns base64 in `url`), saving a round-trip and dodging `fetch()` data-URI quirks.
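
The routing and model gate described in the summary can be sketched roughly as follows. This is a minimal illustration, not the code from the diff: the endpoint paths, model IDs and the `EDIT_SUPPORTED_MODELS` name come from the PR text, while `routeImageGen`, `ImageGenInput`, `RoutedRequest` and `DEFAULT_EDIT_MODEL` are hypothetical names. It also assumes an unsupported model is rejected with an error; the PR only says the model is "gated", so the real tool may instead override it.

```typescript
// Edit-capable models named in the PR description.
const EDIT_SUPPORTED_MODELS: ReadonlySet<string> = new Set([
  "openai/gpt-image-1",
  "openai/gpt-image-2",
]);

// Hypothetical constant name; the default value itself is from the PR.
const DEFAULT_EDIT_MODEL = "openai/gpt-image-2";

// Hypothetical shapes for illustration only.
interface ImageGenInput {
  prompt: string;
  model?: string;
  image_url?: string;
}

interface RoutedRequest {
  endpoint: string;
  model: string | undefined;
}

function routeImageGen(input: ImageGenInput): RoutedRequest {
  if (input.image_url === undefined) {
    // Text-to-image: existing endpoint, model choice left untouched.
    return { endpoint: "/v1/images/generations", model: input.model };
  }
  // Reference-image mode: fall back to the default edit model, then gate.
  const model = input.model ?? DEFAULT_EDIT_MODEL;
  if (!EDIT_SUPPORTED_MODELS.has(model)) {
    // Assumption: reject rather than silently swap the model.
    const allowed = Array.from(EDIT_SUPPORTED_MODELS).join(", ");
    throw new Error(`Model ${model} cannot take a reference image; use one of: ${allowed}`);
  }
  return { endpoint: "/v1/images/image2image", model };
}
```

Keeping the gate in one place means that adding `google/nano-banana-pro` or `xai/grok-imagine-image-pro` later would only require extending the set once the gateway can route them.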

Out of scope (deferred)

  • `google/nano-banana-pro` and `xai/grok-imagine-image-pro` need gateway `editImage()` branches that don't exist on `main` yet. The client `EDIT_SUPPORTED_MODELS` set comments mention this.
  • Mask / inpainting parameters (only OpenAI supports them anyway, and the issue reporter suggested skipping them to limit v1 scope).
  • Cost-aware media-router suggestions for ref-image mode.

Test plan

  • `npm run build` clean
  • `npm test` — 166/166 pass (+4 new cases: `resolveReferenceImage` URL/data/file pass-throughs, relative path resolution, oversize / bad-ext rejection, `EDIT_SUPPORTED_MODELS` gate)
  • Live: `franklin` → `/model sonnet` → `Read /path/to/reference.png` then `ImageGen({ prompt: '...style transfer...', image_url: '/path/to/reference.png' })` → image saved locally, retains reference style/character
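
For readers who want a concrete picture of the `resolveReferenceImage` cases listed in the test plan, here is a rough sketch of the local-file branch as it stood after the first commit (URLs and data URIs passed through unchanged). The function name `resolveLocalReference`, the extension table and the reading of "4 MB" as 4 × 1024 × 1024 bytes are assumptions; the PR does not spell them out.

```typescript
import * as fs from "node:fs";
import * as path from "node:path";

// Assumption: "4 MB" is read here as 4 MiB.
const MAX_REFERENCE_BYTES = 4 * 1024 * 1024;

// Assumed extension-to-MIME table; the accepted extensions are not listed in the PR.
const MIME_BY_EXT: Record<string, string> = {
  ".png": "image/png",
  ".jpg": "image/jpeg",
  ".jpeg": "image/jpeg",
  ".webp": "image/webp",
};

// Hypothetical helper: turns a local path into a base64 data URI.
function resolveLocalReference(input: string, cwd: string): string {
  // http(s) URLs and data URIs are passed through untouched in this sketch.
  if (/^(https?:|data:)/i.test(input)) return input;

  // Relative paths are resolved against the supplied working directory.
  const abs = path.resolve(cwd, input);
  const ext = path.extname(abs).toLowerCase();
  const mime = MIME_BY_EXT[ext];
  if (mime === undefined) {
    throw new Error(`Unsupported reference image extension: ${ext || "(none)"}`);
  }

  // Check the size before reading so oversize files are never loaded.
  const size = fs.statSync(abs).size;
  if (size > MAX_REFERENCE_BYTES) {
    throw new Error(`Reference image is ${size} bytes; the cap is ${MAX_REFERENCE_BYTES}`);
  }

  return `data:${mime};base64,${fs.readFileSync(abs).toString("base64")}`;
}
```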

1bcMax added 2 commits April 24, 2026 23:30
Closes #12 (initial scope — OpenAI gpt-image-1/2).

Adds image_url to the ImageGen tool schema + flow:
- Accepts http(s) URL, data URI, or local file path (auto-resolved,
  base64-encoded, capped at 4 MB).
- When image_url is set, routes the call to /v1/images/image2image
  instead of /v1/images/generations and forces an edit-capable model.
  Default for ref-image mode is openai/gpt-image-2.
- Skips the AskUser proposal flow in ref-image mode (the media router
  doesn't yet know which models do img-to-img).
- Decodes data: URIs locally instead of fetching, since gpt-image-2's
  edit endpoint can return base64.
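
Decoding a `data:` URI locally, as the last bullet describes, might look something like the sketch below. `decodeDataUri` and `DecodedImage` are hypothetical names, and the sketch only handles base64 payloads, which is the case the commit message mentions.

```typescript
import { Buffer } from "node:buffer";

// Hypothetical result shape for illustration.
interface DecodedImage {
  mimeType: string;
  bytes: Buffer;
}

// Returns the decoded payload for a base64 data URI, or null for anything
// else (for example an https URL that still needs a normal fetch).
function decodeDataUri(url: string): DecodedImage | null {
  if (!url.startsWith("data:")) return null;
  const comma = url.indexOf(",");
  if (comma === -1) return null;

  // Header looks like "image/png;base64"; the last parameter marks the encoding.
  const parts = url.slice("data:".length, comma).split(";");
  if (parts[parts.length - 1] !== "base64") return null;

  return {
    // RFC 2397 defaults the media type to text/plain when it is omitted.
    mimeType: parts[0] || "text/plain",
    bytes: Buffer.from(url.slice(comma + 1), "base64"),
  };
}
```

A caller would try this first and fall back to a network fetch only when it returns `null`.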

google/nano-banana-pro and xai/grok-imagine-image-pro intentionally NOT
in the supported set yet — they need gateway-side editImage() branches
that are out of scope for this PR.

Tests: +4 cases for resolveReferenceImage (URL/data/file pass-throughs,
relative path, oversize cap, bad ext) and the EDIT_SUPPORTED_MODELS gate.
Total 166/166 pass; build clean.

Gateway's /v1/images/image2image schema validates `image` against
/^data:image\//, so passing a raw URL through (the original draft) hits
a 400 instead of reaching the upstream. Make resolveReferenceImage
async and fetch URLs into a base64 data URI client-side, with:

- 30s fetch timeout
- Content-Type must start with image/
- Same 4 MB cap as local files
- Clean error surfaces for non-2xx responses
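
As a rough illustration of the four constraints above, the URL branch could be structured like this. The names `checkImageResponse` and `fetchAsDataUri` are hypothetical, and so is the 4 MiB reading of the cap. The validation is split into a pure function so it can be tested without a network, which is a choice of this sketch rather than something the commit states. For brevity the sketch reads the whole body before checking its size; a stricter version would check the status and headers first and stream the body with a running byte count.

```typescript
import { Buffer } from "node:buffer";

const MAX_REFERENCE_BYTES = 4 * 1024 * 1024; // assumption: 4 MB read as 4 MiB
const FETCH_TIMEOUT_MS = 30 * 1000; // 30s timeout from the commit message

// Pure validation: returns the bare MIME type or throws a readable error.
function checkImageResponse(status: number, contentType: string | null, byteLength: number): string {
  if (status < 200 || status > 299) {
    throw new Error(`Reference image fetch failed: HTTP ${status}`);
  }
  // Drop parameters such as "; charset=..." before checking the type.
  const raw = contentType ?? "";
  const semi = raw.indexOf(";");
  const mime = (semi === -1 ? raw : raw.slice(0, semi)).trim().toLowerCase();
  if (!mime.startsWith("image/")) {
    throw new Error(`Reference URL is not an image (Content-Type: ${raw || "missing"})`);
  }
  if (byteLength > MAX_REFERENCE_BYTES) {
    throw new Error(`Reference image is ${byteLength} bytes; the cap is ${MAX_REFERENCE_BYTES}`);
  }
  return mime;
}

// Fetches a URL and returns it as a base64 data URI, which is the form the
// gateway's `image` field accepts according to the commit message.
async function fetchAsDataUri(url: string): Promise<string> {
  const controller = new AbortController();
  const timer = setTimeout(() => controller.abort(), FETCH_TIMEOUT_MS);
  try {
    const res = await fetch(url, { signal: controller.signal });
    const bytes = Buffer.from(await res.arrayBuffer());
    const mime = checkImageResponse(res.status, res.headers.get("content-type"), bytes.length);
    return `data:${mime};base64,${bytes.toString("base64")}`;
  } finally {
    // Always clear the timer so it cannot keep the process alive.
    clearTimeout(timer);
  }
}
```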

Also flips the existing tests to await the now-async helper and adds a
URL-fetch test against an in-process http server (positive case +
non-image content-type rejection + 404 propagation).

167/167 tests pass.

1bcMax merged commit 0a9de98 into main on Apr 25, 2026
2 checks passed
1bcMax deleted the feat/imagegen-reference-image branch on April 25, 2026 03:38

Linked issue: feat(imagegen): support reference image for image-to-image (GPT Image 2 / Nano Banana / Grok Imagine Pro)