feat(imagegen): support reference image for image-to-image (gpt-image-2) #14
Merged
Closes #12 (initial scope: OpenAI gpt-image-1/2).

Adds image_url to the ImageGen tool schema + flow:
- Accepts http(s) URL, data URI, or local file path (auto-resolved, base64-encoded, capped at 4 MB).
- When image_url is set, routes the call to /v1/images/image2image instead of /v1/images/generations and forces an edit-capable model. Default for ref-image mode is openai/gpt-image-2.
- Skips the AskUser proposal flow in ref-image mode (the media router doesn't yet know which models do img-to-img).
- Decodes data: URIs locally instead of fetching, since gpt-image-2's edit endpoint can return base64.

google/nano-banana-pro and xai/grok-imagine-image-pro intentionally NOT in the supported set yet; they need gateway-side editImage() branches that are out of scope for this PR.

Tests: +4 cases for resolveReferenceImage (URL/data/file pass-throughs, relative path, oversize cap, bad ext) and the EDIT_SUPPORTED_MODELS gate. Total 166/166 pass; build clean.
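The routing rule described above can be sketched roughly as follows. This is an illustration only, not the code in this PR: `classifyReference`, `routeImageCall`, and the `openai/gpt-image-1` identifier are assumed names, and "forces an edit-capable model" is interpreted here as "keep the requested model if it is in the supported set, otherwise fall back to the default". Only the two endpoint paths and the `openai/gpt-image-2` default come from the description.

```typescript
// Hypothetical sketch of the ref-image routing; names are assumptions.
const GENERATE_ENDPOINT = "/v1/images/generations";
const EDIT_ENDPOINT = "/v1/images/image2image";
const DEFAULT_EDIT_MODEL = "openai/gpt-image-2";

// Assumed contents: the PR only says the gate covers OpenAI gpt-image-1/2.
const EDIT_SUPPORTED_MODELS: string[] = ["openai/gpt-image-1", "openai/gpt-image-2"];

type RefKind = "url" | "data" | "file";

// Decide how a reference image string should be resolved.
function classifyReference(input: string): RefKind {
  if (/^https?:\/\//i.test(input)) return "url";
  if (/^data:image\//i.test(input)) return "data";
  return "file"; // anything else is treated as a local path
}

interface Route {
  endpoint: string;
  model: string | undefined; // undefined = let the normal flow pick a model
}

// Pick the endpoint and model for a call, given an optional reference image.
function routeImageCall(requestedModel: string | undefined, imageUrl?: string): Route {
  if (!imageUrl) {
    return { endpoint: GENERATE_ENDPOINT, model: requestedModel };
  }
  const supported =
    requestedModel !== undefined && EDIT_SUPPORTED_MODELS.indexOf(requestedModel) !== -1;
  return {
    endpoint: EDIT_ENDPOINT,
    model: supported ? requestedModel : DEFAULT_EDIT_MODEL,
  };
}
```

Under these assumptions, a call with `image_url` set and an unsupported model such as `google/nano-banana-pro` would be redirected to `openai/gpt-image-2`; the real tool may instead reject the call.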
Gateway's /v1/images/image2image schema validates `image` against /^data:image\//, so passing a raw URL through (the original draft) hits a 400 instead of reaching the upstream.

Make resolveReferenceImage async and fetch URLs into a base64 data URI client-side, with:
- 30s fetch timeout
- Content-Type must start with image/
- Same 4 MB cap as local files
- Clean error surfaces for non-2xx responses

Also flips the existing tests to await the now-async helper and adds a URL-fetch test against an in-process http server (positive case + non-image content-type rejection + 404 propagation).

167/167 tests pass.
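A minimal sketch of the URL branch, under the constraints listed above. The helper names (`checkFetchedImage`, `bytesToBase64`, `fetchAsDataUri`) and the error messages are hypothetical; the actual resolveReferenceImage may be structured differently. It assumes a runtime with global `fetch`, `AbortController`, and `btoa` (for example Node 18+).

```typescript
const FETCH_TIMEOUT_MS = 30000; // 30s fetch timeout
const MAX_REF_BYTES = 4 * 1024 * 1024; // same 4 MB cap as local files

// Pure validation step, kept separate so it can be tested without a network.
// Returns the bare MIME type (parameters such as charset are dropped).
function checkFetchedImage(status: number, contentType: string | null, byteLength: number): string {
  if (status < 200 || status >= 300) {
    throw new Error("Reference image fetch failed: HTTP " + status);
  }
  const mime = (contentType || "").split(";")[0].trim().toLowerCase();
  if (!/^image\//.test(mime)) {
    throw new Error("Reference URL did not return an image (Content-Type: " + contentType + ")");
  }
  if (byteLength > MAX_REF_BYTES) {
    throw new Error("Reference image is " + byteLength + " bytes; the limit is " + MAX_REF_BYTES);
  }
  return mime;
}

// Encode raw bytes as base64 using only globals available in browsers and Node.
function bytesToBase64(bytes: Uint8Array): string {
  let binary = "";
  for (let i = 0; i < bytes.length; i++) {
    binary += String.fromCharCode(bytes[i]);
  }
  return btoa(binary);
}

// Fetch a URL and turn the body into a data URI that matches /^data:image\//.
async function fetchAsDataUri(url: string): Promise<string> {
  const controller = new AbortController();
  const timer = setTimeout(() => controller.abort(), FETCH_TIMEOUT_MS);
  try {
    const res = await fetch(url, { signal: controller.signal });
    const bytes = new Uint8Array(await res.arrayBuffer());
    const mime = checkFetchedImage(res.status, res.headers.get("content-type"), bytes.byteLength);
    return "data:" + mime + ";base64," + bytesToBase64(bytes);
  } finally {
    clearTimeout(timer); // always release the timer, even on failure
  }
}
```

This sketch reads the full body before checking the size; a stricter variant could check `Content-Length` first or abort while streaming once the cap is exceeded.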
Closes #12 (initial scope — OpenAI gpt-image-1/2). Gemini Nano Banana Pro and Grok Imagine Image Pro intentionally deferred until the gateway can route them; this PR is just the client side for the OpenAI edit endpoint that already works.
Summary
Out of scope (deferred)
Test plan