Conversation
Made-with: Cursor
…tant
When myvector_construct('[...]') is used as a constant (e.g. nested in
myvector_distance), MySQL evaluates it per row. This change detects constant
args in _init, converts once, and returns the cached result in _construct.
- Add myvector_construct_do_convert() helper for shared conversion logic
- In _init: if args[0] is non-NULL, convert once and store in ptr
- In _construct: if cache present, return immediately (no per-row parsing)
- Handles string->float, i=float/o=float, and o=bv paths
Made-with: Cursor
- Builds Docker image with plugin from this branch (includes fix) - Runs benchmark (500 rows, 768 dim) comparing nested vs literal - Uploads results as artifact - Pushes image to ghcr.io on push to fix/issue-79 - Does NOT run on main Made-with: Cursor
There was a problem hiding this comment.
💡 Codex Review
Here are some automated review suggestions for this pull request.
Reviewed commit: 2dd01ab913
ℹ️ About Codex in GitHub
Your team has set up Codex to review pull requests in this repo. Reviews are triggered when you
- Open a pull request for review
- Mark a draft as ready
- Comment "@codex review".
If Codex has suggestions, it will comment; otherwise it will react with 👍.
Codex can also answer questions or update the PR. Try commenting "@codex address that feedback".
| *(size_t*)buf = 0; // 0 = no cache | ||
|
|
||
| /* Issue #79: when first arg is constant, convert once and cache */ | ||
| if (args->args[0] != nullptr && args->lengths[0] > 0) { |
There was a problem hiding this comment.
Cache construct output only when options arg is constant
This cache is populated whenever the first argument is constant, but myvector_construct() returns the cached bytes before it inspects argument 2, so queries where arg1 is literal and arg2 is row-dependent (for example, varying o=float vs o=bv) will return the same precomputed value for every row. In that case the init-time conversion is done with whatever option is visible in init (often none), and per-row options are silently ignored, which produces incorrect serialized vectors.
Useful? React with 👍 / 👎.
|
|
||
| - name: Prepare Docker build context | ||
| run: | | ||
| cp myvector-amd64.so . |
There was a problem hiding this comment.
Remove self-copy of plugin .so in benchmark workflow
The previous step already writes myvector-amd64.so into the repository root, so cp myvector-amd64.so . copies the file onto itself; GNU cp reports this as an error ('file' and './file' are the same file) and exits non-zero. Because GitHub Actions runs run steps with bash -e, this aborts the job before Docker build and benchmark execution.
Useful? React with 👍 / 👎.
There was a problem hiding this comment.
Cursor Bugbot has reviewed your changes and found 1 potential issue.
Bugbot Autofix is OFF. To automatically fix reported issues with cloud agents, enable autofix in the Cursor dashboard.
| &err) && | ||
| !err) { | ||
| *(size_t*)buf = result_len; | ||
| } |
There was a problem hiding this comment.
Cache ignores non-constant second argument's per-row value
Medium Severity
The caching in myvector_construct_init only checks whether args->args[0] is constant but doesn't verify that args->args[1] (the options argument) is also constant when arg_count >= 2. If the first argument is a constant string but the second argument comes from a column or expression, the init caches the result using default options ("i=string,o=float"). The per-row myvector_construct function then unconditionally returns this cached result, silently ignoring the actual per-row options value. This produces incorrect conversions whenever the runtime options differ from the default.
Additional Locations (1)
- Cache only when ALL args are constant; skip cache if args[1] varies per row (Codex P1, Cursor Bugbot) - Remove redundant cp myvector-amd64.so . in benchmark workflow (Codex P2 - file already at repo root from build step) Made-with: Cursor


Pull Request
Description
Fixes #
Type of Change
Changes Made
Testing Done
Checklist
[Unreleased]Performance Impact
Screenshots / Logs
Additional Notes
Note
Medium Risk
Changes
myvector_constructUDF memory layout and execution path by caching converted constant inputs, which could affect correctness for edge-case inputs/options and increases sensitivity to buffer/length handling. Added CI benchmarking workflow/scripts are low risk but introduce Docker build/push steps on thefix/issue-79branch.Overview
Improves issue #79 performance by refactoring
myvector_constructconversion into a helper and caching the converted result inmyvector_construct_initwhen the first argument is constant, so nestedmyvector_construct(...)inside queries avoids per-row re-parsing.Adds an
issue #79benchmark harness (scripts/benchmark-issue79.sh), baseline documentation, and a branch-gated GitHub Actions workflow to build a MySQL 8.4 plugin/Docker image, run the benchmark, upload results, and optionally push the image. Updates.gitignorefor new build/cache directories.Written by Cursor Bugbot for commit 2dd01ab. This will update automatically on new commits. Configure here.