Skip to content

Fix/issue 79#82

Merged
askdba merged 4 commits intomainfrom
fix/issue-79
Mar 13, 2026
Merged

Fix/issue 79#82
askdba merged 4 commits intomainfrom
fix/issue-79

Conversation

@askdba
Copy link
Owner

@askdba askdba commented Mar 13, 2026

Pull Request

Description

Fixes #

Type of Change

  • 🐛 bugfix (non-breaking change that fixes an issue)
  • ✨ New feature (non-breaking change that adds functionality)
  • 💥 Breaking change (fix or feature that would cause existing functionality to change)
  • 📚 Documentation update
  • 🔧 Maintenance (refactoring, dependencies, etc.)

Changes Made

  • TODO
  • TODO
  • TODO

Testing Done

  • Built plugin successfully on MySQL 8.0.x
  • Built plugin successfully on MySQL 9.0.x
  • Ran demo/stanford50d tests
  • Added new tests for new functionality
  • Tested with online binlog updates (if applicable)

Checklist

  • My code follows the project's style guidelines
  • I have updated the documentation (README.md, DEMO.md, code comments)
  • I have added an entry to CHANGELOG.md under [Unreleased]
  • My changes don't introduce new compiler warnings
  • I have tested my changes locally

Performance Impact

  • No performance impact
  • Performance improvement (describe):
  • Potential performance regression (justify):

Screenshots / Logs

Additional Notes


Note

Medium Risk
Changes myvector_construct UDF memory layout and execution path by caching converted constant inputs, which could affect correctness for edge-case inputs/options and increases sensitivity to buffer/length handling. Added CI benchmarking workflow/scripts are low risk but introduce Docker build/push steps on the fix/issue-79 branch.

Overview
Improves issue #79 performance by refactoring myvector_construct conversion into a helper and caching the converted result in myvector_construct_init when the first argument is constant, so nested myvector_construct(...) inside queries avoids per-row re-parsing.

Adds an issue #79 benchmark harness (scripts/benchmark-issue79.sh), baseline documentation, and a branch-gated GitHub Actions workflow to build a MySQL 8.4 plugin/Docker image, run the benchmark, upload results, and optionally push the image. Updates .gitignore for new build/cache directories.

Written by Cursor Bugbot for commit 2dd01ab. This will update automatically on new commits. Configure here.

askdba added 4 commits March 13, 2026 23:55
…tant

When myvector_construct('[...]') is used as a constant (e.g. nested in
myvector_distance), MySQL evaluates it per row. This change detects constant
args in _init, converts once, and returns the cached result in _construct.

- Add myvector_construct_do_convert() helper for shared conversion logic
- In _init: if args[0] is non-NULL, convert once and store in ptr
- In _construct: if cache present, return immediately (no per-row parsing)
- Handles string->float, i=float/o=float, and o=bv paths

Made-with: Cursor
- Builds Docker image with plugin from this branch (includes fix)
- Runs benchmark (500 rows, 768 dim) comparing nested vs literal
- Uploads results as artifact
- Pushes image to ghcr.io on push to fix/issue-79
- Does NOT run on main

Made-with: Cursor
Copy link

@chatgpt-codex-connector chatgpt-codex-connector bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: 2dd01ab913

ℹ️ About Codex in GitHub

Your team has set up Codex to review pull requests in this repo. Reviews are triggered when you

  • Open a pull request for review
  • Mark a draft as ready
  • Comment "@codex review".

If Codex has suggestions, it will comment; otherwise it will react with 👍.

Codex can also answer questions or update the PR. Try commenting "@codex address that feedback".

*(size_t*)buf = 0; // 0 = no cache

/* Issue #79: when first arg is constant, convert once and cache */
if (args->args[0] != nullptr && args->lengths[0] > 0) {

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

P1 Badge Cache construct output only when options arg is constant

This cache is populated whenever the first argument is constant, but myvector_construct() returns the cached bytes before it inspects argument 2, so queries where arg1 is literal and arg2 is row-dependent (for example, varying o=float vs o=bv) will return the same precomputed value for every row. In that case the init-time conversion is done with whatever option is visible in init (often none), and per-row options are silently ignored, which produces incorrect serialized vectors.

Useful? React with 👍 / 👎.


- name: Prepare Docker build context
run: |
cp myvector-amd64.so .

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

P2 Badge Remove self-copy of plugin .so in benchmark workflow

The previous step already writes myvector-amd64.so into the repository root, so cp myvector-amd64.so . copies the file onto itself; GNU cp reports this as an error ('file' and './file' are the same file) and exits non-zero. Because GitHub Actions runs run steps with bash -e, this aborts the job before Docker build and benchmark execution.

Useful? React with 👍 / 👎.

Copy link

@cursor cursor bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Cursor Bugbot has reviewed your changes and found 1 potential issue.

Fix All in Cursor

Bugbot Autofix is OFF. To automatically fix reported issues with cloud agents, enable autofix in the Cursor dashboard.

&err) &&
!err) {
*(size_t*)buf = result_len;
}
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Cache ignores non-constant second argument's per-row value

Medium Severity

The caching in myvector_construct_init only checks whether args->args[0] is constant but doesn't verify that args->args[1] (the options argument) is also constant when arg_count >= 2. If the first argument is a constant string but the second argument comes from a column or expression, the init caches the result using default options ("i=string,o=float"). The per-row myvector_construct function then unconditionally returns this cached result, silently ignoring the actual per-row options value. This produces incorrect conversions whenever the runtime options differ from the default.

Additional Locations (1)
Fix in Cursor Fix in Web

askdba added a commit that referenced this pull request Mar 13, 2026
- Cache only when ALL args are constant; skip cache if args[1] varies per row
  (Codex P1, Cursor Bugbot)
- Remove redundant cp myvector-amd64.so . in benchmark workflow
  (Codex P2 - file already at repo root from build step)

Made-with: Cursor
@askdba askdba merged commit 2dd01ab into main Mar 13, 2026
11 of 26 checks passed
@askdba askdba deleted the fix/issue-79 branch March 13, 2026 22:25
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant