Component migration (8.4–9.6) and release workflow update#76
Component migration (8.4–9.6) and release workflow update#76
Conversation
…it rollback) - Add component build: myvector_component.cc, query rewriter, binlog service, UDF service - Binlog state: persist/load binlog_state.json (file, position, server UUID); refuse start on UUID mismatch - Init: UDF register then binlog start; rollback (deregister UDFs, release handle) on failure - CMake: require MYSQL_SOURCE_DIR; build-component.sh and pre-pr.sh - CI: build-component, test-component (INSTALL COMPONENT + UDF smoke), lint component_src - Release: build-and-release-component job; create-release needs component artifacts - Fix: fprintf format string in myvector_binlog_service.cc Co-authored-by: Cursor <cursoragent@cursor.com>
- Update Create Release step to use action-gh-release@v2 - No input or env changes required; v2 accepts same files and generate_release_notes Co-authored-by: Cursor <cursoragent@cursor.com>
There was a problem hiding this comment.
💡 Codex Review
Here are some automated review suggestions for this pull request.
Reviewed commit: a66f5fe832
ℹ️ About Codex in GitHub
Your team has set up Codex to review pull requests in this repo. Reviews are triggered when you
- Open a pull request for review
- Mark a draft as ready
- Comment "@codex review".
If Codex has suggestions, it will comment; otherwise it will react with 👍.
Codex can also answer questions or update the PR. Try commenting "@codex address that feedback".
| src/myvectorutils.cc | ||
| ) | ||
|
|
||
| add_library(myvector_component SHARED ${MYVECTOR_COMPONENT_SRCS}) |
There was a problem hiding this comment.
Restore a
myvector plugin build target
This commit replaces the old plugin-oriented MYSQL_ADD_PLUGIN(myvector, ...) flow with only add_library(myvector_component ...), but the CI plugin job still copies this CMakeLists.txt into mysql-server/plugin/myvector and runs make myvector (.github/workflows/ci.yml, build job). Without a myvector target, that path cannot produce the plugin artifact, so plugin CI and downstream plugin packaging regress.
Useful? React with 👍 / 👎.
| return 0.0; | ||
| } | ||
|
|
||
| double distance = HammingDistanceFn(v1_raw, v2_raw, (void*)&dim1_bytes); // Pass dimension in bits, function expects bytes |
There was a problem hiding this comment.
Pass bit length to
HammingDistanceFn
HammingDistanceFn consumes a bit-count quantity (it computes loop count as qty / (sizeof(unsigned long) * 8) in src/myvector.cc), but this code passes dim1_bytes, which is already divided by 8. That makes the UDF process only one-eighth of each binary vector and return systematically incorrect hamming distances for valid inputs.
Useful? React with 👍 / 👎.
| v2 = (FP32*)(v2_raw); | ||
| #endif | ||
|
|
||
| double distance = computeL2Distance(v1, v2, dim1); |
There was a problem hiding this comment.
Honor the metric argument in
myvector_distance
The component registers myvector_distance with up to 3 arguments, but the implementation always computes L2 and ignores args[2]. Calls like myvector_distance(v1, v2, 'cosine') or 'ip' now silently return L2 values, which corrupts query semantics for users relying on non-L2 metrics.
Useful? React with 👍 / 👎.
…ce metric, fix Hamming bit length Co-authored-by: Cursor <cursoragent@cursor.com>
… for UDF Co-authored-by: Cursor <cursoragent@cursor.com>
…rsion.h, my_config.h) Co-authored-by: Cursor <cursoragent@cursor.com>
Co-authored-by: Cursor <cursoragent@cursor.com>
Co-authored-by: Cursor <cursoragent@cursor.com>
Co-authored-by: Cursor <cursoragent@cursor.com>
Co-authored-by: Cursor <cursoragent@cursor.com>
…rintf, script cleanup - Install mysql_version.h stub into MySQL cache include/mysql/ so server headers (plugin.h) find it when building from tarball - Add noexcept to service base and derived destructors to fix exception specification override errors - Fix snprintf format: use ER_MYVECTOR_INDEX_NOT_FOUND with (%s) and col in UDF and plugin (removes -Wformat-extra-args) - Remove debug logging from build-component.sh - Include mysql_version.h, my_config.h, component_config.cc and other component migration changes Co-authored-by: Cursor <cursoragent@cursor.com>
Co-authored-by: Cursor <cursoragent@cursor.com>
…warnings - Clarify build-component.sh version examples and demo path notes in CONTRIBUTING.md - Fix format specifier (%zu) and string literal warnings in myvector.cc - Component, binlog, RFC, and build script updates; remove generated Makefile Co-authored-by: Cursor <cursoragent@cursor.com>
…own validators - Add chown step after docker build so boost_cache is writable by cache action - Re-enable VALIDATE_GITHUB_ACTIONS and VALIDATE_MARKDOWN in linter - Add .markdownlint.yaml and .github/actionlint.yaml for targeted exceptions Co-authored-by: Cursor <cursoragent@cursor.com>
…o defaults apply Co-authored-by: Cursor <cursoragent@cursor.com>
- actionlint: ignore shellcheck SC2015 (A && B || C pattern) - markdownlint: add ---, disable MD030 for list marker spaces - disable SHELL_SHFMT (pre-existing script style) Co-authored-by: Cursor <cursoragent@cursor.com>
- Rewrite A && B || C to proper if/fi in ci.yml (fixes shellcheck SC2015) - Add MD040 false for fenced code without language in .markdownlint.yaml Co-authored-by: Cursor <cursoragent@cursor.com>
Co-authored-by: Cursor <cursoragent@cursor.com>
Co-authored-by: Cursor <cursoragent@cursor.com>
- Trigger on workflow_run of MyVector CI or Release (remove pull_request) - Skip job when triggering workflow failed - Push images only when triggered by Release workflow Co-authored-by: Cursor <cursoragent@cursor.com>
…de and output - Run verification query into variable instead of pipeline - Explicitly check mysql_exit before testing output - Prevents false negatives when $MYSQL_CMD fails Co-authored-by: Cursor <cursoragent@cursor.com>
- Add libbinlogevents/include for MySQL 8.0 binlog headers - Use mysql::binlog::event namespace alias for 8.4+ (was binary_log in 8.0) - Include mysql/mysql.h and MYSQL_ERRMSG_SIZE fallback in UDF service - Generate version-specific mysql_version.h from mysql-tag (8.0.40->80040, etc.) so correct code paths are selected per MySQL version Co-authored-by: Cursor <cursoragent@cursor.com>
- Add <vector> to myvector.h (std::vector used in AbstractVectorIndex) - Add <mysql/plugin.h> before service_my_plugin_log in UDF service (MYSQL_PLUGIN) - Add <arpa/inet.h> for ntohs/htonl when using MySQL binlog headers on Unix Co-authored-by: Cursor <cursoragent@cursor.com>
Set myvector_feature_level=3 (bit 0 set) so start_binlog_monitoring returns early. The binlog thread requires config (host, user, socket) which is not available in CI. Basic UDF tests don't need binlog. Co-authored-by: Cursor <cursoragent@cursor.com>
…L 8.0) Co-authored-by: Cursor <cursoragent@cursor.com>
- Move BITS_PER_BYTE and SharedLockGuard to include/myvector.h for broader access and cleaner architecture - Add #include <vector> to myvector.h (missing include) - Improve SharedLockGuard: rename clear() to release(), add deleted copy/move, make constructor explicit - Fix: VectorIndexCollection::open() now acquires shared lock before returning (matching get() behavior), preventing UB when SharedLockGuard releases a lock never acquired on the open() path - Fix: myvector_ann_set_init moves lock guard creation after null check to avoid constructing guard with null index - Fix: improve ER_MYVECTOR_INDEX_NOT_FOUND message to include index name - Fix: myvector_display returns string literal instead of writing to UDF result buffer (fixes invalid memory access) - Fix: format string %d -> %zu for size_t colinfo.length() Co-authored-by: askdba <5901296+askdba@users.noreply.github.com>
…atibility The PR #76 introduces a macro to abstract UDF metadata service access, preparing for component builds where the service is accessed differently. This aligns with the component migration plan in docs/COMPONENT_MIGRATION_PLAN.md. Co-authored-by: askdba <5901296+askdba@users.noreply.github.com>
- SharedLockGuard: make release-only (no acquire in constructor) to avoid double-lock when used with get() which already acquires before returning - VectorIndexCollection::open(): acquire lockShared() before returning to match get() behavior; SharedLockGuard expects caller to hold lock - Keeps component build intact (PR #78 targeted main without component) Co-authored-by: Cursor <cursoragent@cursor.com>
…orkflow_run gating - build-component.sh: write mysql_version.h to project include/ so component build uses version-specific stub (project include/ is first in CMake path) - docker-publish.yml: use workflow_run.name instead of workflow_run.path for gating Docker push (path may be undocumented) Made-with: Cursor
Made-with: Cursor
Made-with: Cursor
…ewrite) Made-with: Cursor
Made-with: Cursor
…atibility Made-with: Cursor
| myvector.cc | ||
| myvectorutils.cc | ||
| ) | ||
| MYSQL_ADD_PLUGIN(myvector ${MYVECTOR_PLUGIN_SRCS}) |
There was a problem hiding this comment.
Missing MODULE_ONLY breaks in-tree plugin shared library build
High Severity
The MYSQL_ADD_PLUGIN(myvector ${MYVECTOR_PLUGIN_SRCS}) call is missing MODULE_ONLY, MODULE_OUTPUT_NAME "myvector", and LINK_LIBRARIES ${LIBMYSQLCLIENT} that were present in the old CMakeLists.txt. Without MODULE_ONLY, MySQL's CMake defaults to building the plugin as a static library rather than a shared .so module. This means myvector.so is never produced, breaking the CI build job, test job, and Docker publish workflow which all expect plugin_output_directory/myvector.so to exist.
Additional Locations (1)
… dnf in mysql image) Made-with: Cursor
|
@codex review |
There was a problem hiding this comment.
💡 Codex Review
Here are some automated review suggestions for this pull request.
Reviewed commit: b2bac5500d
ℹ️ About Codex in GitHub
Your team has set up Codex to review pull requests in this repo. Reviews are triggered when you
- Open a pull request for review
- Mark a draft as ready
- Comment "@codex review".
If Codex has suggestions, it will comment; otherwise it will react with 👍.
Codex can also answer questions or update the PR. Try commenting "@codex address that feedback".
| if (!vec || vecsz == 0) { | ||
| continue; /* skip row: no vector data */ | ||
| } |
There was a problem hiding this comment.
Exit row parse loop when vector payload is missing
When a row has no usable vector payload, this branch continues before the loop’s only index >= event_len termination check, so the next iteration can read event_buf[index] past the event buffer. In practice, inserts with NULL/empty vector values can make the parser spin on invalid memory or crash instead of safely skipping the row.
Useful? React with 👍 / 👎.
| // error_print("Binlog fetch failed: %s", err.c_str()); | ||
| // TODO: Replace with component-specific logging | ||
| break; | ||
| } |
There was a problem hiding this comment.
Signal worker shutdown before breaking binlog fetch loop
On non-timeout mysql_binlog_fetch errors, this break exits the fetch loop without notifying the queue/worker threads to stop. The function then joins worker_threads_, but those threads are blocked in gqueue_.dequeue() waiting for items, so the binlog thread can block indefinitely and incremental index updates stop until an external stop path forces shutdown.
Useful? React with 👍 / 👎.
Made-with: Cursor
| --suppress=missingIncludeSystem \ | ||
| --error-exitcode=0 \ | ||
| -I include -I src/component_src \ | ||
| src/component_src/ |
There was a problem hiding this comment.
Incorrect YAML indentation breaks cppcheck run block
High Severity
The second cppcheck invocation is indented at 8 spaces, matching the run: key level, while the first cppcheck inside the run: | block is at 10 spaces. In a YAML literal block scalar, content indentation is set by the first line (10 spaces). Lines at 8 spaces fall outside the block, causing a YAML parse error that will break the entire lint job.
Made-with: Cursor
Made-with: Cursor
- CI: add DROP for all UDFs in plugin reinstall test - CI: allow build-component-9-6 to fail (dnf flaky) - Charset metadata for component UDFs (ann_set, display) - NULL check for myvector_construct first arg - Stub comments in mysql_version.h, my_config.h - Failure modes docs for binlog state - MYSQL handle leak fix on connect retry - request_shutdown before worker join on binlog error - escape_identifier buffer overflow fix - Release: use build-component-9.6-docker.sh for 9.6 - Update PR76_STATUS with completion log Made-with: Cursor
43a5e42 to
d2d5f33
Compare
| -DWITH_NDBCLUSTER=OFF \ | ||
| -DWITH_EXAMPLE_STORAGE_ENGINE=OFF \ | ||
| -DCMAKE_BUILD_TYPE=Release \ | ||
| >/dev/null 2>&1 |
There was a problem hiding this comment.
Missing -DWITH_BOOST causes MySQL 9.6 cmake to fail
High Severity
The cmake invocation inside the docker build script specifies -DDOWNLOAD_BOOST=1 but omits the required -DWITH_BOOST=<path> flag. MySQL's build system requires both flags together — without -DWITH_BOOST, cmake will fail with an error asking for the download destination. The error output is suppressed by >/dev/null 2>&1, making diagnosis difficult. Every other cmake invocation in the codebase (CI build-component job, sandbox-component-test.sh) correctly includes both flags. The release workflow also calls this script for 9.6.0 builds, meaning 9.6 component releases will fail.
Additional Locations (1)
| delete item; | ||
| } | ||
| } | ||
| }); |
There was a problem hiding this comment.
Worker threads busy-spin when binlog loop exits on error
Medium Severity
When the binlog fetch loop exits on a non-timeout error, gqueue_.request_shutdown() is called but shutdown_binlog_thread_ is never set to true. Worker threads loop on !shutdown_binlog_thread_.load(), and since the queue's shutting_down_ flag makes dequeue() return nullptr immediately without blocking, workers enter a tight CPU-burning busy-spin. The subsequent join() calls in binlog_loop_fn will block until stop_binlog_monitoring() is eventually called from component deinit, which finally sets shutdown_binlog_thread_.
Additional Locations (1)
- Add myvector_debug_construct_calls() UDF and counter in myvector_construct() to verify per-row invocation when nested in myvector_distance() - Add mysql-test udf_construct_per_row.test (requires plugin with debug UDF) - Add scripts/test-docker-image.sh: fast test against prebuilt ghcr.io/askdba/myvector:mysql8.4 - Add scripts/run-mtr-sandbox.sh: build MySQL+plugin in Docker and run mtr (for full verification) Root cause: MySQL evaluates nested UDF args per row via get_arguments(). Fix (not yet implemented): cache result in init when args->args[0] is constant. Made-with: Cursor
Made-with: Cursor
There was a problem hiding this comment.
Cursor Bugbot has reviewed your changes and found 1 potential issue.
There are 5 total unresolved issues (including 4 from previous reviews).
Bugbot Autofix is OFF. To automatically fix reported issues with cloud agents, enable autofix in the Cursor dashboard.
| } | ||
| secure_zero_string(conn_password); | ||
| secure_zero_string(g_conn_config.password); | ||
| break; /// connected |
There was a problem hiding this comment.
Double mysql_close in binlog connect retry loop
High Severity
In binlog_loop_fn, when mysql_real_connect fails, mysql_close(&mysql) is called at line 1383. When the loop continues back to the top, since connect_attempts > 0, mysql_close(&mysql) is called again at line 1357 before mysql_init re-initializes the handle. This double-close of the MYSQL handle is undefined behavior and can cause crashes or memory corruption. The close at line 1357 should be removed or the close at line 1383 should be skipped so only one close happens per failed attempt.


Summary
softprops/action-gh-releasefrom v1 to v2 in.github/workflows/release.yml(no input/env changes required).Release action change
softprops/action-gh-release@v2.files,generate_release_notes; v2 is backward compatible for this usage.Made with Cursor
Note
High Risk
High risk due to large, cross-cutting changes spanning CMake/build mode logic, MySQL component runtime code (binlog/UDF lifecycle), and CI/release/Docker publishing workflows where failures can break builds or ship incorrect artifacts.
Overview
Introduces a standalone MySQL Component build mode alongside the existing in-tree plugin build:
CMakeLists.txtnow conditionally buildsmyvectorviaMYSQL_ADD_PLUGINor builds/installsmyvector_component(plusmyvector.json) usingMYSQL_SOURCE_DIR/MYSQL_BUILD_DIRandlibmysqlclientdiscovery.Expands CI and release automation to build/test component artifacts across MySQL 8.0/8.4/9.0 and adds special handling for MySQL 9.6 via a Docker-based build/test path; CI now also exercises uninstall/reinstall flows for both component and plugin. Docker publishing is reworked to trigger on successful
workflow_run, cache Boost, and only push images when invoked from the release workflow.Adds/updates build scripts, ignores, and docs to match the new workflow (CMake-only build, component build instructions, lint rule overrides), and includes component implementation sources/manifests plus reliability tweaks such as atomic file persistence improvements and additional safety checks in persistence/parsing paths.
Written by Cursor Bugbot for commit fa3553c. This will update automatically on new commits. Configure here.