Skip to content

feat(migration): add config, state models, and lifecycle infrastructure#1823

Open
A0nameless0man wants to merge 3 commits intovolcengine:mainfrom
A0nameless0man:pr/migration-config-state
Open

feat(migration): add config, state models, and lifecycle infrastructure#1823
A0nameless0man wants to merge 3 commits intovolcengine:mainfrom
A0nameless0man:pr/migration-config-state

Conversation

@A0nameless0man
Copy link
Copy Markdown
Contributor

Description

Adds the foundation infrastructure for the blue-green embedding model migration framework: state models, persistence layer, and multi-embedder configuration support. Phase 0 of the migration design.

Related Issue

Type of Change

  • New feature (non-breaking change that adds functionality)

Changes Made

  • openviking/storage/migration/__init__.py — Module public API exports
  • openviking/storage/migration/state.py — MigrationPhase enum, ActiveSide enum, MigrationState dataclass, MigrationStateManager (atomic FileLock persistence), MigrationStateFile (permanent migration history)
  • openviking_cli/utils/config/open_viking_config.pyembeddings: Dict[str, EmbeddingConfig] field with @model_validator for active config resolution, get_target_embedder(), and dimension mismatch hard-block (C-9)
  • openviking_cli/utils/config/embedding_config.py — Removed stale get_target_embedder()
  • tests/migration/conftest.py — Test isolation fixtures
  • tests/migration/test_state.py — Serialization, persistence, concurrency, and state file tests
  • tests/migration/test_embedding_config_migration.py — Multi-embedder config parsing, active config resolution, backward compatibility tests
  • tests/migration/test_init.py — Module import tests

Testing

  • I have added tests that prove my fix is effective or that my feature works
  • New and existing unit tests pass locally with my changes
  • I have tested this on the following platforms:
    • Linux
    • macOS
    • Windows

Checklist

  • My code follows the project's coding style
  • I have performed a self-review of my code
  • I have commented my code, particularly in hard-to-understand areas
  • I have made corresponding changes to the documentation
  • My changes generate no new warnings
  • Any dependent changes have been merged and published

Additional Notes

  • When embeddings is empty, behavior is fully backward compatible
  • Migration module is not wired into server yet — no online impact
  • Dimension mismatch (C-9) hard-blocks server startup to prevent silent query failures

- Add MigrationPhase enum, MigrationState dataclass with serialization
- MigrationStateManager (atomic JSON + FileLock persistence)
- MigrationStateFile for permanent migration state tracking
- EmbeddingConfig.get_target_embedder() for migration target resolution
- OpenVikingConfig.embeddings field with @model_validator
- Auto-create state file with embeddings["default"] when missing
- Dimension mismatch hard-blocks startup (C-9)
- 23 tests: state serialization, config migration, backward compat
@github-actions
Copy link
Copy Markdown

PR Reviewer Guide 🔍

Here are some key observations to aid the review process:

⏱️ Estimated effort to review: 3 🔵🔵🔵⚪⚪
🏅 Score: 85
🧪 PR contains tests
🔒 No security concerns identified
✅ No TODO sections
🔀 No multiple PR themes
⚡ Recommended focus areas for review

Type safety: active_side should use ActiveSide enum

MigrationState.active_side is declared as str but the ActiveSide enum exists. Using the enum prevents invalid values and improves type checking.

active_side: str  # "source" or "target"
Race condition in MigrationStateFile read-modify-write operations

update_current_active and append_history read the file without holding the lock, then write with the lock. This allows lost updates if two processes modify the file concurrently.

def update_current_active(self, name: str) -> None:
    """Atomically update current_active, preserving other fields."""
    data = self._read_raw()
    data["current_active"] = name
    self._write_atomic(data)

def append_history(self, entry: Dict[str, Any]) -> None:
    """Atomically append a migration history record."""
    data = self._read_raw()
    data["history"].append(entry)
    self._write_atomic(data)
Non-atomic initial state file write

The model_validator writes the initial state file using write_text, which is not atomic. Use MigrationStateFile.create_initial for atomic writes to avoid corruption.

config_dir.mkdir(parents=True, exist_ok=True)
initial_state = {
    "version": 1,
    "current_active": current_active,
    "history": [],
}
state_file_path.write_text(
    json.dumps(initial_state, indent=2, ensure_ascii=False),
    encoding="utf-8",
)
Missing return type annotation for get_target_embedder

The method get_target_embedder lacks a return type annotation, reducing type safety and clarity.

def get_target_embedder(self, target_name: str):
    """Create a target embedder from a named embedding configuration.

    Looks up *target_name* in ``self.embeddings`` and returns a fully
    constructed embedder instance for that named config.  Used by the
    migration controller and crash-recovery path to create the embedder
    for the *target* side of a migration.

    Args:
        target_name: Key into ``self.embeddings`` (e.g. ``"v2"``).

    Returns:
        Embedder instance (Dense, Sparse, Hybrid, or Composite).

    Raises:
        KeyError: If *target_name* is not found in ``self.embeddings``.
    """
    target_cfg = self.embeddings[target_name]
    return target_cfg.get_embedder()

@github-actions
Copy link
Copy Markdown

PR Code Suggestions ✨

Explore these optional code suggestions:

CategorySuggestion                                                                                                                                    Impact
General
Use MigrationStateFile for state management

Use the existing MigrationStateFile class instead of direct file I/O for
reading/writing the migration state file. This ensures atomic writes and consistent
state management across the codebase.

openviking_cli/utils/config/open_viking_config.py [209-240]

-state_file_path = config_dir / "embedding_migration_state.json"
+from openviking.storage.migration.state import MigrationStateFile
 
-if state_file_path.exists():
-    try:
-        state = json.loads(state_file_path.read_text(encoding="utf-8"))
-        current_active = state["current_active"]
-    except (json.JSONDecodeError, KeyError) as e:
-        raise ValueError(
-            f"Embedding migration state file at {state_file_path} is corrupt: {e}"
-        ) from e
-else:
+state_file = MigrationStateFile(str(config_dir))
+try:
+    state = state_file.read()
+    current_active = state["current_active"]
+except FileNotFoundError:
     # No state file — must have "default" key in embeddings.
     # User's original embedding config becomes embeddings["default"].
     # If no "default" key, error out — the user must create it.
     if "default" not in self.embeddings:
         raise ValueError(
             "embeddings configured but no 'default' key found and no "
             "embedding_migration_state.json exists. "
             "Add 'default' to embeddings pointing to the current embedding config."
         )
     current_active = "default"
     # Auto-create state file
     config_dir.mkdir(parents=True, exist_ok=True)
-    initial_state = {
-        "version": 1,
-        "current_active": current_active,
-        "history": [],
-    }
-    state_file_path.write_text(
-        json.dumps(initial_state, indent=2, ensure_ascii=False),
-        encoding="utf-8",
-    )
+    state_file.create_initial(current_active)
+except (json.JSONDecodeError, KeyError) as e:
+    raise ValueError(
+        f"Embedding migration state file at {state_file.file_path} is corrupt: {e}"
+    ) from e
Suggestion importance[1-10]: 7

__

Why: Replacing direct file I/O with MigrationStateFile ensures consistent state management and atomic writes, leveraging the existing dedicated class for this purpose.

Medium
Use ActiveSide enum for type safety

Use the ActiveSide enum instead of a raw str for MigrationState.active_side to
improve type safety and enforce valid values. Update serialization/deserialization
methods to handle the enum correctly.

openviking/storage/migration/state.py [80-118]

 @dataclass
 class MigrationState:
     """Runtime migration state — persisted to migration_runtime_state.json.
 
     This is a transient file that is cleaned up once migration completes.
     For permanent migration history, see MigrationStateFile.
     """
 
     migration_id: str
     phase: MigrationPhase
     source_collection: str
     target_collection: str
-    active_side: str  # "source" or "target"
+    active_side: ActiveSide
     dual_write_enabled: bool
     source_embedder_name: str
     target_embedder_name: str
     degraded_write_failures: int = 0
     reindex_progress: Optional[ReindexProgress] = None
     started_at: str = ""
     updated_at: str = ""
 
     def to_dict(self) -> Dict[str, Any]:
         """Serialize to a JSON-compatible dictionary."""
         d = asdict(self)
         d["phase"] = self.phase.value
+        d["active_side"] = self.active_side.value
         if self.reindex_progress is not None:
             d["reindex_progress"] = self.reindex_progress.to_dict()
         return d
 
     @classmethod
     def from_dict(cls, data: Dict[str, Any]) -> "MigrationState":
         """Deserialize from a JSON-compatible dictionary."""
         data = dict(data)
         data["phase"] = MigrationPhase(data["phase"])
+        data["active_side"] = ActiveSide(data["active_side"])
         if data.get("reindex_progress") is not None:
             data["reindex_progress"] = ReindexProgress.from_dict(
                 data["reindex_progress"]
             )
         return cls(**data)
Suggestion importance[1-10]: 6

__

Why: Using the ActiveSide enum instead of a raw string for MigrationState.active_side improves type safety and enforces valid values, while updating the serialization/deserialization methods maintains JSON compatibility.

Low

- Use ActiveSide enum for MigrationState.active_side type safety
- Fix race condition: hold FileLock for entire read-modify-write
  cycle in update_current_active and append_history
- Use atomic MigrationStateFile.create_initial() instead of
  non-atomic write_text for initial state file creation
- Add return type annotation to get_target_embedder
@A0nameless0man A0nameless0man force-pushed the pr/migration-config-state branch from c4faba9 to be14b27 Compare April 30, 2026 07:17
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Projects

Status: Backlog

Development

Successfully merging this pull request may close these issues.

1 participant