feat(semantic): OpenAI-compatible remote embedding adapter #324

@esengine

Description

Driven by discussion #310 — local Ollama on a CPU-only VM is too slow for first-time indexing, and pointing OLLAMA_URL at a remote box still requires running Ollama somewhere. A real third-party endpoint is the next step.

Why

Current state (src/index/semantic/embedding.ts):

  • Hardcoded to Ollama's /api/embeddings shape.
  • OLLAMA_URL env var supports a remote Ollama daemon but not non-Ollama providers.
  • probeOllama is similarly Ollama-specific.

Gemini, OpenAI, vLLM, llama.cpp's server, LM Studio, and most third-party providers all expose POST /v1/embeddings with the same request/response shape. One adapter covers them all.
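For reference, a sketch of that shared wire shape in TypeScript. This is not reasonix code; the field names follow the OpenAI embeddings API that the providers above mirror, and the helper names are hypothetical:

```typescript
// Request/response shape for POST {base}/embeddings (base already ends in /v1).
interface EmbeddingsRequest {
  model: string;
  input: string[];
}
interface EmbeddingsResponse {
  data: { index: number; embedding: number[] }[];
}

// Serialize the request body; same JSON works for OpenAI, Gemini's
// OpenAI-compat layer, vLLM, llama.cpp's server, and LM Studio.
function buildEmbeddingsBody(model: string, texts: string[]): string {
  const req: EmbeddingsRequest = { model, input: texts };
  return JSON.stringify(req);
}

// Recover vectors in input order; providers may return `data` unsorted,
// so sort by the spec's `index` field rather than trusting array order.
function extractVectors(res: EmbeddingsResponse): number[][] {
  return [...res.data].sort((a, b) => a.index - b.index).map((d) => d.embedding);
}
```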

Scope

One adapter, OpenAI-compatible. No per-vendor code paths.

  • New env vars (or config keys):
    • REASONIX_EMBED_PROVIDER — ollama (default) | openai-compat
    • REASONIX_EMBED_BASE_URL — full base URL when provider is openai-compat (e.g. https://generativelanguage.googleapis.com/v1beta/openai, https://api.openai.com/v1, http://localhost:8000/v1)
    • REASONIX_EMBED_API_KEY — bearer token, sent as Authorization: Bearer <key>
    • REASONIX_EMBED_MODEL already exists — reused as the model name passed to the API
  • Ollama remains the default. Zero behavior change for existing users.
  • Adapter dispatch hidden behind the existing embed() / embedAll() / probe*() surface; callers see one API.
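A minimal sketch of how that dispatch could resolve, assuming hypothetical helper names (the real surface stays the existing embed()/embedAll()/probe*() functions):

```typescript
type Provider = "ollama" | "openai-compat";

// Resolve the provider from env, defaulting to ollama so existing
// setups see zero behavior change. Validation is eager: a bad value
// or a missing base URL fails before any network call is made.
function resolveProvider(env: Record<string, string | undefined>): Provider {
  const p = env.REASONIX_EMBED_PROVIDER ?? "ollama";
  if (p !== "ollama" && p !== "openai-compat") {
    throw new Error(`unknown REASONIX_EMBED_PROVIDER "${p}" (expected ollama | openai-compat)`);
  }
  if (p === "openai-compat" && !env.REASONIX_EMBED_BASE_URL) {
    throw new Error("REASONIX_EMBED_BASE_URL is required when REASONIX_EMBED_PROVIDER=openai-compat");
  }
  return p;
}
```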

Constraints

  • Vector dimensions per store. Different models output different dims (Ollama nomic-embed-text = 768, OpenAI text-embedding-3-small = 1536, Gemini text-embedding-004 = 768). The vector store must refuse mixing embeddings from different model identities — index header records { provider, model, dim } and rejects appends that don't match. Caller can rebuild the index when switching providers.
  • No "free tier" framing in docs. Document that an OpenAI-compatible endpoint can be Gemini / OpenAI / etc., without promising free. Free tiers move on someone else's schedule.
  • Errors stay actionable. ECONNREFUSED keeps Ollama install hint when provider is Ollama; OpenAI-compat path needs its own "check your API key / base URL" hint on 401 / 404.
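The identity check from the first constraint could look roughly like this; the header shape matches the { provider, model, dim } triple above, the function name is hypothetical:

```typescript
// Identity recorded in the index header when the index is first built.
interface IndexHeader {
  provider: string;
  model: string;
  dim: number;
}

// Refuse appends whose embedding identity or dimension doesn't match
// the header; the error tells the user to rebuild rather than silently
// mixing vector spaces.
function assertCompatible(header: IndexHeader, provider: string, model: string, vector: number[]): void {
  if (header.provider !== provider || header.model !== model) {
    throw new Error(
      `index was built with ${header.provider}/${header.model}; ` +
        `refusing to append ${provider}/${model}. Rebuild the index after switching providers.`
    );
  }
  if (vector.length !== header.dim) {
    throw new Error(`expected ${header.dim}-dim vector, got ${vector.length}`);
  }
}
```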

Non-goals

  • Per-vendor adapters (Gemini-specific, OpenAI-specific). One OpenAI-compat shape, period.
  • Auto-migrating an existing Ollama-built index to a new provider. User rebuilds.
  • Embedding-cache schema changes. Cache key already includes the model name; extending to include provider is straightforward but separable.

Acceptance

  • REASONIX_EMBED_PROVIDER=openai-compat + REASONIX_EMBED_BASE_URL + REASONIX_EMBED_API_KEY produces a working embedder that can build an index via reasonix semantic build.
  • Default behavior (no env vars set) is byte-identical to today.
  • Vector store rejects appending vectors from a different { provider, model } than the index was built with.
  • Test fixture: a fake OpenAI-compat endpoint (vitest vi.fn-driven fetch) drives the adapter through one happy-path embed and one 401 error.
  • README / docs include one short paragraph on the env vars + a single example pointing at api.openai.com/v1 (no "free" framing).
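The fixture in the fourth bullet could behave roughly like this dependency-free sketch; a real test would wrap it in vitest's vi.fn() and hand it to the adapter as its fetch, returning Promises and Response objects rather than plain values. All names here are hypothetical:

```typescript
type FakeResponse = { status: number; body: unknown };

// Stand-in for an OpenAI-compatible endpoint: 401 on a bad bearer token
// (driving the "check your API key / base URL" hint), otherwise one
// fixed-dim vector per input, echoing the spec's per-item `index`.
function fakeOpenAICompatEndpoint(validKey: string) {
  return (init: { headers: Record<string, string>; body: string }): FakeResponse => {
    if (init.headers["Authorization"] !== `Bearer ${validKey}`) {
      return { status: 401, body: { error: { message: "Incorrect API key provided" } } };
    }
    const req = JSON.parse(init.body) as { model: string; input: string[] };
    return {
      status: 200,
      body: { data: req.input.map((_, i) => ({ index: i, embedding: [0.1, 0.2, 0.3] })) },
    };
  };
}
```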
