Add embeddings calibration script (hardware-aware auto-tuning)
Status
Summary
Introduce a calibration script that runs on a target machine (CPU/GPU) and determines optimal embeddings parameters, generating configuration values compatible with the configuration system introduced in #17.
Motivation
Embeddings performance depends heavily on:
- CPU core count
- memory bandwidth
- GPU availability and VRAM
- model characteristics
Static defaults cannot account for these differences, so they are suboptimal on many machines.
A calibration step allows:
- reproducible performance tuning
- optimal hardware utilization
- reduced manual configuration effort
Goal
Provide a deterministic, reproducible calibration tool that:
- benchmarks embedding execution on the current machine
- selects optimal parameters
- outputs configuration compatible with Codira config files
Scope
Calibration targets
- device selection (cpu / gpu / auto)
- thread count
- batch size
- GPU memory usage limits
Proposed Interface
CLI
codira calibrate embeddings
Output modes
codira calibrate embeddings --print
codira calibrate embeddings --write
codira calibrate embeddings --output <path>
--print → stdout (TOML snippet)
--write → writes to user config
--output → writes to specified file
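A minimal sketch of how the three output modes could be parsed, assuming a Python `argparse` front end. The `build_parser` name and the subcommand layout are hypothetical, and treating the modes as mutually exclusive is an assumption this proposal does not state.

```python
import argparse


def build_parser() -> argparse.ArgumentParser:
    """Build a hypothetical parser for `codira calibrate embeddings`."""
    parser = argparse.ArgumentParser(prog="codira")
    commands = parser.add_subparsers(dest="command", required=True)
    calibrate = commands.add_parser("calibrate")
    targets = calibrate.add_subparsers(dest="target", required=True)
    embeddings = targets.add_parser("embeddings")

    # Assumption: the three output modes are mutually exclusive.
    mode = embeddings.add_mutually_exclusive_group()
    mode.add_argument("--print", dest="print_", action="store_true",
                      help="emit a TOML snippet on stdout")
    mode.add_argument("--write", action="store_true",
                      help="write the result to the user config")
    mode.add_argument("--output", metavar="PATH",
                      help="write the result to the given file")
    return parser


args = build_parser().parse_args(
    ["calibrate", "embeddings", "--output", "out.toml"]
)
```

The destination is named `print_` only to avoid shadowing the built-in `print` when the namespace is unpacked.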
Output Format
Example:
[embeddings]
enabled = true
device = "gpu"
threads = 8
batch_size = 64
[embeddings.gpu]
device_id = 0
memory_limit_mb = 6144
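One way the snippet above could be produced is sketched below. The `render_embeddings_toml` helper is hypothetical, and it assumes `threads` has already been resolved to an integer (a literal `auto` would need quoting to be valid TOML).

```python
def render_embeddings_toml(device, threads, batch_size, gpu=None):
    """Render calibration results as a TOML snippet (hypothetical helper)."""
    lines = [
        "[embeddings]",
        "enabled = true",
        f'device = "{device}"',
        f"threads = {threads}",
        f"batch_size = {batch_size}",
    ]
    # The GPU table is emitted only when GPU settings were calibrated.
    if gpu is not None:
        lines += [
            "[embeddings.gpu]",
            f"device_id = {gpu['device_id']}",
            f"memory_limit_mb = {gpu['memory_limit_mb']}",
        ]
    return "\n".join(lines) + "\n"


snippet = render_embeddings_toml(
    "gpu", 8, 64, {"device_id": 0, "memory_limit_mb": 6144}
)
```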
Calibration Method
Deterministic benchmarking
- fixed input dataset (bundled or generated deterministically)
- fixed number of iterations
- warm-up phase
- measure:
  - throughput (texts/sec)
  - latency
  - memory usage
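The benchmarking steps above could look roughly like the following sketch. It assumes a seeded generator for the fixed dataset and a hypothetical `embed_batch` callable standing in for the real embedding pipeline. Memory measurement is left out because it depends on the backend.

```python
import random
import time


def make_dataset(n_texts=256, seed=0):
    """Generate the same pseudo-random texts on every run (fixed seed)."""
    rng = random.Random(seed)
    words = ["alpha", "beta", "gamma", "delta", "epsilon"]
    return [" ".join(rng.choice(words) for _ in range(16)) for _ in range(n_texts)]


def benchmark(embed_batch, texts, batch_size, warmup=2, iterations=5):
    """Return throughput (texts/sec) and mean latency per batch (seconds)."""
    batches = [texts[i:i + batch_size] for i in range(0, len(texts), batch_size)]
    for _ in range(warmup):  # warm-up passes are not timed
        for batch in batches:
            embed_batch(batch)
    start = time.perf_counter()
    for _ in range(iterations):  # fixed number of timed iterations
        for batch in batches:
            embed_batch(batch)
    elapsed = max(time.perf_counter() - start, 1e-9)  # guard against zero
    return {
        "throughput": iterations * len(texts) / elapsed,
        "latency": elapsed / (iterations * len(batches)),
    }


def fake_embed(batch):
    # Stand-in for the real embedding call; returns one vector per text.
    return [[float(len(text))] for text in batch]


result = benchmark(fake_embed, make_dataset(), batch_size=32)
```

Fixing the seed makes the input identical across runs. The timings themselves still vary with system load, so reproducibility in practice likely needs a tolerance or repeated measurements.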
Parameter search space
- threads: {1, 2, 4, 8, auto}
- batch_size: {8, 16, 32, 64, 128}
- device:
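As an illustration, the search space can be enumerated as a fixed-order grid, which keeps the visiting order identical between runs. The `candidate_configs` function is hypothetical, and the device values are passed in by the caller because they are not listed above.

```python
from itertools import product

THREADS = [1, 2, 4, 8, "auto"]
BATCH_SIZES = [8, 16, 32, 64, 128]


def candidate_configs(devices):
    """Enumerate candidates in a fixed order so every run visits them identically."""
    return [
        {"device": device, "threads": threads, "batch_size": batch_size}
        for device, threads, batch_size in product(devices, THREADS, BATCH_SIZES)
    ]


# "cpu" is only a placeholder value for this example.
candidates = candidate_configs(["cpu"])
```

Each device contributes 5 × 5 = 25 candidates, so a full grid stays small enough to run within a bounded time budget.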
Selection criteria
- maximize throughput
- respect memory limits
- avoid instability (OOM, timeouts)
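A sketch of how these criteria might combine into a single selection step. The result fields (`ok`, `peak_memory_mb`, `throughput`) and the tie-break rule are assumptions, and the numbers are made up for illustration rather than measured.

```python
def select_best(results, memory_limit_mb):
    """Pick the highest-throughput result that stayed stable and within memory."""
    viable = [
        r for r in results
        if r["ok"] and r["peak_memory_mb"] <= memory_limit_mb
    ]
    if not viable:
        return None  # caller falls back to safe defaults
    # Tie-break on the smaller batch size so the choice is deterministic.
    return max(viable, key=lambda r: (r["throughput"], -r["batch_size"]))


# Illustrative values only, not benchmark results.
results = [
    {"batch_size": 32, "throughput": 410.0, "peak_memory_mb": 2100, "ok": True},
    {"batch_size": 64, "throughput": 520.0, "peak_memory_mb": 3900, "ok": True},
    {"batch_size": 128, "throughput": 610.0, "peak_memory_mb": 7800, "ok": True},
    {"batch_size": 16, "throughput": 700.0, "peak_memory_mb": 1000, "ok": False},
]
best = select_best(results, memory_limit_mb=6144)
```

Here the batch size 128 run is rejected for exceeding the memory limit and the batch size 16 run for instability, leaving batch size 64 as the fastest viable option.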
Design Constraints
- deterministic results for identical hardware
- no network dependency
- no external services
- reproducible across runs
- bounded execution time
Hardware Detection
- CPU:
- GPU:
  - availability
  - device id
  - VRAM (if accessible)
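A possible detection sketch using only the standard library. Probing GPUs through the `nvidia-smi` command is an assumption (it covers NVIDIA devices only), since the proposal does not name a detection mechanism. The function names are hypothetical.

```python
import os
import shutil
import subprocess


def detect_cpu():
    """Report the logical CPU count visible to Python."""
    return {"logical_cores": os.cpu_count() or 1}


def parse_gpu_rows(csv_text):
    """Parse `index, memory.total` rows as printed by nvidia-smi in CSV mode."""
    gpus = []
    for line in csv_text.strip().splitlines():
        index, vram_mb = (field.strip() for field in line.split(","))
        gpus.append({"device_id": int(index), "vram_mb": int(vram_mb)})
    return gpus


def detect_gpus():
    """Return detected GPUs, or an empty list when none can be queried."""
    if shutil.which("nvidia-smi") is None:
        return []
    try:
        out = subprocess.run(
            ["nvidia-smi", "--query-gpu=index,memory.total",
             "--format=csv,noheader,nounits"],
            capture_output=True, text=True, check=True, timeout=10,
        ).stdout
        return parse_gpu_rows(out)
    except (subprocess.SubprocessError, OSError, ValueError):
        # Treat any probe failure (missing driver, odd output) as "no GPU".
        return []
```

Returning an empty list on any failure keeps CPU-only and constrained environments on the same code path instead of raising.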
Safety Mechanisms
- detect OOM and discard configuration
- fallback to safe defaults
- limit total calibration duration
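The three mechanisms above could be combined in one loop, sketched here under assumptions: `run_trial` is a hypothetical callable returning throughput, the `SAFE_DEFAULTS` values are placeholders, and OOM is modeled as Python's `MemoryError` (GPU frameworks usually raise their own exception types, which would need to be caught as well).

```python
import time

# Assumed placeholder values; the proposal does not define the safe defaults.
SAFE_DEFAULTS = {"device": "cpu", "threads": 1, "batch_size": 8}


def calibrate(candidates, run_trial, budget_seconds=60.0):
    """Try candidates until the time budget runs out; skip any that hit OOM."""
    deadline = time.monotonic() + budget_seconds
    best, best_throughput = None, 0.0
    for config in candidates:
        if time.monotonic() >= deadline:
            break  # bounded execution time
        try:
            throughput = run_trial(config)
        except MemoryError:
            continue  # discard configurations that run out of memory
        if throughput > best_throughput:
            best, best_throughput = config, throughput
    return best if best is not None else dict(SAFE_DEFAULTS)


def fake_trial(config):
    # Stand-in trial: pretend large batches exhaust memory.
    if config["batch_size"] > 64:
        raise MemoryError("simulated OOM")
    return float(config["batch_size"])


chosen = calibrate(
    [{"batch_size": 32}, {"batch_size": 64}, {"batch_size": 128}], fake_trial
)
fallback = calibrate([{"batch_size": 128}], fake_trial)
```

The deadline is checked between trials only, so a single slow trial can still overrun the budget. A per-trial timeout would be needed for a hard bound.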
Integration with #17
- output must match config schema
- compatible with:
- no direct mutation unless --write is used
Non-goals
- dynamic runtime adaptation
- continuous auto-tuning
- model selection optimization
- distributed benchmarking
Acceptance Criteria
- calibration command runs successfully on CPU-only systems
- calibration command runs on GPU-enabled systems
- outputs valid TOML config
- calibrated settings improve measured throughput over the static defaults
- results are reproducible on same hardware
- no crashes under constrained environments
Implementation Notes
- isolate calibration logic in dedicated module
- reuse embedding pipeline
- avoid impacting normal runtime paths
- ensure compatibility with future embedding providers
Dependencies
Notes
This feature enables:
- portable performance tuning
- simplified onboarding on new machines
- better utilization of heterogeneous environments