Feature: Daemon Mode for Incremental File Watching and Automatic Reindexing #22

@marco0560

Status

Summary

Introduce an optional daemon mode for Codira that:

  • monitors repository files for changes
  • performs incremental reindexing automatically
  • optionally updates embeddings (if vector backend is enabled)

This mode is intended as a developer productivity feature, not part of the deterministic core.

Motivation

Current workflow:

edit files → manually run `codira index`

This introduces friction:

  • stale indexes between edits
  • repeated manual commands
  • suboptimal feedback loop

Daemon mode enables:

edit files → automatic reindex → immediate query freshness

Key Principle

Daemon mode MUST NOT compromise Codira’s deterministic guarantees.

Scope

Included

  • filesystem monitoring (watch mode)
  • incremental indexing on change
  • batching / debouncing
  • optional embedding updates
  • status inspection

Excluded

  • mandatory background operation
  • replacing explicit indexing
  • introducing heuristic behavior

Architectural Position

Mode A — CLI (authoritative)
    codira index
    codira ctx

Mode B — Daemon (optional)
    codira daemon

Hard Constraints

C1 — Optionality

Codira MUST function fully without daemon mode.

C2 — Deterministic isolation

  • CLI commands MUST NOT depend on daemon state
  • daemon updates MUST be observable and reproducible

C3 — Explicit state visibility

Users MUST be able to inspect:

  • last indexed state
  • pending changes
  • daemon activity

C4 — No hidden mutation

All daemon actions MUST be:

  • logged
  • traceable
  • reconstructable

High-Level Design

File watching pipeline

file change
   ↓
event queue
   ↓
debounce / batch
   ↓
incremental index
   ↓
(update deterministic backend)
   ↓
(optional) update embeddings (#21)
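The debounce / batch stage of this pipeline could be sketched as follows. This is only an illustration under stated assumptions: the class name `DebouncedBatcher`, the `quiet_seconds` parameter, and its default are hypothetical and not part of Codira. The clock is passed in explicitly so the behavior can be replayed in tests.

```python
from __future__ import annotations

from dataclasses import dataclass, field


@dataclass
class DebouncedBatcher:
    """Collects changed paths and releases them after a quiet period (hypothetical helper)."""

    quiet_seconds: float = 0.5  # assumed default, not taken from the proposal
    _pending: set[str] = field(default_factory=set)
    _last_event: float | None = None

    def record(self, path: str, now: float) -> None:
        """Register a change event observed at time `now`."""
        self._pending.add(path)
        self._last_event = now

    def drain(self, now: float) -> list[str]:
        """Return a sorted batch once the quiet period has elapsed, else an empty list."""
        if self._last_event is None or now - self._last_event < self.quiet_seconds:
            return []
        batch = sorted(self._pending)  # sorted so the batch order is reproducible
        self._pending.clear()
        self._last_event = None
        return batch
```

Sorting the batch keeps the order of the incremental index step independent of event arrival order, which is in the spirit of the "Key Principle" above.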

Backend interaction

Relationship with Vector Backend (#21)

Daemon mode is a natural companion to #21:

  • embeddings are expensive to compute
  • background updates improve usability
  • async pipeline becomes meaningful

However:

daemon mode MUST NOT require a vector backend

Relationship with Docker

  • daemon alone → low Docker value
  • daemon + vector backend → high Docker value (multi-service setup)

Open Design Questions

1. Watch scope

  • entire repository?
  • only indexed paths?
  • configurable include/exclude?
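If the configurable include/exclude option is chosen, the filter could look roughly like this. The function name `is_watched` and the pattern syntax (shell-style globs via the standard `fnmatch` module) are assumptions for illustration, not a decided design.

```python
from fnmatch import fnmatchcase


def is_watched(path: str, include: list[str], exclude: list[str]) -> bool:
    """Return True if `path` matches an include pattern and no exclude pattern.

    Hypothetical helper. `fnmatchcase` is used instead of `fnmatch` so that
    matching is case-sensitive on every platform. Note that `*` also matches
    path separators in this module.
    """
    if any(fnmatchcase(path, pattern) for pattern in exclude):
        return False  # excludes take precedence over includes
    return any(fnmatchcase(path, pattern) for pattern in include)
```

Example: `is_watched(".git/config", ["*"], [".git/*"])` is `False`, while `is_watched("src/app/main.py", ["src/*.py"], [".git/*"])` is `True`.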

2. Event model

  • file-level vs directory-level events?
  • how to handle:
    • renames
    • deletes
    • mass changes (git checkout)?
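One possible answer for renames and deletes is to reduce raw events to a final per-path action before indexing. The sketch below assumes a hypothetical event shape (`("moved", src, dest)`, `("deleted", path)`, `("created", path)`, `("modified", path)`) and treats a rename as remove plus index. None of these names come from Codira.

```python
from __future__ import annotations


def coalesce(events: list[tuple[str, ...]]) -> dict[str, str]:
    """Reduce raw events to one action per path: 'index' or 'remove'.

    The last event seen for a path wins. Event kinds are assumptions made
    for this sketch, not an existing Codira event model.
    """
    actions: dict[str, str] = {}
    for kind, *paths in events:
        if kind == "moved":
            src, dest = paths
            actions[src] = "remove"   # old path leaves the index
            actions[dest] = "index"   # new path is indexed as a fresh file
        elif kind == "deleted":
            actions[paths[0]] = "remove"
        elif kind in ("created", "modified"):
            actions[paths[0]] = "index"
        else:
            raise ValueError(f"unknown event kind: {kind!r}")
    return dict(sorted(actions.items()))  # stable, path-sorted order
```

A mass change such as a git checkout would simply produce a large dictionary here; whether that should instead trigger a full reindex remains an open question.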

3. Debounce strategy

  • fixed delay?
  • adaptive batching?
  • max batch size?

4. Consistency model

  • what happens if files change during indexing?
  • snapshot-at-start vs rolling update?

5. Identity and correctness

  • how to ensure index consistency with:
    • file hashes
    • analyzer versions
  • how to detect drift?
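A minimal sketch of how file hashes and analyzer versions could be combined into one identity, and how drift could be detected by comparing two such mappings. The helper names `fingerprint` and `find_drift` and the choice of SHA-256 are assumptions for illustration.

```python
from __future__ import annotations

import hashlib


def fingerprint(content: bytes, analyzer_version: str) -> str:
    """Hash file content together with the analyzer version (hypothetical scheme)."""
    digest = hashlib.sha256()
    digest.update(analyzer_version.encode("utf-8"))
    digest.update(b"\0")  # separator so version and content cannot run together
    digest.update(content)
    return digest.hexdigest()


def find_drift(indexed: dict[str, str], current: dict[str, str]) -> list[str]:
    """Return sorted paths whose fingerprint differs or that exist on one side only."""
    all_paths = indexed.keys() | current.keys()
    return sorted(p for p in all_paths if indexed.get(p) != current.get(p))
```

Because the analyzer version is part of the fingerprint, upgrading an analyzer marks every affected file as drifted without any time-based logic.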

6. Embedding pipeline (#21)

  • synchronous vs asynchronous?
  • queue-based?
  • failure handling?
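For the asynchronous, queue-based variant, a sketch using the standard `asyncio.Queue` might look like this. The `embed` callable stands in for whatever #21 ends up providing; `embedding_worker`, `run_demo`, and `fake_embed` are hypothetical names. Failures are recorded rather than raised, which is only one of the possible policies.

```python
from __future__ import annotations

import asyncio


async def embedding_worker(queue: asyncio.Queue, embed, failures: list) -> None:
    """Consume paths from the queue forever; record failures instead of stopping."""
    while True:
        path = await queue.get()
        try:
            await embed(path)
        except Exception as exc:  # broad on purpose: one bad file must not stop the worker
            failures.append((path, repr(exc)))
        finally:
            queue.task_done()


async def run_demo(paths: list[str]) -> tuple[list[str], list]:
    """Drive the worker with a stand-in embed function (demo only)."""
    queue: asyncio.Queue = asyncio.Queue()
    done: list[str] = []
    failures: list = []

    async def fake_embed(path: str) -> None:
        if path.endswith(".bad"):
            raise ValueError("cannot embed")
        done.append(path)

    worker = asyncio.create_task(embedding_worker(queue, fake_embed, failures))
    for path in paths:
        queue.put_nowait(path)
    await queue.join()   # wait until every queued path has been processed
    worker.cancel()      # the worker loops forever, so stop it explicitly
    await asyncio.gather(worker, return_exceptions=True)
    return done, failures
```

Usage: `asyncio.run(run_demo(["a.py", "x.bad", "b.py"]))` processes all three paths and reports the one failure separately.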

7. Status and observability

Define:

codira status

Must include:

  • last indexed revision / hash
  • pending changes
  • daemon running state
  • embedding sync status (if applicable)
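The four required fields could be carried in a small immutable record and rendered as JSON. The class name `IndexStatus`, the field names, and the JSON output format are assumptions for this sketch, not the actual output of `codira status`.

```python
from __future__ import annotations

import json
from dataclasses import asdict, dataclass


@dataclass(frozen=True)
class IndexStatus:
    """Hypothetical status record mirroring the fields listed above."""

    last_indexed_hash: str | None       # None if nothing has been indexed yet
    pending_changes: tuple[str, ...]    # paths seen by the watcher but not yet indexed
    daemon_running: bool
    embedding_sync: str | None = None   # None when no vector backend is enabled


def render_status(status: IndexStatus) -> str:
    """Serialize with sorted keys so identical states print identically."""
    return json.dumps(asdict(status), sort_keys=True, indent=2)
```

Keeping `embedding_sync` optional reflects the constraint that daemon mode works without a vector backend.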

8. Failure handling

  • partial indexing failures
  • backend errors
  • watcher failures

Policy:

  • fail-fast vs retry?
  • degradation behavior?
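If a retry policy is preferred over fail-fast, a bounded retry wrapper is one simple shape for it. The function name `run_with_retries` and its parameters are hypothetical, and delays or backoff between attempts are deliberately left out of this sketch.

```python
def run_with_retries(action, max_attempts=3, on_error=None):
    """Call `action` up to `max_attempts` times and re-raise the last failure.

    `on_error(attempt, exc)` is invoked for every failed attempt so that each
    failure can be logged, in line with the "no hidden mutation" constraint.
    """
    if max_attempts < 1:
        raise ValueError("max_attempts must be at least 1")
    for attempt in range(1, max_attempts + 1):
        try:
            return action()
        except Exception as exc:
            if on_error is not None:
                on_error(attempt, exc)
            if attempt == max_attempts:
                raise  # give up: surface the original error to the caller
```

Setting `max_attempts=1` turns the same wrapper into fail-fast behavior.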

9. CLI surface

Proposed commands:

codira daemon start
codira daemon stop
codira daemon status

Open questions:

  • foreground vs background mode?
  • PID management?
  • integration with system services?
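The proposed command surface could be wired up with `argparse` roughly as below. This is not Codira's actual parser: `build_parser` is a hypothetical name, and the `--foreground` flag is an invented placeholder for the still-open foreground vs background question.

```python
import argparse


def build_parser() -> argparse.ArgumentParser:
    """Build a parser for `codira daemon {start,stop,status}` (illustrative only)."""
    parser = argparse.ArgumentParser(prog="codira")
    commands = parser.add_subparsers(dest="command", required=True)

    daemon = commands.add_parser("daemon", help="optional file-watching daemon")
    actions = daemon.add_subparsers(dest="action", required=True)

    start = actions.add_parser("start", help="start watching and reindexing")
    # Hypothetical flag: one possible answer to "foreground vs background mode?"
    start.add_argument("--foreground", action="store_true")

    actions.add_parser("stop", help="stop the running daemon")
    actions.add_parser("status", help="show daemon state")
    return parser
```

Example: `build_parser().parse_args(["daemon", "start", "--foreground"])` yields `command="daemon"`, `action="start"`, `foreground=True`.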

10. Logging

  • structured logs required
  • reproducible trace of actions
  • integration with --explain?
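A structured, replayable log could be as simple as one JSON object per action with a monotonically increasing sequence number. The class `ActionLog` and the field names `seq` and `action` are assumptions for illustration. Timestamps are omitted here so that two identical runs produce identical log lines; a real log would probably carry them as an extra field.

```python
import json


class ActionLog:
    """In-memory JSON-lines log of daemon actions (hypothetical sketch)."""

    def __init__(self) -> None:
        self._lines = []

    def record(self, action: str, **details) -> None:
        """Append one entry; `seq` orders entries without relying on wall-clock time."""
        entry = {"seq": len(self._lines) + 1, "action": action, **details}
        self._lines.append(json.dumps(entry, sort_keys=True))

    def lines(self) -> list:
        """Return a copy of all log lines in the order they were recorded."""
        return list(self._lines)
```

Replaying the `lines()` output in `seq` order is what would make daemon actions reconstructable in the sense of C4.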

11. Cross-platform behavior

  • Linux (inotify)
  • macOS (FSEvents)
  • Windows (ReadDirectoryChangesW)

Questions:

  • abstraction layer?
  • fallback polling?
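Fallback polling can be built from standard-library calls only: take a snapshot of modification time and size per file, then diff two snapshots. The function names `snapshot` and `diff_snapshots` are hypothetical, and a real watcher abstraction would expose the same interface for inotify, FSEvents, and ReadDirectoryChangesW backends.

```python
from __future__ import annotations

import os


def snapshot(root: str) -> dict[str, tuple[int, int]]:
    """Map each file under `root` (relative path) to (mtime in ns, size in bytes)."""
    state: dict[str, tuple[int, int]] = {}
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            full = os.path.join(dirpath, name)
            try:
                info = os.stat(full)
            except OSError:
                continue  # file vanished between listing and stat
            state[os.path.relpath(full, root)] = (info.st_mtime_ns, info.st_size)
    return state


def diff_snapshots(old: dict, new: dict) -> tuple[list[str], list[str]]:
    """Return (changed_or_added, removed) paths, each sorted."""
    changed = sorted(p for p in new if old.get(p) != new[p])
    removed = sorted(p for p in old if p not in new)
    return changed, removed
```

Polling cannot see renames directly; a rename shows up as one removed path and one added path.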

12. Interaction with manual indexing

  • what happens when:
    • user runs codira index while daemon is active?
  • locking strategy?
  • override semantics?
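One candidate locking strategy is an exclusive lock file that both the daemon and a manual indexing run must acquire before writing. The sketch below uses `os.open` with `O_CREAT | O_EXCL`, which fails if the file already exists. The names `index_lock` and `IndexLocked` and the lock path are assumptions, and stale locks left behind by a crashed process are not handled here.

```python
import os
from contextlib import contextmanager


class IndexLocked(RuntimeError):
    """Raised when another process already holds the index lock (hypothetical)."""


@contextmanager
def index_lock(lock_path: str):
    """Hold an exclusive lock file for the duration of the `with` block."""
    try:
        fd = os.open(lock_path, os.O_CREAT | os.O_EXCL | os.O_WRONLY)
    except FileExistsError:
        raise IndexLocked(f"index is locked by another process: {lock_path}") from None
    try:
        os.write(fd, str(os.getpid()).encode("ascii"))  # record the owner for diagnostics
        yield
    finally:
        os.close(fd)
        os.unlink(lock_path)  # release the lock
```

With this shape, a manual index run started while the daemon is mid-batch would fail with a clear error instead of racing it; whether it should wait or override instead is still open.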

13. Performance limits

  • large repos
  • rapid change bursts
  • editor save patterns

Risks

R1 — Hidden state

Users may not know what is indexed.

R2 — Non-reproducibility

Time-based updates can make the indexed state depend on when the daemon ran, which breaks deterministic workflows.

R3 — Race conditions

Concurrent changes during indexing.

R4 — Debug complexity

Harder to reproduce bugs compared to explicit CLI runs.

Mitigations

  • explicit status command
  • strict logging
  • deterministic identity model
  • optional disablement
  • manual override (codira index --force)

Suggested Implementation Plan

Phase 1 — Design

  • define watcher abstraction
  • define consistency model
  • define CLI surface

Phase 2 — Minimal implementation

  • basic file watching
  • debounced incremental indexing
  • no embedding integration

Phase 3 — Observability

  • status command
  • logging
  • diagnostics

Phase 4 — Integration with #21

  • async embedding updates
  • queue-based pipeline

Phase 5 — Hardening

  • cross-platform validation
  • stress testing
  • failure scenarios

Acceptance Criteria

  • daemon mode is fully optional
  • CLI mode remains deterministic and unchanged
  • file changes trigger correct incremental indexing
  • system remains consistent under rapid changes
  • status and logs provide full visibility
  • daemon can be stopped/restarted without corruption
  • works without vector backend
  • integrates cleanly with vector backend (#21)

Key Principle

Daemon mode is a usability layer, not a correctness mechanism.
