Archive System: External Data Source Indexing #3047
Draft
…ditions
Reverts the query/response approach from #3037 and fixes the actual bugs that caused empty ephemeral directories:
- directory_listing.rs: Restore async indexer dispatch (return empty, populate via events). Subdirectories from a parent's shallow index now correctly fall through to trigger their own indexer job.
- subscriptionManager.ts: Pre-register the initial listener before calling transport.subscribe() so buffer replay events aren't broadcast to an empty listener Set.
- useNormalizedQuery.ts: Seed the TanStack Query cache when oldData is undefined, so events arriving before the query response aren't silently dropped by the setQueryData updater.
Adds a bridge test (Rust harness + TS integration) that reproduces the ephemeral event streaming flow end-to-end.
Updated project description in README.md.
- Create core/src/data/ module with SourceManager wrapping sd-archive Engine
- Add Sources to GroupType and Source to ItemType enums
- Add default Sources group to new library creation
- Register source operations: create, list, get, delete, sync, list_items
- Register adapter operations: list, config, update
- Add bundled adapter sync from workspace adapters/ directory
- Add adapter update system with BLAKE3 change detection and backup/rollback
- Frontend: Sources home, source detail with virtualized list, adapters screen
- Frontend: SourcesGroup sidebar, SpaceGroup dispatch, spaceItemUtils
- Frontend: TopBar integration (path bar, search, sync, actions menu)
- Frontend: Tab title sync, adapter icon lookup hook
- Regenerate TypeScript types

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Summary
Adds the Archive system to Spacedrive v2 - a data archival engine for indexing external sources (emails, notes, messages, etc.) beyond the filesystem.
Key additions:
Architecture
Standalone Crate
Built as crates/archive/ (package: sd-archive) for better CI caching and reusability:
Core Integration
Integrates with v2 via library-scoped manager:
Storage Layout
Sources live alongside VDFS in library:
Adapters
Shipped adapters (11 total):
Adapter protocol:
adapters/directory
Features
Hybrid Search
Combines two search strategies via Reciprocal Rank Fusion:
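As a rough illustration of the fusion step, here is a minimal Reciprocal Rank Fusion sketch in Rust. It is not taken from the sd-archive crate: the function name `reciprocal_rank_fusion`, the use of plain string IDs, and the smoothing constant `k` (60 is a commonly used default) are all assumptions made for this example. Each input is a ranked list of record IDs, best first, and a record's fused score is the sum of `1 / (k + rank)` over every list it appears in.

```rust
use std::collections::HashMap;

/// Hypothetical helper (not the sd-archive API): fuse several ranked lists
/// of record IDs with Reciprocal Rank Fusion.
///
/// Each list is ordered best-first. A record's fused score is the sum of
/// 1 / (k + rank) over every list that contains it, with rank starting at 1.
pub fn reciprocal_rank_fusion(rankings: &[Vec<&str>], k: f64) -> Vec<(String, f64)> {
    let mut scores: HashMap<String, f64> = HashMap::new();

    for ranking in rankings {
        for (idx, id) in ranking.iter().enumerate() {
            let rank = (idx + 1) as f64; // 1-based rank within this list
            *scores.entry((*id).to_string()).or_insert(0.0) += 1.0 / (k + rank);
        }
    }

    let mut fused: Vec<(String, f64)> = scores.into_iter().collect();
    // Highest score first; ties are broken by ID so the output is deterministic.
    fused.sort_by(|a, b| {
        b.1.partial_cmp(&a.1)
            .unwrap()
            .then_with(|| a.0.cmp(&b.0))
    });
    fused
}
```

Calling `reciprocal_rank_fusion(&[vec!["a", "b", "c"], vec!["b", "d", "a"]], 60.0)` ranks `b` first, because it sits near the top of both lists, followed by `a`, `d`, and `c`. Because only ranks are used, the two strategies do not need comparable raw scores.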
Safety Screening
Every record passes through Prompt Guard 2 before becoming searchable:
Schema-Driven
Sources defined by TOML schemas, auto-generate:
Example schema:
License Change: AGPL → FSL
Changed from AGPL-3.0 to FSL-1.1-ALv2 (Functional Source License):
Why FSL:
Additional restrictions added:
Still permitted:
README Rewrite
Simplified and modernized the README:
New tagline: "One file manager for all your devices and clouds"
New opening:
Documentation
Design Doc
docs/core/design/archive.md (1,114 lines)
Complete implementation plan:
User Documentation
docs/archive/README.md (403 lines)
User-facing guide:
Crate Documentation
crates/archive/README.md (239 lines)
Developer reference:
Implementation Status
✅ Completed
Phase 0: Adapters
Phase 1: Standalone Crate
crates/archive/
Phase 2: Core Integration
- sd-archive dependency to core
- core/src/ops/sources/
Documentation:
🚧 Next Steps
Phase 2 (continued):
library_sources table
Phase 3: Jobs & Pipeline
Phase 4: Search
sources.search query
Phase 5: UI
Breaking Changes
License
Dependencies
- lancedb = "0.15" (vector search)
- fastembed = "4" (embeddings)
- ort, tokenizers, hf-hub (safety screening)
Testing
Archive Crate
Core Integration
cargo test -p spacedrive-core -- sources::
Adapters
Performance
Benchmarks (M2 Max, 10k Gmail messages):
Memory:
Migration Guide
For Users
No migration needed. Archive is a new feature. Existing VDFS data unaffected.
For Developers
New operations available:
New crate available:
Related
- docs/core/design/archive.md
- docs/archive/README.md
- crates/archive/README.md
- ~/Projects/spacedriveapp/spacedrive-archive-prototype
🤖 Generated with Claude Code