
Add restore flow: decrypt bundle, import vault, replace DB #618

Merged
lourou merged 43 commits into vault-backup-bundle from vault-restore-flow on Apr 9, 2026

Conversation


@lourou lourou commented Mar 17, 2026

Summary

  • RestoreManager actor orchestrates the full restore flow with progress tracking via RestoreState
  • decrypt backup bundle → import vault archive → save keys to keychain → replace GRDB database → import conversation archives
  • DatabaseManager.replaceDatabase(with:) closes pool, swaps file (+ WAL/SHM cleanup), reopens with migrations — no app restart needed
  • findAvailableBackup() scans iCloud Drive + local fallback for sidecar metadata.json, returns newest backup
  • per-conversation archive import failure is non-fatal (logged, other inboxes still restored)
  • ConvosRestoreArchiveImporter builds temporary XMTP client per inbox via Client.build() for import
  • revocation handled naturally by PR 2 on next inbox launch (installationId change detection)
  • 7 tests (full flow, DB replacement, partial failure, missing vault archive, restore detection, state progression)
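
At the file level, the swap inside replaceDatabase(with:) can be sketched like this (pure-Foundation sketch under assumed names; the pool close/reopen and migration rerun described above are elided):

```swift
import Foundation

// Sketch only: removes the live DB plus its WAL/SHM sidecars, then copies
// the backup into place. The real DatabaseManager also closes the pool
// first and reopens it with migrations afterward.
func swapDatabaseFile(liveDB: URL, backup: URL) throws {
    let fm = FileManager.default
    // SQLite in WAL mode keeps -wal and -shm sidecars next to the DB file;
    // stale sidecars from the old database would corrupt the restored one.
    for suffix in ["", "-wal", "-shm"] {
        let path = liveDB.path + suffix
        if fm.fileExists(atPath: path) {
            try fm.removeItem(atPath: path)
        }
    }
    try fm.copyItem(at: backup, to: liveDB)
}
```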

Restore flow

RestoreManager.restoreFromBackup(bundleURL:)
  ├── load vault key from VaultKeyStore (iCloud fallback)
  ├── decrypt + unpack bundle to staging dir
  ├── VaultServiceProtocol.importArchive() → [VaultKeyEntry]
  ├── save each key to local keychain
  ├── DatabaseManager.replaceDatabase() (close → swap → reopen + migrate)
  ├── import conversations/<inboxId>.encrypted per inbox
  └── cleanup staging dir
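
The RestoreState progression tracked through those steps might look like this (case names are assumptions derived from the phases above, not the actual ConvosCore definition):

```swift
// Hypothetical RestoreState: one case per phase of the restore flow above,
// with associated values where the UI needs detail (progress, failures).
enum RestoreState: Equatable {
    case idle
    case loadingVaultKey
    case decryptingBundle
    case importingVault
    case savingKeys
    case replacingDatabase
    case importingConversations(completed: Int, total: Int)
    case completed(failedKeyCount: Int)
    case failed(message: String)
}
```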

Restore detection

ICloudIdentityStore.hasICloudOnlyKeys()  → vault key in iCloud but not local
RestoreManager.findAvailableBackup()     → newest backup metadata from iCloud Drive
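
A minimal sketch of the newest-backup scan, assuming metadata.json sidecars live directly under a list of candidate directories (iCloud Drive first, then the local fallback); the function name and layout are assumptions:

```swift
import Foundation

// Scan each candidate directory for metadata.json sidecars and return the
// most recently modified one, or nil if no backup is found anywhere.
func findNewestBackupMetadata(in directories: [URL]) -> URL? {
    let fm = FileManager.default
    var newest: (url: URL, date: Date)?
    for dir in directories {
        let items = (try? fm.contentsOfDirectory(
            at: dir, includingPropertiesForKeys: nil)) ?? []
        for url in items where url.lastPathComponent == "metadata.json" {
            let attrs = try? fm.attributesOfItem(atPath: url.path)
            let date = attrs?[.modificationDate] as? Date ?? .distantPast
            if newest == nil || date > newest!.date {
                newest = (url, date)
            }
        }
    }
    return newest?.url
}
```

Chaining the directories in one pass is what fixes the bug noted in a later commit, where an available-but-empty iCloud container returned nil instead of falling through to the local directory.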

Why

PR 4 from docs/plans/icloud-backup-pr-briefings.md: restore from an encrypted backup bundle created by PR 3.

Stack

Note

Add full backup restore flow with vault key extraction, DB replacement, and inactive conversation mode

  • Introduces RestoreManager, which orchestrates decrypting a backup bundle, importing the vault archive to extract keys, replacing the local GRDB database, importing per-inbox conversation archives, and marking all conversations inactive post-restore.
  • Adds ConvosRestoreArchiveImporter and ConvosVaultArchiveImporter to handle per-inbox and vault archive imports respectively, skipping import when a local XMTP DB already exists.
  • Adds stale device detection: InboxStateMachine marks inboxes stale after auth and on foreground checks via isInstallationActive(); ConversationsViewModel filters stale-inbox conversations, disables start/join actions, and drives a full-stale auto-reset countdown UI.
  • Adds inactive conversation mode: restored conversations are marked isActive = false in ConversationLocalState and reactivated by StreamProcessor when a new message arrives; ConversationView shows an InactiveConversationBanner and blocks sends with an alert.
  • Migrates vault keychain items to synchronizable (iCloud Keychain) entries via KeychainIdentityStore.migrateToSynchronizableIfNeeded on startup.
  • Adds VaultManager.reCreate to tear down and re-bootstrap the vault with fresh keys after restore, and XMTPInstallationRevoker to revoke all other XMTP installations for an inbox.
  • DatabaseManager gains replaceDatabase(with:) which replaces the live pool in-place with rollback on failure.
  • Risk: replaceDatabase swaps the live database pool in-place; if both the replace and the rollback fail, a rollbackFailed error is thrown leaving the DB in an undefined state.

Macroscope summarized 9606433.

lourou changed the title from "Add restore flow: decrypt bundle, import vault, replace DB, import archives" to "Add restore flow: decrypt bundle, import vault, replace DB" on Mar 17, 2026

claude bot commented Mar 17, 2026

Claude finished @lourou's task.


Code Review Complete

Tasks

  • Read Macroscope review comments to understand known issues
  • Review core restore flow (RestoreManager, DatabaseManager)
  • Review stale device detection and UI changes
  • Review test coverage
  • Compile comprehensive feedback
  • Post final review

Overview

This is a substantial and well-architected PR implementing backup restore functionality with 4,552 additions across 74 files. The implementation demonstrates strong attention to detail in error handling, state management, and test coverage. Macroscope has already flagged the blocking issues, and the latest commit shows they have all been addressed.

Strengths

Architecture & Design

Excellent separation of concerns - RestoreManager orchestrates high-level flow while delegates handle specifics (ConvosVaultArchiveImporter, ConvosRestoreArchiveImporter)
Robust state machine - RestoreState enum provides clear visibility into restore progress
Safe database replacement - DatabaseManager.replaceDatabase(with:) implements rollback on failure with proper WAL/SHM cleanup
Graceful degradation - Per-conversation archive import failures are non-fatal, allowing partial recovery
Platform isolation - Vault archive import uses temporary directory to avoid polluting production XMTP databases

Error Handling & Recovery

Comprehensive error scenarios covered:

  • Missing vault archive aborts before destructive operations (RestoreManager.swift:261-264)
  • Database replacement rollback with migration retry (DatabaseManager.swift:65-80)
  • Per-conversation import failures logged but don't fail entire restore
  • Cleanup guaranteed via defer blocks in critical sections

Lifecycle coordination - RestoreLifecycleControlling protocol ensures sessions are properly stopped/resumed around restore

Stale Device Detection

Well-designed state model - StaleDeviceState (healthy/partialStale/fullStale) provides clear UX semantics
Correct filtering logic - recomputeVisibleConversations() filters from unfiltered source to allow recovery when inboxes return to healthy
Installation verification - Both at authorization and foreground check isInstallationActive() to detect revocation
Auto-reset with escape hatch - Full stale triggers countdown but user can cancel if needed
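
A minimal sketch of that state derivation, assuming it is computed from the set of stale inbox ids (the function name and shape are guesses, not the actual ConvosCore code):

```swift
// Hypothetical three-way stale-device state, mirroring the
// healthy/partialStale/fullStale semantics described above.
enum StaleDeviceState: Equatable {
    case healthy
    case partialStale(staleInboxIds: Set<String>)
    case fullStale
}

func deriveStaleState(allInboxIds: Set<String>,
                      staleInboxIds: Set<String>) -> StaleDeviceState {
    // Only inboxes we actually know about count toward staleness.
    let stale = staleInboxIds.intersection(allInboxIds)
    if stale.isEmpty { return .healthy }
    // Every known inbox is stale: trigger the full-stale auto-reset UI.
    if stale == allInboxIds { return .fullStale }
    return .partialStale(staleInboxIds: stale)
}
```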

Test Coverage

7 restore tests covering critical paths:

  • Full flow, DB replacement, partial failure, missing archive, state progression
  • DatabaseManagerRestoreTests verify captured readers remain valid post-restore

7 stale device tests covering state derivation edge cases:

  • Vault exclusion, unused conversation filtering, partial/full transitions

Issues & Recommendations

🟡 Medium Priority

1. ConversationsViewModel initialization race condition

Location: Convos/Conversations List/ConversationsViewModel.swift:247-264

Issue: Macroscope identified this correctly - when observe() subscribes to staleInboxIdsPublisher(), GRDB's ValueObservation emits immediately. This triggers recomputeVisibleConversations() which filters the empty unfilteredConversations and overwrites conversations with [], losing all initially loaded data.

Current code:

self.conversations = try conversationsRepository.fetchAll()  // line 248
// unfilteredConversations remains []

Fix:

let initial = try conversationsRepository.fetchAll()
self.unfilteredConversations = initial  // Seed the source-of-truth
self.conversations = initial

Impact: On launch, conversations list will briefly flash empty before being repopulated by the next publisher emit. This is a UX glitch, not data loss, but should be fixed.


2. explodeConversation() state management order

Location: Convos/Conversations List/ConversationsViewModel.swift:698-700

Suspected issue: the success path appeared to remove from hiddenConversationIds before removing from unfilteredConversations, which would cause a brief flash where the conversation reappears.

Current code:

self.unfilteredConversations.removeAll { $0.id == conversationId }
self.hiddenConversationIds.remove(conversationId)
self.recomputeVisibleConversations()

Expected order: The sequence is actually correct - remove from unfilteredConversations first, then clear the hide marker. The comment in the code at line 695-697 confirms this is intentional. This is not a bug.


🟢 Low Priority / Observations

3. Restore ordering: keychain before database

Location: ConvosCore/Sources/ConvosCore/Backup/RestoreManager.swift:89-102

The current flow:

  1. Wipe local XMTP state (line 93) - deletes keychain
  2. Save keys to keychain (line 97)
  3. Replace database (line 101)

Observation: If replaceDatabase() fails and throws, the rollback restores the original database, but the keychain has already been modified. This creates a state where the database references inbox IDs whose keychain entries were just replaced with backup keys.

Recommendation: Move replaceDatabase() before keychain operations:

Log.info("[Restore] replacing database")
try replaceDatabase(from: stagingDir)

Log.info("[Restore] wiping local XMTP state for clean restore")
await wipeLocalXMTPState()

Log.info("[Restore] saving keys to keychain")
let failedKeyCount = await saveKeysToKeychain(entries: keyEntries)

Rationale: Database replacement has a rollback mechanism; keychain operations don't. If DB replacement fails, we want the keychain to still match the rolled-back database.


4. Vault archive importer cleanup robustness

Location: ConvosCore/Sources/ConvosCore/Backup/ConvosVaultArchiveImporter.swift:70-71

Current code:

let entries = try await extractKeys(from: client)
try? client.dropLocalDatabaseConnection()
try? FileManager.default.removeItem(at: importDir)
return entries

Observation: If extractKeys() throws, cleanup never runs. The connection and temp directory leak.

Fix: Use defer:

defer {
    try? client.dropLocalDatabaseConnection()
    try? FileManager.default.removeItem(at: importDir)
}
let entries = try await extractKeys(from: client)
return entries

5. Test isolation

Location: ConvosCore/Tests/ConvosCoreTests/RestoreManagerTests.swift

Observation: Tests use TestFixtures() which creates temp databases, but some tests (lines 96, 144, 198, 243, 313, 366, 400) have commented cleanup calls:

try? await fixtures.cleanup()

The ? makes cleanup failures silent. If cleanup fails (e.g., file locks), subsequent test runs could see stale data.

Recommendation: Either make cleanup non-optional or add test-level teardown:

deinit {
    Task { try? await fixtures.cleanup() }
}

6. InboxStateMachine stale detection logging

Location: ConvosCore/Sources/ConvosCore/Inboxes/InboxStateMachine.swift:760-770

Observation: When an installation is detected as stale, the code emits a QA event with the inboxId:

QAEvent.emit(.inbox, "stale_detected", ["inboxId": staleInboxId])

Security consideration: Ensure QAEvent doesn't log inboxIds to external services in production builds. InboxIds are not inherently sensitive but could be used for correlation if combined with other data.


Code Quality

Positive Patterns

✅ Extensive use of defer for cleanup (except noted exception)
✅ Descriptive error types with localized descriptions
✅ Comprehensive logging with structured prefixes ([Restore], [Revoke], [Stale])
✅ Actor isolation for RestoreManager prevents data races
✅ Protocol-based design enables testing

Style & Conventions

✅ Follows project's SwiftLint/SwiftFormat rules
✅ No force unwrapping or implicitly unwrapped optionals
✅ Proper use of guard for early returns
✅ Clear naming (recomputeVisibleConversations, wipeLocalXMTPState)


Security

Encryption verified - Backup bundles use AES-256-GCM (from BackupBundle)
No plaintext secrets - Vault keys loaded from keychain, never logged
Path traversal protection - BackupBundle validates extracted paths (addressed in earlier Macroscope review)
Revocation handling - Stale installations properly blocked from sending


Performance

Minimal UI blocking - Restore runs in background with progress state
Efficient DB operations - Uses GRDB transactions, single-pass imports
GRDB ValueObservation - Reactive updates only when data changes
⚠️ Large restore consideration - Importing many conversation archives is serial; consider parallel import if scale demands
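
If scale ever demands it, the serial per-inbox loop could be parallelized with a task group along these lines (importInbox is a hypothetical stand-in for the per-inbox archive importer; failures stay non-fatal, mirroring the serial flow):

```swift
import Dispatch  // used only by callers that need to block; the function itself is pure Swift concurrency

// Run all per-inbox imports concurrently and collect the ids that failed,
// so the caller can log them without aborting the restore.
func importAll(
    inboxIds: [String],
    importInbox: @escaping @Sendable (String) async throws -> Void
) async -> [String] {
    await withTaskGroup(of: String?.self) { group in
        for id in inboxIds {
            group.addTask {
                do {
                    try await importInbox(id)
                    return nil
                } catch {
                    return id  // log-and-continue; other inboxes still restore
                }
            }
        }
        var failedIds: [String] = []
        for await failedId in group {
            if let failedId { failedIds.append(failedId) }
        }
        return failedIds
    }
}
```

Whether this is actually safe depends on whether Client.build() and the underlying XMTP store tolerate concurrent use across inboxes, so treat it as a sketch rather than a drop-in change.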


Summary

This is high-quality production code that demonstrates strong engineering practices. The Macroscope blocking issues have been resolved, and the remaining observations are minor improvements or low-risk edge cases.

Recommended Actions Before Merge:

  1. FIXED by latest commit - All Macroscope blocking issues resolved
  2. 🟡 Fix initialization race - Seed unfilteredConversations in ConversationsViewModel init (ConversationsViewModel.swift:252)
  3. 🟢 Consider reordering - Move replaceDatabase before keychain operations for better failure atomicity (RestoreManager.swift:89-102)
  4. 🟢 Add defer - Ensure cleanup in ConvosVaultArchiveImporter on exception path (ConvosVaultArchiveImporter.swift:69-71)

The PR is ready to merge after addressing the initialization race condition (#2). The other recommendations are enhancements that can be addressed in follow-up PRs if desired.

Excellent work on this complex feature! 🎉



lourou commented Mar 17, 2026


How to use the Graphite Merge Queue

Add the label merge-queue to this PR to add it to the merge queue.

You must have a Graphite account in order to use the merge queue. Sign up using this link.

An organization admin has enabled the Graphite Merge Queue in this repository.

Please do not merge from GitHub as this will restart CI on PRs being processed by the merge queue.

This stack of pull requests is managed by Graphite. Learn more about stacking.

Comment thread ConvosCore/Sources/ConvosCore/Storage/MockDatabaseManager.swift
Comment thread ConvosCore/Sources/ConvosCore/Backup/RestoreManager.swift
lourou force-pushed the vault-restore-flow branch from 55d5e16 to 9ca0d09 on March 17, 2026 at 15:48
Comment thread ConvosCore/Sources/ConvosCore/Storage/DatabaseManager.swift
Comment thread ConvosCore/Sources/ConvosCore/Backup/BackupBundle.swift Outdated
lourou force-pushed the vault-restore-flow branch from 68a292d to e50b6cb on March 17, 2026 at 18:46
Comment thread ConvosCore/Sources/ConvosCore/Vault/VaultManager.swift
Comment thread ConvosCore/Sources/ConvosCore/Storage/DatabaseManager.swift
Comment thread ConvosCore/Sources/ConvosCore/Backup/RestoreManager.swift
Comment thread ConvosCore/Sources/ConvosCore/Backup/RestoreManager.swift
lourou force-pushed the vault-restore-flow branch from 855f84b to 6b8705d on March 17, 2026 at 21:04
Comment thread ConvosCore/Sources/ConvosCore/Backup/RestoreManager.swift
Comment thread ConvosCore/Sources/ConvosCore/Backup/RestoreManager.swift
Comment thread Convos/Debug View/BackupDebugView.swift
lourou added 10 commits April 6, 2026 17:06
…chives

RestoreManager actor orchestrates the full restore:
  1. Load vault key from VaultKeyStore (iCloud fallback)
  2. Decrypt and unpack backup bundle
  3. Import vault archive via VaultServiceProtocol → [VaultKeyEntry]
  4. Save each key entry to local keychain
  5. Replace GRDB database (close pool → swap file → reopen + migrate)
  6. Import per-conversation XMTP archives (failure per-inbox is non-fatal)

RestoreState enum tracks progress through each phase for UI binding.

Restore detection:
  - findAvailableBackup() scans iCloud Drive + local fallback for
    sidecar metadata.json, returns newest backup across all devices
  - Combined with ICloudIdentityStore.hasICloudOnlyKeys() for full
    restore detection (PR 6 UI)

DatabaseManager changes:
  - dbPool changed from let to private(set) var
  - replaceDatabase(with:) closes pool, removes DB + WAL/SHM files,
    copies backup, reopens with migrations
  - DatabaseManagerProtocol gains replaceDatabase requirement
  - MockDatabaseManager implements via erase + backup-to for tests

ConvosRestoreArchiveImporter: concrete implementation builds
temporary XMTP client via Client.build() per inbox and imports

7 tests: full flow, DB replacement, partial failure, missing vault
archive, restore detection, nil detection, state progression
Validate backup opens before closing the active pool. Move the
current DB aside as a safety net. If copy or reopen fails, restore
the original database so the app is never left with no usable DB.
Stage backup into a temp queue first. Only erase the active
in-memory DB after confirming the backup is valid and copyable.
iCloud container available but with no backup was returning nil
instead of falling through to the local directory. Chain the
checks: try iCloud first, fall back to local if not found.
Add failedKeyCount to the completed state so the UI can warn
the user about conversations that may not be decryptable.
saveKeysToKeychain now returns the failure count instead of
throwing.
Shows available backup metadata (timestamp, device, inbox count)
and a destructive 'Restore from backup' button with confirmation
dialog. Restore section only visible when databaseManager is
provided (optional param). Reports failedKeyCount on completion.

Thread databaseManager through DebugExportView → DebugViewSection
→ BackupDebugView as an optional. Expose ConvosClient.databaseManager
as public for debug wiring.
Thread databaseManager from ConvosApp → ConversationsViewModel →
AppSettingsView → DebugExportView → DebugViewSection →
BackupDebugView so the restore button can call
DatabaseManager.replaceDatabase().

All params are optional so previews and mocks still compile
without it.
The path traversal check compared resolved paths using hasPrefix
with a hardcoded '/' separator, but on iOS devices directory URLs
created with isDirectory:true can have a trailing slash in their
resolved path. This caused 'resolvedDirPath + "/"' to become
'.../uuid//' which never matched, rejecting all files including
legitimate ones like database.sqlite.

Extract a resolvedPath() helper that strips trailing slashes after
resolving symlinks. Use it consistently in both tar and untar.
resolvingSymlinksInPath() behaves differently for existing
directories vs non-existent files on iOS: the directory resolves
/private/var → /var but the file keeps /private/var since it
doesn't exist on disk yet. This caused the prefix check to fail.

Build the file path from the already-resolved directory path
instead of resolving it independently. Use standardizedFileURL
(which handles .. without touching the filesystem) instead of
resolvingSymlinksInPath for the file component.
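
Sketched together, the fix from these two commits looks roughly like this (the resolvedPath() helper name comes from the commit message; the bodies are assumptions, not the actual BackupBundle code):

```swift
import Foundation

// Strip trailing slashes after resolving symlinks, so directory URLs
// created with isDirectory:true compare cleanly against file paths.
func resolvedPath(_ url: URL) -> String {
    var path = url.resolvingSymlinksInPath().path
    while path.count > 1 && path.hasSuffix("/") {
        path.removeLast()
    }
    return path
}

// Containment check: build the candidate from the already-resolved
// directory path, and use standardizedFileURL, which collapses ".."
// without touching the filesystem, so not-yet-extracted files behave
// the same as existing ones.
func isContained(fileName: String, in directory: URL) -> Bool {
    let dirPath = resolvedPath(directory)
    let candidate = URL(fileURLWithPath: dirPath)
        .appendingPathComponent(fileName)
        .standardizedFileURL.path
    return candidate == dirPath || candidate.hasPrefix(dirPath + "/")
}
```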
Restore had two independent failure modes in the backup -> restore flow.

First, deleting all app data called VaultManager.unpairSelf(), which cleaned up the local vault state by deleting the vault key through VaultKeyStore.delete(inboxId:). When the underlying store was ICloudIdentityStore, that removed both the local and iCloud keychain copies. Any existing backup bundle had been encrypted with the old vault database key, so a subsequent app relaunch generated a fresh vault identity and restore then failed with AES-GCM authenticationFailure because the app was attempting to decrypt the bundle with a different key.

Second, RestoreManager replaced the GRDB database while the app still held long-lived readers, writers, inbox services, vault activity, and background restore-adjacent tasks. DatabaseManager.replaceDatabase(with:) also recreated the DatabasePool instance entirely. That meant restore was vulnerable to crashing or destabilizing the app because live services and captured dbReader/dbWriter references could become stale during replacement.

This change addresses both problems.

Vault key handling
- add ICloudIdentityStore.deleteLocalOnly(inboxId:) so restore-related local cleanup can preserve the iCloud backup key
- add VaultKeyStore.deleteLocal(inboxId:) as the vault-facing API for that behavior
- keep iCloud key preservation scoped only to the local delete-all-data/unpair path
- keep remote self-removal strict by continuing to delete both local and iCloud copies when this device is removed from the vault by another device

Restore lifecycle coordination
- add RestoreLifecycleControlling to let RestoreManager explicitly quiesce and resume runtime state around destructive restore work
- wire BackupDebugView to pass the current session as the restore lifecycle controller
- make SessionManager conform by pausing the import drainer, sleeping inbox checker, inbox lifecycle, and vault activity before database replacement, then resuming restore-safe pieces and notifying GRDB observers afterward

Database replacement safety
- change DatabaseManager.replaceDatabase(with:) to preserve the existing DatabasePool instead of closing it and constructing a new pool
- copy the current database into an in-memory rollback queue before replacement
- restore the backup contents into the existing pool, then rerun migrations
- on failure, restore the rollback contents back into the same pool and rerun migrations
- add an explicit missing-file guard before attempting replacement

Tests
- add VaultKeyStore coverage proving deleteLocal preserves the iCloud vault copy and full delete still removes both copies
- add RestoreManager coverage proving restore lifecycle hooks are called around restore
- add DatabaseManager restore coverage proving captured readers remain valid after database replacement

Validation
- swift test --package-path ConvosCore --filter '(VaultKeyStoreTests|RestoreManagerTests|DatabaseManagerRestoreTests)'
- xcodebuild build -project Convos.xcodeproj -scheme 'Convos (Dev)' -destination 'id=00008110-001C51283AC2801E' -derivedDataPath .derivedData

Result
- backup decryption can continue to use the original vault key after delete-all-data / app relaunch
- destructive restore now pauses active app state before swapping database contents
- existing GRDB handles remain usable after restore instead of silently pointing at a retired pool
lourou added 5 commits April 6, 2026 17:08
Client.revokeInstallations is a static method that doesn't need
a live client — just the signing key, inboxId, and API config.
This works even when Client.create() fails due to the 10-install
limit, which is exactly when you need to revoke.

XMTPInstallationRevoker: static utility that fetches all
installations via Client.inboxStatesForInboxIds, filters out
the current one, and revokes the rest via the static method.
On a single device, vault key-sharing messages (DeviceKeyBundle)
only exist if multi-device pairing has happened. Without them,
the vault archive is empty of keys and restore can't recover
conversation identities.

Fix: call VaultManager.shareAllKeys() before creating the vault
archive. This broadcasts all conversation keys as a DeviceKeyBundle
message to the vault group, ensuring the vault archive always
contains every key needed for restore — regardless of whether
multi-device pairing has occurred.

This is the intended design: the vault is the single source of
truth for conversation keys, and backups should capture that
truth completely.
After 'delete all data', the vault may not be connected yet.
The backup should still succeed — it just won't include the
vault archive or keys. The conversation archives and GRDB
snapshot are still created.

On the next backup after the vault reconnects, the keys will
be broadcast and the vault archive will be included.
lourou force-pushed the vault-backup-bundle branch from 65b4b27 to 0ca7fef on April 6, 2026 at 22:22
lourou force-pushed the vault-restore-flow branch from 9bae672 to 7dded79 on April 6, 2026 at 22:22
Comment thread ConvosCore/Sources/ConvosCore/Auth/Keychain/VaultKeyStore.swift Outdated
Comment thread ConvosCore/Sources/ConvosCore/Storage/MockDatabaseManager.swift
…ility fixes (#626)

* Update copy: 'comes online' -> 'sends a message'

* Add inactive conversation mode (Phases 1 & 2)

Phase 1 — data model:
- Add isActive column to ConversationLocalState (default true)
- GRDB migration addIsActiveToConversationLocalState
- Surface isActive: Bool on Conversation model via local state join
- Add setActive(_:for:) and markAllConversationsInactive() to
  ConversationLocalStateWriter + protocol
- RestoreManager calls markAllConversationsInactive() after
  importing conversation archives

Phase 2 — reactivation detection:
- After each syncAllConversations call site in SyncingManager
  (initial start, resume, discovery), query for conversations
  where isActive == false
- For each, call conversation.isActive() on the XMTP SDK
  (local MLS state check, no network)
- If true, update GRDB via setActive(true, for:)
- GRDB ValueObservation propagates to UI reactively

Phases 3 & 4 (UI) are blocked on design sign-off.

* Remove local settings from tracked files

* Stop tracking .claude/settings.local.json (already gitignored)

* Fix infinite recursion in createArchive protocol bridge

The protocol method createArchive(path:encryptionKey:) was calling
itself instead of the SDK's createArchive(path:encryptionKey:opts:)
because the compiler resolved the overload back to the protocol.
Explicitly cast to XMTPiOS.Client, same pattern as importArchive.

* Fix infinite recursion in importArchive protocol bridge

The cast to XMTPiOS.Client didn't help because the SDK method has
the exact same signature as the protocol method — Swift resolved
it back to the protocol bridge. Call through ffiClient directly
to bypass the protocol dispatch entirely.

* Remove archive protocol bridge methods causing infinite recursion

XMTPiOS.Client satisfies both createArchive and importArchive
protocol requirements directly via its SDK implementations.
No explicit bridge needed, no recursion risk.

* Remove createArchive/importArchive from XMTPClientProvider protocol

These methods don't belong on the messaging protocol. All callers
in the backup/restore layer use XMTPiOS.Client directly, not the
protocol abstraction. ConvosBackupArchiveProvider.buildClient now
returns Client instead of any XMTPClientProvider.

No wrapper, no recursion, no redundant protocol surface.

* Add diagnostic logging for inactive conversation reactivation

- Remove try? in RestoreManager markAllConversationsInactive to
  surface any errors
- Add per-conversation debug logging in reactivation check to
  trace isActive() return values
- Helps diagnose whether importArchive causes isActive() to
  return true immediately (defeating the inactive flag)

* Reactivate conversations on inbound message, change join copy

1. Reactivation: move from isActive() polling in SyncingManager to
   message-driven reactivation in StreamProcessor. When an inbound
   message arrives for a conversation with isActive=false, flip it
   to active immediately. This avoids the issue where importArchive
   causes isActive() to return true on stale MLS state.

2. Copy: change 'You joined as <name>' to 'Reconnected' for the
   system message shown when the current user's installation is
   re-added to a group.

* Revert 'Reconnected' copy change

The summary in ConversationUpdate has no restore context — changing
it here would affect all self-join scenarios, not just post-restore.
The copy change belongs in Phase 4 (UI layer) where isActive is
available to distinguish reconnection from first join.

* Show 'Reconnected' for post-restore self-join messages

Thread isReconnection flag through the message pipeline:
- DBMessage.Update: add isReconnection (default false, backward
  compatible with existing JSON)
- StreamProcessor: when an inbound message arrives for an inactive
  conversation, set isReconnection=true on the stored GroupUpdated
  message before reactivating the conversation
- ConversationUpdate: surface isReconnection from DB hydration
- summary: return 'Reconnected' instead of 'You joined as X'
  when isReconnection is true

Normal first-time joins are unaffected.

* Fix isReconnection decoding for existing messages

Add custom Decodable init to DBMessage.Update that defaults
isReconnection to false when the key is missing. Existing
GroupUpdated messages in the DB were stored before this field
existed, causing keyNotFound decoding errors and blank message
history.
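
The backward-compatible decode described here follows a standard pattern; a sketch with assumed field names (only isReconnection comes from the commit message):

```swift
import Foundation

// Sketch of DBMessage.Update-style decoding: decodeIfPresent defaults the
// new field to false for rows stored before the field existed, avoiding
// keyNotFound errors on old GroupUpdated messages.
struct Update: Codable {
    let initiatedBy: String       // placeholder field standing in for the real payload
    let isReconnection: Bool

    enum CodingKeys: String, CodingKey { case initiatedBy, isReconnection }

    init(initiatedBy: String, isReconnection: Bool = false) {
        self.initiatedBy = initiatedBy
        self.isReconnection = isReconnection
    }

    init(from decoder: Decoder) throws {
        let c = try decoder.container(keyedBy: CodingKeys.self)
        initiatedBy = try c.decode(String.self, forKey: .initiatedBy)
        // Missing key (pre-migration JSON) decodes as false, not an error.
        isReconnection = try c.decodeIfPresent(Bool.self, forKey: .isReconnection) ?? false
    }
}
```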

* Document quickname backup as deferred for v1

* Implement inactive conversation UI (Phases 3 & 4)

Phase 3 — conversations list:
- Show 'Awaiting' label with dot separator in subtitle when
  conversation.isActive is false (same pattern as 'Verifying')
- Suppress unread/muted indicators for inactive conversations

Phase 4 — conversation detail:
- InactiveConversationBanner: pill overlay above composer with
  clock.badge.checkmark.fill icon, 'History restored' title,
  subtext explaining the state, tappable to learn.convos.org
- Intercept send, reaction, and reply actions when inactive —
  show 'Awaiting reconnection' alert instead of executing
- Disable top bar trailing item and text field when inactive
- All UI clears reactively when isActive flips to true via
  GRDB ValueObservation (no restart needed)

* Fix InactiveConversationBanner layout to match Figma design

* Update inactive banner: cloud.fill icon, 'Restored from backup'

- Change icon from clock.badge.checkmark.fill to cloud.fill
- Change title from 'History restored' to 'Restored from backup'
- Move banner from overlay into bottomBarContent so it participates
  in bottom bar height measurement — messages get proper inset and
  don't render behind the banner

* Use colorLava for cloud icon in inactive banner

* Fix reactivation for messages arriving via sync path

Messages can arrive through two paths:
1. Message stream → processMessage → markReconnectionIfNeeded ✓
2. Sync/discovery → processConversation → storeWithLatestMessages

Only path 1 had reactivation. Add reactivateIfNeeded after
storeWithLatestMessages in processConversation so the first
message (typically the GroupUpdated 'You joined') properly
clears the inactive state.

* Mark reconnection messages in sync path + re-adopt profiles after restore

1. reactivateIfNeeded now marks recent GroupUpdated messages as
   isReconnection=true before flipping isActive, so the 'You joined'
   message shows 'Reconnected' even when arriving via sync path.

2. ConversationWriter: when conversation is inactive (post-restore),
   force-apply member profiles from XMTP group metadata to GRDB,
   bypassing the 'existing profile has data' skip. This re-adopts
   display names and avatars from the group's custom metadata after
   restore, preventing 'Somebody' fallback.

* Show spinner only on active action during restore

Track which action is running via activeAction string. Spinner
only shows next to the active button. Hide the Actions section
entirely during restore so only the restore row with its spinner
is visible.

* Fix first message after restore not appearing in history

After restore, conversationWriter.store() calls conversation.sync()
which can fail on stale MLS state (new installation not yet re-added
to the group). This threw an error that prevented messageWriter.store
from ever running, so the first inbound message was lost.

Fix: if conversationWriter.store() fails, fall back to the existing
DBConversation from GRDB (which exists from the backup) and continue
storing the message. The conversation will be properly synced on the
next successful sync cycle.

Also refactor reactivateIfNeeded to extract markRecentUpdatesAsReconnection
for clarity.

* Try all vault keys on restore + detailed debug view

RestoreManager: try decrypting the backup bundle with every
available vault key (local + iCloud) instead of just the first.
This handles the case where device B created its own vault key
before iCloud sync propagated device A's key.

VaultKeySyncDebugView: add detailed sections showing:
- Each vault key with inboxId, clientId, and which stores
  (local/iCloud) it exists in, with green checkmark for active
- iCloud backup files with device name, date, size, inbox count

Makes it easy to diagnose vault key divergence between devices.

* Enable iCloud Keychain sync for vault keys

The iCloud vault keychain store was using
kSecAttrAccessibleAfterFirstUnlock but never set
kSecAttrSynchronizable — so items stayed local on each device
even under the same Apple ID.

Add synchronizable flag to KeychainIdentityStore and set it to
true for the iCloud vault store. This ensures vault keys
propagate between devices via iCloud Keychain, so device B can
decrypt backups created by device A.

Updated all iCloud vault store creation sites:
- ConvosClient+App.swift (production)
- BackupDebugView.swift
- VaultKeySyncDebugView.swift
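
At the Security-framework level, the fix amounts to adding one attribute to the save query. A sketch with an assumed service name (the real code goes through KeychainIdentityStore):

```swift
// Sketch of a synchronizable keychain save; service name is an assumption.
var query: [String: Any] = [
    kSecClass as String: kSecClassGenericPassword,
    kSecAttrService as String: "org.convos.vault",
    kSecAttrAccount as String: inboxId,
    kSecValueData as String: keyData,
    kSecAttrAccessible as String: kSecAttrAccessibleAfterFirstUnlock,
    // Without this flag, items never leave the device even under the same
    // Apple ID; true opts the item into iCloud Keychain sync.
    kSecAttrSynchronizable as String: true,
]
let status = SecItemAdd(query as CFDictionary, nil)
```

Note that queries must also pass `kSecAttrSynchronizable` (or `kSecAttrSynchronizableAny`) to match synchronizable items — which is why the later migration commit exists.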

* Add per-key actions in vault debug view

Each vault key now shows inline buttons:
- 'Delete local' — removes only the local copy for that inboxId
- 'Delete iCloud' — removes only the iCloud copy for that inboxId
- 'Adopt locally' — shown for iCloud-only keys, copies to local

This lets you surgically delete the wrong local vault key and
adopt the correct iCloud one without affecting other keys.

* Allow vault clientId update after restore or cross-device sync

After restore or iCloud key sync, the vault inboxId is the same
but the XMTP installation ID (clientId) is different because each
device gets its own installation. Previously this threw
clientIdMismatch and failed vault bootstrap.

Fix:
- InboxWriter: for vault inboxes, update clientId in GRDB instead
  of throwing when it changes
- VaultManager: re-save vault key to keychain with new clientId
  when it differs from the stored one (not just for vault-pending)
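
A sketch of the InboxWriter side of the fix; names are approximations of the real types:

```swift
// Sketch: same inboxId, new installation is expected after restore or
// cross-device key sync — adopt the new clientId for vault inboxes
// instead of throwing clientIdMismatch.
if existing.clientId != newClientId {
    if existing.isVault {
        try DBInbox.filter(key: existing.id)
            .updateAll(db, Column("clientId").set(to: newClientId))
    } else {
        throw InboxWriterError.clientIdMismatch
    }
}
```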

* Fix restore spinner not showing

runAction title was 'Restore' but actionLabel title was
'Restore from backup' — the mismatch meant activeAction
never matched and the spinner never appeared.

* Skip quickname onboarding for restored conversations

After restore, the conversation already has a profile from XMTP
group metadata. But UserDefaults (quickname state) aren't backed
up, so the onboarding coordinator thinks the user never set their
name and shows 'Add your name for this convo'.

Skip startOnboarding() entirely when conversation.isActive is
false — the profile is already restored from metadata.

* Update tests to include isActive and isReconnection parameters

MessagesListProcessorTests and related files need the new
parameters added in this branch:
- ConversationUpdate.isReconnection
- ConversationLocalState.isActive

* Fix delete all data leaving orphaned conversations (#628)

* Fix delete all data leaving orphaned conversations

InboxWriter.deleteAll() only deleted inbox rows — conversations,
messages, members, profiles, and all other tables were left behind.
No foreign key cascade from inbox to conversation exists, so the
data persisted and reappeared on next launch.

Now deletes all rows from every GRDB table in dependency order
within a single write transaction.

* Address code review: fix FK ordering, add test, add logging

- Fix foreign key ordering: conversation_members before invite
  (invite has FK referencing conversation_members)
- Add per-table error logging for debugging failed deletions
- Add testDeleteAllRemovesAllData: creates full data graph
  (inbox, conversation, member, profile, local state,
  conversation_members), calls deleteAll(), verifies all empty

* Address code review feedback on PR #626

Three fixes flagged by Macroscope and Claude review:

1. iCloud Keychain migration (critical)
   Add migrateToSynchronizableIfNeeded() that finds existing
   non-synchronizable items and re-saves them with
   kSecAttrSynchronizable: true. Without this, existing users
   would lose access to vault keys after enabling the
   synchronizable flag (queries with synchronizable:true don't
   match non-synchronizable items).
   Called at app launch from ConvosClient+App before the
   VaultKeyStore is created.

2. CancellationError propagation in StreamProcessor
   processMessage catch block for conversationWriter.store now
   rethrows CancellationError instead of falling through to the
   existing-DBConversation fallback. Prevents runaway work after
   task cancellation.

3. InboxWriter vault clientId update early-return
   When vault clientId changed, we were returning early and
   skipping the installationId update logic. Now we continue
   through to update installationId if the caller provided one.

* Add plan for vault re-creation on restore

After restore, the vault cannot be reactivated the same way
conversations can — the vault is the user's own group with no
third party to trigger MLS re-addition. This plan proposes
creating a fresh vault (new inboxId, new keys) on restore,
renaming the old vault DB as a read-only artifact, and
broadcasting the restored conversation keys to the new vault
so it becomes the source of truth for future device pairs.

* Add VaultManager.reCreate method for post-restore vault rebuild

Tears down the current vault and bootstraps a fresh one with new
keys. Used after restore to replace the inactive restored vault
with a working one.

Flow:
  1. Disconnect current vault client
  2. Delete old vault key from local + iCloud keychain
  3. Delete old DBInbox row for the vault
  4. Re-run bootstrapVault() which generates fresh keys and
     creates a new MLS group

The old vault XMTP database file is left on disk as an orphan
to avoid filesystem risk. A future cleanup pass can remove
orphaned vault DB files.
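
The four steps above, sketched as one method; error handling elided and method names assumed where the commit doesn't quote them:

```swift
// Sketch of reCreate; step order matters — keys and rows must be gone
// before bootstrapVault() runs, or it reuses the old identity.
func reCreate() async throws {
    let oldInboxId = currentVaultInboxId
    try await disconnectCurrentVaultClient()             // 1
    try await vaultKeyStore.delete(inboxId: oldInboxId)  // 2: local + iCloud
    try await inboxWriter.deleteVaultInbox(oldInboxId)   // 3: GRDB row
    try await bootstrapVault()                           // 4: fresh keys + new MLS group
    // The old vault XMTP DB file is intentionally left as an orphan on disk.
}
```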

* Wire vault re-creation into RestoreManager

After marking conversations inactive, call VaultManager.reCreate
to tear down the restored (read-only) vault and bootstrap a new
one with fresh keys. Then broadcast the restored conversation
keys to the new vault via shareAllKeys so future device pairs
can sync through the new vault.

VaultManager is optional on RestoreManager to keep existing tests
working without a live vault. The backup debug view injects it
from the session.

If vault re-creation fails, the restore still completes — the
user has working conversations, just no functional vault. They
can retry or manually re-create later.

* Add comprehensive logging for vault re-creation flow

Tag-prefixed logs across the full re-creation path to make it
easy to verify on device:

- [Vault.reCreate] === START === / === DONE === boundaries
- Before/after snapshot of inboxId, installationId, bootstrap
  state, keychain key count, and GRDB vault inbox row count
- Step-by-step 1/4 through 4/4 markers
- Verification log comparing old vs new inboxId (warns loudly
  if they match — indicates failed re-creation)
- [Vault.bootstrap] logs XMTP connect, identity transition,
  keychain persist, and GRDB save
- [Vault.loadOrCreateVaultIdentity] logs whether existing key
  was found or fresh keys generated
- [VaultKeyCoordinator.shareAllKeys] logs row count, packaged
  key count, missing identities, and send success
- [Restore.reCreateVault] mirrors the same pattern on the
  restore side

* Update plans: revoke old vault in reCreate + stale device detection

- vault-re-creation: add step 8a to revoke device A's installation
  from the old vault before disconnecting, mirroring the per-
  conversation revocation that already happens. Steps re-lettered
  8a through 8g.
- stale-device-detection: new plan for detecting when the current
  device has been revoked (via client.inboxState refresh from
  network) and showing a 'device replaced' banner with a delete-
  data CTA. Covers detection triggers, storage, and UI.

* Revoke old vault installations before re-creating

Step 1/5 of VaultManager.reCreate now calls
revokeAllOtherInstallations on the old vault before disconnecting,
mirroring the per-conversation revocation that already happens
on restored conversation inboxes.

This kills device A's zombie vault installation so the old vault
is fully decommissioned (not just orphaned). Combined with the
conversation-level revocations, device A is now completely
replaced after restore.

Failure is non-fatal — logged as a warning, re-creation
continues. Worst case, device A keeps a zombie vault installation,
but the new vault on device B still works.

* Add isStale column to DBInbox with migration

Foundation for stale device detection: a per-inbox flag
indicating the current device's installation has been revoked
(e.g., because the user restored on a different device).
Defaults to false, nullable migration for existing rows.
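
Assuming the project's SharedDatabaseMigrator wraps GRDB's `DatabaseMigrator`, the migration could look like:

```swift
// Sketch of the isStale migration; migration name is an assumption.
migrator.registerMigration("addIsStaleToInbox") { db in
    try db.alter(table: "inbox") { t in
        // Nullable so existing rows migrate without a table rewrite;
        // hydration treats NULL as false.
        t.add(column: "isStale", .boolean)
    }
}
```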

* Add stale installation detection in InboxStateMachine

After inbox reaches ready state, call client.isInstallationActive
which queries XMTP inboxState(refreshFromNetwork: true) and checks
if our installationId is still in the active installations list.

If not → device was revoked (typically because user restored on
another device) → mark the inbox as stale in GRDB, emit QA event.

New XMTPClientProvider method isInstallationActive() wraps the
SDK call and returns Bool for clarity.

Added 'inbox' category to QAEvent for the stale_detected event.
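
A sketch of the provider method; `inboxState(refreshFromNetwork:)` is the XMTP SDK call named above, while the installation-list shape is an assumption:

```swift
// Sketch: the device is "active" iff its installationId still appears
// in the inbox's network-refreshed installation list.
func isInstallationActive() async throws -> Bool {
    let state = try await client.inboxState(refreshFromNetwork: true)
    return state.installations.contains { $0.id == client.installationId }
}
```

A `false` return means another device revoked this installation (typically via restore), which is what flips `isStale` in GRDB.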

* Also check stale installation on app foreground

When the app returns to foreground, inboxes might have been
revoked on another device (e.g., user restored on device B
while device A was backgrounded). Re-run isInstallationActive
check in handleEnterForeground to catch this on return.

* Surface isStale on Inbox model + reactive publisher

- Add isStale to Inbox domain model and DBInbox hydration
- Add InboxesRepository.anyInboxStalePublisher that emits true
  when any non-vault inbox has isStale = true
- Subscribe in ConversationsViewModel, expose isDeviceStale
  property for the view layer to bind to

* Add StaleDeviceBanner UI + hide stale conversations

- New StaleDeviceBanner view: exclamation icon, title, explanation,
  destructive 'Delete data and start fresh' button (wires into
  existing deleteAllData flow)
- Banner inserted above conversations list via safeAreaInset(edge:
  .top) when viewModel.isDeviceStale is true
- ConversationsViewModel: subscribe to staleInboxIdsPublisher and
  filter out any conversations belonging to stale inboxes, both
  on initial load and on reactive updates from GRDB
- Inline the isInstallationActive check in handleStart /
  handleEnterForeground to avoid Sendable issues passing
  any XMTPClientProvider across actor boundaries

* Fix synchronizable migration data loss risk

Macroscope flagged the migration as losing data if SecItemAdd
fails after the original item is deleted. Reverse the order:
add the synchronizable copy first, then delete the
non-synchronizable original — matching the pattern in
migrateItemToPlainAccessibility. This ensures the data is
never left unrecoverable.

If add fails: original is intact, retry next launch.
If delete fails after successful add: synchronizable copy is
in place, retry next launch (idempotent — the second add will
succeed with errSecDuplicateItem).
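
The reversed ordering, sketched end to end; `copyNonSynchronizableItem`, `baseQuery`, and `KeychainError` are hypothetical helpers around the raw Security calls:

```swift
// Sketch of add-first-then-delete; helper names are assumptions.
func migrateToSynchronizableIfNeeded(account: String) throws {
    guard let data = copyNonSynchronizableItem(account: account) else { return }

    // 1. Add the synchronizable copy FIRST. If this fails, the original
    //    is untouched and we simply retry on the next launch.
    var add = baseQuery(account: account)
    add[kSecAttrSynchronizable as String] = true
    add[kSecValueData as String] = data
    let status = SecItemAdd(add as CFDictionary, nil)
    guard status == errSecSuccess || status == errSecDuplicateItem else {
        throw KeychainError.addFailed(status)
    }

    // 2. Only then delete the non-synchronizable original. A failure
    //    here is retried next launch; the data is already safe.
    var del = baseQuery(account: account)
    del[kSecAttrSynchronizable as String] = false
    _ = SecItemDelete(del as CFDictionary)
}
```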

* Fail VaultManager.reCreate when old key deletion fails or inboxId matches

Two related fixes flagged by Macroscope:

1. If vaultKeyStore.delete(inboxId:) throws during step 3/5,
   we previously logged a warning and continued. The next
   bootstrap then found the old identity and reused it,
   producing the same inboxId — meaning no new vault was
   actually created. Now throws VaultReCreateError.bootstrapFailed.

2. If after bootstrap the new inboxId matches the old one
   (because deletion didn't actually remove the key), throw
   instead of just logging a warning. The function previously
   returned successfully despite re-creation having failed.

Both errors propagate up to RestoreManager.reCreateVault which
already catches and logs without aborting the restore — the
user can still see their conversations, just with no functional
vault, and can retry.

* decryptBundle: don't let staging dir reset failure escape the loop

If FileManager.createDirectory throws (e.g. disk full) between
key attempts, the error escaped the for loop and we surfaced
that confusing error instead of RestoreError.decryptionFailed.
Catch and log so the loop continues to the next key.

* shareAllKeys: only mark successfully shared inboxes

When identityStore.identity(for:) failed for an inbox, it was
skipped from the keys array but still marked sharedToVault=1
in the database. This caused inboxes whose keys were never
actually shared to be flagged as shared, preventing future
sync attempts.

Now we mark only the inboxIds that ended up in the keys array.
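
A sketch of the corrected bookkeeping; `PackagedKey`, `package`, `send`, and `markShared` are illustrative names:

```swift
// Sketch: collect keys first, then flag only the inboxes that made it in.
var keys: [PackagedKey] = []
var sharedInboxIds: [String] = []
for inboxId in candidateInboxIds {
    guard let identity = try? await identityStore.identity(for: inboxId) else {
        continue  // key unavailable: leave sharedToVault = 0 so we retry later
    }
    keys.append(package(identity))
    sharedInboxIds.append(inboxId)
}
try await send(keys)
try await markShared(inboxIds: sharedInboxIds)  // NOT candidateInboxIds
```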

* Clear conversation selection when its inbox becomes stale

When staleInboxIdsPublisher fires and removes conversations from
the visible list, clear _selectedConversationId if the previously
selected conversation is no longer present. Otherwise the detail
view shows a conversation that no longer appears in the list.

Also fix the existing conversationsPublisher selection check to
inspect the filtered list (self.conversations) rather than the
unfiltered parameter, so stale selections are properly cleared
on either trigger.

* Fix vault re-create orphan rows + add inactive mode tests

Macroscope flagged a real bug at VaultManager:317: when oldInboxId
is nil, step 3 deletes all vault keys from the keychain but step 4
skipped DBInbox cleanup. The next bootstrapVault would fail with
InboxWriterError.duplicateVault. Now step 4 deletes all vault rows
when oldInboxId is nil, matching the keychain cleanup behavior.
Also escalated step 4 errors to throw VaultReCreateError.

Test coverage for previously-untested inactive mode logic:
- ConversationLocalStateWriterTests (5 tests):
  setActive toggle, scoped to one conversation, throws on
  unknown conversation, markAllConversationsInactive bulk
  flip, no-op on empty database
- InboxWriterTests adds 2 tests for markStale (toggle + scoping)
- DBMessageUpdateDecodingTests (4 tests):
  legacy JSON without isReconnection defaults to false,
  decode with isReconnection=true preserves value, encode
  round-trip, memberwise init default

46/46 tests pass in the affected suites.

* Refine stale-device UX and update recovery policy plan

* Add StaleDeviceState enum + reactive publisher

Derives healthy/partialStale/fullStale from the set of used
non-vault inboxes and how many are flagged isStale=true:
- healthy: no stale inboxes (or no used inboxes)
- partialStale: some but not all used inboxes are stale
- fullStale: every used inbox is stale (device effectively dead)

Drives the partial-vs-full UX policy described in
docs/plans/stale-device-detection.md. Existing
anyInboxStalePublisher and staleInboxIdsPublisher are kept
for backward compatibility.
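
The derivation can be sketched as a pure function; `hasConversations` (the "used" predicate) and `isVault` are assumed model properties:

```swift
// Sketch of the healthy / partialStale / fullStale derivation.
enum StaleDeviceState { case healthy, partialStale, fullStale }

func deriveStaleState(inboxes: [Inbox]) -> StaleDeviceState {
    // "Used" = non-vault inbox with at least one conversation.
    let used = inboxes.filter { !$0.isVault && $0.hasConversations }
    let stale = used.filter(\.isStale)
    if stale.isEmpty { return .healthy }   // also covers "no used inboxes"
    return stale.count == used.count ? .fullStale : .partialStale
}
```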

* ConversationsViewModel: derive partial vs full stale state

Replace single isDeviceStale boolean with StaleDeviceState enum
driving:
- isDeviceStale: any inbox stale (banner visibility)
- isFullStale: every inbox stale
- canStartOrJoinConversations: false only when fullStale (was
  blocking new convos in any stale state previously — now allows
  them in healthy inboxes during partial stale)

Add handleStaleStateTransition that logs every transition and
schedules an auto-reset on healthy/partial → fullStale via
isPendingFullStaleAutoReset. The view layer binds to this flag
to render a cancellable countdown UI.

Add cancelFullStaleAutoReset / confirmFullStaleAutoReset for the
UI to call from the countdown.

* Stale device UI: partial vs full copy, countdown auto-reset

StaleDeviceBanner now takes a StaleDeviceState and renders
distinct copy + title for partial vs full:
- Partial: 'Some conversations moved to another device' / 'Some
  inboxes have been restored on another device. Resetting will
  clear local data...'
- Full: 'This device has been replaced' / 'Your account has been
  restored on another device. Resetting will clear local data...'

Action button renamed from 'Delete data and start fresh' to
'Reset device' so the destructive intent is in the verb itself
(per plan: 'Continue' was misleading).

StaleDeviceInfoView (Learn more sheet) now also varies copy by
state with two distinct title + paragraph sets.

New FullStaleAutoResetCountdown component shown via
fullScreenCover when ConversationsViewModel.isPendingFullStaleAutoReset
fires. 5-second countdown with Cancel and 'Reset now' actions.
The countdown task auto-fires onReset when remaining hits 0.

Wired into ConversationsView so when staleDeviceState transitions
to fullStale, the cover automatically presents.

* Add tests for StaleDeviceState derivation

8 tests covering the partial-vs-full state matrix:
- healthy when no inboxes
- healthy when inbox has no conversations
- healthy when used inbox is not stale
- partialStale when one of two used inboxes is stale
- fullStale when every used inbox is stale
- vault inbox is excluded from the count
- unused conversations don't make an inbox 'used'
- StaleDeviceState convenience flags (hasUsableInboxes,
  hasAnyStaleInboxes)

54/54 tests pass in affected suites.

* Vault bootstrap: log when installation is stale (diagnostic only)

After bootstrap, query inboxState(refreshFromNetwork: true) and
log a warning + QAEvent if our installationId isn't in the active
list. This means another device restored from backup and revoked
this vault installation.

The conversation-level stale detection (InboxStateMachine) is
what drives user-facing recovery — when a device is fully
replaced, all conversation inboxes go stale and the auto-reset
fires before vault state matters in practice.

This log is purely for diagnosing rare partial-revocation states
during QA. We don't add user-facing UI for vault staleness
because:
- in full-stale, conversation-level reset wipes the device anyway
- in partial-stale, the user isn't trying to use vault-dependent
  features in normal flows
- the only user-visible impact is 'pair new device' silently
  failing, which is rare and easily explained

* Don't dismiss compose flow on partial stale inbox updates

The staleInboxIdsPublisher sink was unconditionally nilling
newConversationViewModel whenever isDeviceStale was true, which
returns true for both partialStale and fullStale states. This
contradicted handleStaleStateTransition which explicitly keeps
the compose flow alive during partial stale so the user can
finish creating a conversation in a still-healthy inbox.

Remove the redundant check entirely — handleStaleStateTransition
is the single source of truth for closing in-flight flows on
transition to fullStale. The sink now only handles visible-list
filtering and selection clearing.
lourou added 6 commits April 8, 2026 15:47
Two low-severity findings on the restore flow:

1. VaultKeyStore.deleteLocal: the non-iCloud fallback called
   store.delete(inboxId:) which permanently deletes the key,
   contradicting the documented purpose of 'preserving the
   iCloud copy'. In production we always use ICloudIdentityStore
   so this only mattered in tests/mocks, but the contract was
   wrong. Now it's a no-op with a warning when the underlying
   store isn't iCloud-backed; tests that need to verify deletion
   should call delete(inboxId:) directly.

2. MockDatabaseManager.replaceDatabase didn't run migrations
   after restoring the database, unlike DatabaseManager which
   calls SharedDatabaseMigrator.shared.migrate after restore.
   This caused tests to pass while production could fail on
   schema mismatches. Now runs the migrator post-restore to
   match production behavior.
Macroscope flagged that the existing standardizedFileURL check
doesn't resolve symlinks, but fileData.write(to:) follows them.
A pre-existing symlink under the staging dir could escape the
first-pass check.

Fix: after creating the parent directory, re-validate with
.resolvingSymlinksInPath() on the parent. If the resolved path
is no longer inside the staging dir (because a symlink redirected
it), throw 'path traversal via symlink'.
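
The second-pass check could be sketched like this, with `stagingDir` as the unpack root and the error case an assumed name:

```swift
// Sketch of re-validating the parent after directory creation, since a
// pre-existing symlink passes the standardizedFileURL check but
// fileData.write(to:) would follow it at write time.
let parent = fileURL.deletingLastPathComponent()
try FileManager.default.createDirectory(at: parent, withIntermediateDirectories: true)
let resolvedParent = parent.resolvingSymlinksInPath().path
let resolvedRoot = stagingDir.resolvingSymlinksInPath().path
guard resolvedParent == resolvedRoot
        || resolvedParent.hasPrefix(resolvedRoot + "/") else {
    throw RestoreError.pathTraversal("path traversal via symlink")
}
try fileData.write(to: fileURL)
```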
Macroscope flagged that importVaultArchive returned an empty
array when the vault archive was missing from the bundle, but
restoreFromBackup would still continue to wipe local XMTP state
and replace the database — silent data loss with no keys to
decrypt the resulting conversations.

The RestoreError.missingVaultArchive case was already defined
but never thrown. Throw it now before any destructive operation.
wipeLocalXMTPState was logging 'cleared conversation keychain
identities' unconditionally after try? await identityStore.deleteAll().
The success message appeared even when deleteAll threw and no
identities were actually cleared, which hid restore failures during
debugging.

Convert to do/catch and log a warning on failure (non-fatal —
restore continues either way).
Macroscope flagged that if the restore failed AND the rollback
also failed (e.g. migration threw on the rolled-back snapshot),
the catch block swallowed the rollback error and re-threw only
the original. Caller had no way to know the DB was now in a
potentially-inconsistent state.

Add DatabaseManagerError.rollbackFailed wrapping both the
original and rollback errors. Throw it instead of the original
when the rollback path itself throws — caller can distinguish
'restore failed cleanly' from 'restore failed and we couldn't
recover'.
The 'missing vault archive is non-fatal' test was asserting the
old behavior. Updated to assert the restore aborts before any
destructive operation runs and the conversation archive importer
is never invoked.
@lourou lourou marked this pull request as ready for review April 8, 2026 20:59
macroscopeapp bot commented Apr 8, 2026

Approvability

Verdict: Needs human review

Diff is too large for automated approval analysis. A human reviewer should evaluate this PR.


lourou added 4 commits April 8, 2026 16:37
Macroscope and Claude flagged that adding isStale as a non-optional
stored property broke Codable backwards compatibility — Swift's
synthesized Decodable ignores default init parameter values, so any
JSON payload missing 'isStale' would fail to decode with keyNotFound.

Inbox is currently only hydrated from GRDB columns (not JSON), so
there's no active bug, but the Codable conformance is a footgun for
any future JSON path (cache, push payload, etc).

Add a custom init(from:) that uses decodeIfPresent for isStale with
a 'false' fallback. No API change.
Macroscope and Claude flagged the delete-then-save-with-try? pattern
when the vault identity is being persisted/updated. If the delete
succeeded but the save failed (keychain access denied, etc), the
vault key was permanently lost — the next bootstrap would generate
a new inboxId and InboxWriter.save would throw duplicateVault
because the old GRDB row still existed.

Fix: save the new/updated entry first (which will throw on failure
and abort the bootstrap with the old key still intact), then delete
the old entry only if save succeeded. Delete failures are now
non-fatal because the vault is already recoverable from the new
entry — they just leave an orphan keychain item that gets cleaned
up on the next cycle.

Mirrors the same add-first-then-delete pattern we use for the
iCloud Keychain synchronizable migration.
Macroscope and Claude flagged a real UX bug: when an inbox recovers
from stale back to healthy, its conversations remained permanently
hidden. The staleInboxIdsPublisher sink was filtering
self.conversations in-place with
  self.conversations = self.conversations.filter { ... }
which destroyed the source list. When a smaller set of stale IDs
later emitted, the filter ran against an already-filtered list and
the recovered conversations never reappeared. conversationsPublisher
observes a different table and only re-emits on DB change, so
nothing triggered re-hydration.

Fix: cache the publisher's output in unfilteredConversations and
derive the visible list via recomputeVisibleConversations(), called
from both the stale-ids sink and the conversations sink. Also call
it after hiddenConversationIds.remove so post-explosion rollback
behaves correctly.
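
The derived-list pattern in sketch form; property names follow the commit message:

```swift
// Sketch: always derive the visible list from the full source list so
// conversations reappear when an inbox recovers from stale.
private var unfilteredConversations: [Conversation] = []
private var staleInboxIds: Set<String> = []
private var hiddenConversationIds: Set<String> = []

private func recomputeVisibleConversations() {
    conversations = unfilteredConversations.filter {
        !staleInboxIds.contains($0.inboxId)
            && !hiddenConversationIds.contains($0.id)
    }
}
```

Both the stale-ids sink and the conversations sink write their input to `unfilteredConversations` (or `staleInboxIds`) and then call this, rather than filtering `conversations` in place.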
Macroscope flagged that 'throw CancellationError()' inside the inner
do/catch for conversationWriter.store was immediately caught by the
outer generic 'catch {}' and logged as an error rather than
propagated — defeating cooperative cancellation.

Since processMessage is declared 'async' (not 'async throws') and
changing the signature would ripple across the StreamProcessor
protocol and six call sites, handle it locally: add a dedicated
'catch is CancellationError' arm to both do/catch blocks that logs
at debug level and returns early.

The enclosing runMessageStream loop calls Task.checkCancellation()
at the top of every iteration, so cancellation still exits the
stream task — just one message later than ideal, which is
acceptable for cooperative cancellation semantics.
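
The local handling, sketched; surrounding names are approximations:

```swift
// Sketch of the dedicated arm; processMessage stays non-throwing.
do {
    try await conversationWriter.store(conversation)
} catch is CancellationError {
    logger.debug("processMessage cancelled; skipping fallback")
    return  // runMessageStream's Task.checkCancellation() exits next iteration
} catch {
    // existing fallback to the restored DBConversation goes here
}
```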
…ordering

Two follow-up bugs from the unfilteredConversations refactor that
Macroscope caught:

1. (Medium) init populated self.conversations from fetchAll() but
   left unfilteredConversations empty. When observe() subscribed to
   staleInboxIdsPublisher, GRDB's ValueObservation fired immediately,
   triggering recomputeVisibleConversations against an empty source
   and wiping the initial list. Seed both lists from the fetch.

2. (Low) explodeConversation was mutating self.conversations in
   place with removeAll, bypassing the unfiltered cache, and on
   success the conversation would briefly resurface because
   recomputeVisibleConversations filtered an unfiltered list that
   still contained it.

   Rework to the optimistic-hide pattern: insert into
   hiddenConversationIds and recompute (hides immediately), then on
   success drop from unfilteredConversations BEFORE clearing the
   hide marker (so the recompute doesn't resurface it), or on
   failure just clear the hide marker to bring it back.
@lourou lourou merged commit f6c930f into vault-backup-bundle Apr 9, 2026
7 of 10 checks passed
@lourou lourou deleted the vault-restore-flow branch April 9, 2026 16:28
lourou added a commit that referenced this pull request Apr 13, 2026
* Add restore flow: decrypt bundle, import vault, replace DB, import archives

RestoreManager actor orchestrates the full restore:
  1. Load vault key from VaultKeyStore (iCloud fallback)
  2. Decrypt and unpack backup bundle
  3. Import vault archive via VaultServiceProtocol → [VaultKeyEntry]
  4. Save each key entry to local keychain
  5. Replace GRDB database (close pool → swap file → reopen + migrate)
  6. Import per-conversation XMTP archives (failure per-inbox is non-fatal)

RestoreState enum tracks progress through each phase for UI binding.

Restore detection:
  - findAvailableBackup() scans iCloud Drive + local fallback for
    sidecar metadata.json, returns newest backup across all devices
  - Combined with ICloudIdentityStore.hasICloudOnlyKeys() for full
    restore detection (PR 6 UI)

DatabaseManager changes:
  - dbPool changed from let to private(set) var
  - replaceDatabase(with:) closes pool, removes DB + WAL/SHM files,
    copies backup, reopens with migrations
  - DatabaseManagerProtocol gains replaceDatabase requirement
  - MockDatabaseManager implements via erase + backup-to for tests

ConvosRestoreArchiveImporter: concrete implementation builds
temporary XMTP client via Client.build() per inbox and imports

7 tests: full flow, DB replacement, partial failure, missing vault
archive, restore detection, nil detection, state progression

* Make DatabaseManager.replaceDatabase safe with rollback

Validate backup opens before closing the active pool. Move the
current DB aside as a safety net. If copy or reopen fails, restore
the original database so the app is never left with no usable DB.

* Validate backup before erasing MockDatabaseManager

Stage backup into a temp queue first. Only erase the active
in-memory DB after confirming the backup is valid and copyable.

* Fix findAvailableBackup to check local when iCloud is empty

iCloud container available but with no backup was returning nil
instead of falling through to the local directory. Chain the
checks: try iCloud first, fall back to local if not found.

* Track failed key saves in RestoreState.completed

Add failedKeyCount to the completed state so the UI can warn
the user about conversations that may not be decryptable.
saveKeysToKeychain now returns the failure count instead of
throwing.

* Add restore button to backup debug panel

Shows available backup metadata (timestamp, device, inbox count)
and a destructive 'Restore from backup' button with confirmation
dialog. Restore section only visible when databaseManager is
provided (optional param). Reports failedKeyCount on completion.

Thread databaseManager through DebugExportView → DebugViewSection
→ BackupDebugView as an optional. Expose ConvosClient.databaseManager
as public for debug wiring.

* Wire databaseManager through to backup debug panel for restore

Thread databaseManager from ConvosApp → ConversationsViewModel →
AppSettingsView → DebugExportView → DebugViewSection →
BackupDebugView so the restore button can call
DatabaseManager.replaceDatabase().

All params are optional so previews and mocks still compile
without it.

* Fix false path traversal rejection on iOS devices

The path traversal check compared resolved paths using hasPrefix
with a hardcoded '/' separator, but on iOS devices directory URLs
created with isDirectory:true can have a trailing slash in their
resolved path. This caused 'resolvedDirPath + "/"' to become
'.../uuid//' which never matched, rejecting all files including
legitimate ones like database.sqlite.

Extract a resolvedPath() helper that strips trailing slashes after
resolving symlinks. Use it consistently in both tar and untar.
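
A sketch of the helper; `resolvingSymlinksInPath()` and `path` are real Foundation APIs, the surrounding guard is illustrative:

```swift
// Sketch: normalize a URL to a slash-free resolved path for prefix checks.
func resolvedPath(_ url: URL) -> String {
    var path = url.resolvingSymlinksInPath().path
    // Directory URLs created with isDirectory: true can resolve with a
    // trailing slash on device; strip it so "dir + \"/\"" compares cleanly.
    while path.count > 1 && path.hasSuffix("/") {
        path.removeLast()
    }
    return path
}

let dir = resolvedPath(stagingDir)
let file = resolvedPath(entryURL)
guard file == dir || file.hasPrefix(dir + "/") else {
    throw ArchiveError.pathTraversal  // assumed error type
}
```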

* Fix untar path resolution inconsistency on iOS devices

resolvingSymlinksInPath() behaves differently for existing
directories vs non-existent files on iOS: the directory resolves
/private/var → /var but the file keeps /private/var since it
doesn't exist on disk yet. This caused the prefix check to fail.

Build the file path from the already-resolved directory path
instead of resolving it independently. Use standardizedFileURL
(which handles .. without touching the filesystem) instead of
resolvingSymlinksInPath for the file component.

* Fix vault backup restore after local wipe and stabilize DB restore

Restore had two independent failure modes in the backup -> restore flow.

First, deleting all app data called VaultManager.unpairSelf(), which cleaned up the local vault state by deleting the vault key through VaultKeyStore.delete(inboxId:). When the underlying store was ICloudIdentityStore, that removed both the local and iCloud keychain copies. Any existing backup bundle had been encrypted with the old vault database key, so a subsequent app relaunch generated a fresh vault identity and restore then failed with AES-GCM authenticationFailure because the app was attempting to decrypt the bundle with a different key.

Second, RestoreManager replaced the GRDB database while the app still held long-lived readers, writers, inbox services, vault activity, and background restore-adjacent tasks. DatabaseManager.replaceDatabase(with:) also recreated the DatabasePool instance entirely. That meant restore was vulnerable to crashing or destabilizing the app because live services and captured dbReader/dbWriter references could become stale during replacement.

This change addresses both problems.

Vault key handling
- add ICloudIdentityStore.deleteLocalOnly(inboxId:) so restore-related local cleanup can preserve the iCloud backup key
- add VaultKeyStore.deleteLocal(inboxId:) as the vault-facing API for that behavior
- keep iCloud key preservation scoped only to the local delete-all-data/unpair path
- keep remote self-removal strict by continuing to delete both local and iCloud copies when this device is removed from the vault by another device

Restore lifecycle coordination
- add RestoreLifecycleControlling to let RestoreManager explicitly quiesce and resume runtime state around destructive restore work
- wire BackupDebugView to pass the current session as the restore lifecycle controller
- make SessionManager conform by pausing the import drainer, sleeping inbox checker, inbox lifecycle, and vault activity before database replacement, then resuming restore-safe pieces and notifying GRDB observers afterward

Database replacement safety
- change DatabaseManager.replaceDatabase(with:) to preserve the existing DatabasePool instead of closing it and constructing a new pool
- copy the current database into an in-memory rollback queue before replacement
- restore the backup contents into the existing pool, then rerun migrations
- on failure, restore the rollback contents back into the same pool and rerun migrations
- add an explicit missing-file guard before attempting replacement
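The bullets above can be sketched against GRDB's backup(to:) and DatabaseMigrator APIs; the function shape and names here are illustrative, not the real DatabaseManager signature.

```swift
import Foundation
import GRDB

// Sketch: replace database contents in place, keeping the existing pool so
// captured dbReader/dbWriter references stay valid.
func replaceDatabase(pool: DatabasePool, backupPath: String, migrator: DatabaseMigrator) throws {
    // Explicit missing-file guard before any destructive work.
    guard FileManager.default.fileExists(atPath: backupPath) else {
        throw CocoaError(.fileNoSuchFile)
    }
    // Copy the current database into an in-memory rollback queue.
    let rollback = try DatabaseQueue()
    try pool.backup(to: rollback)
    do {
        // Restore the backup contents into the *existing* pool, then rerun
        // migrations so the schema reopens at the current app version.
        let source = try DatabaseQueue(path: backupPath)
        try source.backup(to: pool)
        try migrator.migrate(pool)
    } catch {
        // On failure, restore the rollback contents and re-migrate.
        try rollback.backup(to: pool)
        try migrator.migrate(pool)
        throw error
    }
}
```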

Tests
- add VaultKeyStore coverage proving deleteLocal preserves the iCloud vault copy and full delete still removes both copies
- add RestoreManager coverage proving restore lifecycle hooks are called around restore
- add DatabaseManager restore coverage proving captured readers remain valid after database replacement

Validation
- swift test --package-path ConvosCore --filter '(VaultKeyStoreTests|RestoreManagerTests|DatabaseManagerRestoreTests)'
- xcodebuild build -project Convos.xcodeproj -scheme 'Convos (Dev)' -destination 'id=00008110-001C51283AC2801E' -derivedDataPath .derivedData

Result
- backup decryption can continue to use the original vault key after delete-all-data / app relaunch
- destructive restore now pauses active app state before swapping database contents
- existing GRDB handles remain usable after restore instead of silently pointing at a retired pool

* Tighten restore barriers and exclude unused inboxes from backup counts

This follow-up hardens the restore path and fixes backup metadata/archive counts that were inflated by the pre-created unused conversation inbox.

Restore hardening
- wrap destructive database replacement work in DatabasePool.barrierWriteWithoutTransaction so the restore waits for in-flight GRDB accesses on the live pool to drain before copying rollback data and before copying restored data into place
- keep the existing in-place DatabasePool replacement strategy, but make the source->destination copy and rollback copy happen behind an explicit GRDB barrier
- keep migrations after replacement so schema is still reopened at the current app version
- keep rollback-on-failure behavior intact

Session restore follow-up
- when prepareForRestore runs, clear the cancelled unused inbox preparation task reference
- when finishRestore runs, schedule unused inbox preparation again so the post-restore app returns to the same steady-state behavior as a normal session

Backup count / archive fix
- add InboxesRepository.nonVaultUsedInboxes()
- this returns only non-vault inboxes that have at least one non-unused conversation row
- BackupManager now uses that query when deciding which per-conversation archives to create
- RestoreManager now uses the same query for its completed inbox count

This keeps the vault backup behavior unchanged and only excludes the synthetic unused conversation inbox from the non-vault conversation archive path.

Important: the vault is still always backed up.
- BackupManager still creates the vault archive first
- the overall backup bundle is still encrypted with the vault databaseKey
- restore still depends on the vault archive plus the vault key to recover conversation keys

What changes from the user-visible perspective
- backup metadata inboxCount now matches real visible/restorable conversations instead of counting the hidden unused placeholder inbox
- restore completed counts now match that same definition
- the extra placeholder inbox is no longer archived as if it were a real conversation
- the vault remains part of every backup exactly as before

Tests added/updated
- backup tests now seed real conversation rows for active inboxes
- add coverage that unused conversation inboxes are excluded from backup metadata and archive generation
- add coverage that restore completed counts exclude unused conversation inboxes
- existing restore/database tests continue to cover lifecycle hooks and stable DB handle behavior

Validation
- swift test --package-path ConvosCore --filter '(BackupManagerTests|RestoreManagerTests|VaultKeyStoreTests|DatabaseManagerRestoreTests)'
- xcodebuild build -project Convos.xcodeproj -scheme 'Convos (Dev)' -destination 'id=00008110-001C51283AC2801E' -derivedDataPath .derivedData

* Fix vault archive import crash by using temporary XMTP client

The live VaultClient's DB connection was in an inconsistent state
after pause() (dropLocalDatabaseConnection), causing importArchive
to crash in the XMTP FFI layer.

Replace the live vaultService.importArchive call with a temporary
XMTP client built from the vault key. This matches the real restore
scenario (fresh device, no running vault) and avoids conflicts with
the live vault's connection state.

- Extract VaultArchiveImporter protocol for testability
- ConvosVaultArchiveImporter: builds temp client, imports archive,
  extracts keys from vault messages, tears down
- Remove vaultService dependency from RestoreManager entirely
- Revert barrierWriteWithoutTransaction (deadlock) back to plain
  backup(to:) for DB replacement
- Update tests with MockVaultArchiveImporter

* Fall back to Client.create when Client.build fails during restore

On a fresh device (real restore), no local XMTP database exists,
so Client.build fails. Fall back to Client.create with the signing
key, same pattern as InboxStateMachine.handleAuthorize.
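The fallback pattern can be sketched generically; `build` and `create` below stand in for Client.build / Client.create, whose exact XMTPiOS signatures vary by SDK version and are not reproduced here.

```swift
// Try to reuse an existing local installation; only register a fresh one
// when no local XMTP database exists (real restore on a fresh device).
func clientForRestore<C>(
    build: () async throws -> C,
    create: () async throws -> C
) async throws -> C {
    do {
        return try await build()      // same device: local XMTP DB exists
    } catch {
        return try await create()     // fresh device: register a new installation
    }
}
```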

Also add vault-specific codecs to the temporary client options so
vault messages can be decoded for key extraction.

Applied to both ConvosVaultArchiveImporter and
ConvosRestoreArchiveImporter.

* Extract keys from restored GRDB instead of live vault import

The XMTP SDK's importArchive crashes when called on a client
whose local database already has state (same-device scenario) or
when called on a just-created client. The vault messages are
already in the backup's GRDB database, so we don't need the XMTP
archive import to extract keys.

New restore order:
1. Replace GRDB database with backup copy
2. Extract keys from vault messages in the restored DB
3. Save keys to keychain
4. Import vault XMTP archive (best-effort, non-fatal)
5. Import conversation XMTP archives

The vault XMTP archive import is now best-effort — if it fails
(e.g. on same-device test), the restore still succeeds because
key extraction comes from GRDB, not from the XMTP archive.

* Skip vault XMTP archive import to avoid FFI crash

The XMTP SDK's importArchive causes a Rust panic (uncatchable from
Swift) when importing into a client that already has state. Since
key extraction now reads from the restored GRDB database, the vault
XMTP archive import is unnecessary. The vault will rebuild its XMTP
state when it reconnects on next launch.

* Extract keys from live vault before teardown, not after

Vault messages (DeviceKeyBundle/DeviceKeyShare) live in the XMTP
database, not GRDB. Restoring the GRDB database doesn't restore
vault messages. A fresh temporary XMTP client can't access
historical vault group messages either (different installation).

Fix: extract keys from the LIVE connected vault client while it's
still running, BEFORE the lifecycle teardown. The vault already
has all the messages in its local XMTP database.

New restore order:
1. Decrypt bundle
2. Extract keys from LIVE vault (VaultManager.extractKeys)
3. Stop sessions (prepareForRestore)
4. Replace GRDB database
5. Save extracted keys to keychain
6. Import conversation XMTP archives
7. Resume sessions (finishRestore)

* Use XMTP archive import as designed — no DB swaps or live extraction

The XMTP SDK's importArchive is additive and designed for
backup/restore across installations. Use it correctly:

Restore flow:
1. Decrypt bundle
2. Stop all sessions
3. Import vault archive: Client.create(signingKey) then
   importArchive(path, key) — creates a fresh XMTP client and
   imports vault message history including all key bundles/shares
4. Extract keys from imported vault messages
5. Save keys to keychain
6. Replace GRDB database
7. Import conversation archives: Client.create(signingKey) then
   importArchive(path, key) per conversation
8. Resume sessions

Key fixes:
- Vault importer uses NO dbDirectory (matches VaultManager's
  default = app Documents), was wrongly using app group container
- Conversation importer uses defaultDatabasesDirectory (matches
  InboxStateMachine's app group container)
- Both use Client.create (not build) since we want fresh
  installations — importArchive is additive
- Removed vaultService dependency from RestoreManager entirely
- After import, conversations start inactive/read-only per MLS
  and reactivate automatically as other members come online

* Use isolated temp directory for vault archive import

importArchive crashes (Rust panic) when the vault's XMTP database
already exists on disk with the same data. Create a fresh temporary
directory for the import client so there's no conflict with the
live vault's database files.

This works for both same-device testing (existing DB) and real
restore (fresh device, no DB). The temp directory is cleaned up
after key extraction.
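A small Foundation sketch of the isolation trick (helper name is illustrative): hand the import client a fresh, empty directory so it can never collide with the live vault's database files, and remove it after key extraction.

```swift
import Foundation

// Run `body` with a unique empty directory, cleaned up on exit either way.
func withIsolatedImportDirectory<T>(_ body: (URL) throws -> T) throws -> T {
    let dir = FileManager.default.temporaryDirectory
        .appendingPathComponent("vault-import-\(UUID().uuidString)", isDirectory: true)
    try FileManager.default.createDirectory(at: dir, withIntermediateDirectories: true)
    defer { try? FileManager.default.removeItem(at: dir) }  // cleanup on success or throw
    return try body(dir)
}
```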

* Disable device sync workers for archive import clients

The device sync worker was running concurrently with importArchive,
causing a race condition in the Rust FFI layer that crashed the app.
Archive import clients are temporary — they only need to import
data and extract keys, not participate in device sync.

* Add import success log for vault archive

* Skip vault archive import when conversation keys already in keychain

Vault importArchive crashes when called on a device that already has
a live installation for the same inboxId. More importantly, it's
unnecessary: if conversation keys are already in the local keychain,
the vault archive contains nothing we don't have.

Before attempting vault archive import, check which conversation
inboxIds from the bundle are missing from the keychain:

  Same device (all keys present):
    → skip vault archive import entirely
    → restore GRDB + import conversation archives only

  Fresh device or empty vault (keys missing):
    → import vault archive to recover missing keys
    → save extracted keys to keychain
    → restore GRDB + import conversation archives

This correctly handles all scenarios including a device where the
vault bootstrapped from iCloud Keychain but its XMTP database is
empty (no DeviceKeyBundle/Share messages yet).

* Skip conversation archive import when XMTP DB already exists

importArchive is additive — it restores message history into a
fresh installation. If the XMTP DB already exists (same device),
Client.build() succeeds and there is nothing to import.

Same-device: build() succeeds → skip importArchive, history is there
Fresh device: build() fails → create() + importArchive()

This mirrors the vault import logic and avoids the FFI crash from
importing into an existing installation.

* Skip vault importArchive when vault XMTP DB already exists

Same pattern as conversation archives: try Client.build() first.
If vault DB exists on disk, extract keys directly from the
existing vault database instead of importing the archive.
Only import the archive on a fresh device where build() fails.

This fixes the intermittent crash that occurred when the GRDB
restore caused some conversation keys to appear missing from
the keychain, triggering vault importArchive on same-device.

* Wipe local XMTP state before restore for clean deterministic behavior

Restore is destructive: the device should look exactly like the
backup, nothing more, nothing less. Partial state from the current
device (deleted conversations, orphaned DBs, mismatched keys)
was causing intermittent crashes and inconsistent behavior.

Before importing vault archive:
- identityStore.deleteAll() removes all conversation keychain keys
- Delete all xmtp-* files from AppGroup + Documents directories

After wipe, Client.build() fails for all inboxes (no local DB),
so both vault and conversation importers always take the
create() + importArchive() path — consistent regardless of
device state.

The vault key (vaultKeyStore) is preserved since it lives in a
separate keychain service and is needed to decrypt the bundle.

* Extract vault keys before wipe to avoid broken state on failure

If vault archive import fails after the wipe, the device is left
with no keys and no restored data — unrecoverable. Extract keys
first while the vault is still intact, then wipe once we have
something to restore.

New order:
1. Stop sessions
2. Import vault archive + extract keys (vault DB still present)
3. Wipe XMTP DBs + conversation keychain identities
4. Save extracted keys to keychain
5. Replace GRDB
6. Import conversation archives (fresh DBs)
7. Resume sessions

* Don't wipe vault XMTP DB during restore + add revoke debug button

Don't wipe vault XMTP DB (Documents/):
  Only conversation DBs (AppGroup/) need wiping. The vault DB is
  preserved since it was just used to extract keys, and wiping it
  forces Client.create() which registers a new installation —
  hitting the 10-installation limit during repeated testing.

Add revoke button to debug panel:
  'Revoke all other vault installations' clears accumulated test
  installations from the XMTP network, freeing up slots.
  VaultManager.revokeAllOtherInstallations() delegates to the
  live vault XMTP client.

* Use static XMTP revocation that works without a live client

Client.revokeInstallations is a static method that doesn't need
a live client — just the signing key, inboxId, and API config.
This works even when Client.create() fails due to the 10-install
limit, which is exactly when you need to revoke.

XMTPInstallationRevoker: static utility that fetches all
installations via Client.inboxStatesForInboxIds, filters out
the current one, and revokes the rest via the static method.
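The selection step is pure and easy to show in isolation; in the real utility the id list comes from Client.inboxStatesForInboxIds and the result feeds the static Client.revokeInstallations call (type and member names below are illustrative).

```swift
// Everything except the current installation is a candidate for revocation.
struct InstallationRevoker {
    static func idsToRevoke(all: [String], currentInstallationId: String) -> [String] {
        all.filter { $0 != currentInstallationId }
    }
}
```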

* Broadcast all conversation keys to vault before backup

On a single device, vault key-sharing messages (DeviceKeyBundle)
only exist if multi-device pairing has happened. Without them,
the vault archive is empty of keys and restore can't recover
conversation identities.

Fix: call VaultManager.shareAllKeys() before creating the vault
archive. This broadcasts all conversation keys as a DeviceKeyBundle
message to the vault group, ensuring the vault archive always
contains every key needed for restore — regardless of whether
multi-device pairing has occurred.

This is the intended design: the vault is the single source of
truth for conversation keys, and backups should capture that
truth completely.

* Make vault key broadcast and archive creation non-fatal during backup

After 'delete all data', the vault may not be connected yet.
The backup should still succeed — it just won't include the
vault archive or keys. The conversation archives and GRDB
snapshot are still created.

On the next backup after the vault reconnects, the keys will
be broadcast and the vault archive will be included.

* Add plan for inactive conversation mode post-restore

* Resolve design decisions for inactive conversation mode

* Inactive conversation mode, iCloud vault key sync, and restore reliability fixes (#626)

* Update copy: 'comes online' -> 'sends a message'

* Add inactive conversation mode (Phases 1 & 2)

Phase 1 — data model:
- Add isActive column to ConversationLocalState (default true)
- GRDB migration addIsActiveToConversationLocalState
- Surface isActive: Bool on Conversation model via local state join
- Add setActive(_:for:) and markAllConversationsInactive() to
  ConversationLocalStateWriter + protocol
- RestoreManager calls markAllConversationsInactive() after
  importing conversation archives

Phase 2 — reactivation detection:
- After each syncAllConversations call site in SyncingManager
  (initial start, resume, discovery), query for conversations
  where isActive == false
- For each, call conversation.isActive() on the XMTP SDK
  (local MLS state check, no network)
- If true, update GRDB via setActive(true, for:)
- GRDB ValueObservation propagates to UI reactively

Phases 3 & 4 (UI) are blocked on design sign-off.
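The Phase 1 migration can be sketched with GRDB's migrator; the migration name comes from the commit, while the table and column spellings are assumptions based on the models it mentions.

```swift
import GRDB

// Register the schema change: add isActive with a default of true so
// existing rows stay active until a restore flips them.
var migrator = DatabaseMigrator()
migrator.registerMigration("addIsActiveToConversationLocalState") { db in
    try db.alter(table: "conversationLocalState") { t in
        t.add(column: "isActive", .boolean).notNull().defaults(to: true)
    }
}
```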

* Remove local settings from tracked files

* Stop tracking .claude/settings.local.json (already gitignored)

* Fix infinite recursion in createArchive protocol bridge

The protocol method createArchive(path:encryptionKey:) was calling
itself instead of the SDK's createArchive(path:encryptionKey:opts:)
because the compiler resolved the overload back to the protocol.
Explicitly cast to XMTPiOS.Client, same pattern as importArchive.

* Fix infinite recursion in importArchive protocol bridge

The cast to XMTPiOS.Client didn't help because the SDK method has
the exact same signature as the protocol method — Swift resolved
it back to the protocol bridge. Call through ffiClient directly
to bypass the protocol dispatch entirely.

* Remove archive protocol bridge methods causing infinite recursion

XMTPiOS.Client satisfies both createArchive and importArchive
protocol requirements directly via its SDK implementations.
No explicit bridge needed, no recursion risk.

* Remove createArchive/importArchive from XMTPClientProvider protocol

These methods don't belong on the messaging protocol. All callers
in the backup/restore layer use XMTPiOS.Client directly, not the
protocol abstraction. ConvosBackupArchiveProvider.buildClient now
returns Client instead of any XMTPClientProvider.

No wrapper, no recursion, no redundant protocol surface.

* Add diagnostic logging for inactive conversation reactivation

- Remove try? in RestoreManager markAllConversationsInactive to
  surface any errors
- Add per-conversation debug logging in reactivation check to
  trace isActive() return values
- Helps diagnose whether importArchive causes isActive() to
  return true immediately (defeating the inactive flag)

* Reactivate conversations on inbound message, change join copy

1. Reactivation: move from isActive() polling in SyncingManager to
   message-driven reactivation in StreamProcessor. When an inbound
   message arrives for a conversation with isActive=false, flip it
   to active immediately. This avoids the issue where importArchive
   causes isActive() to return true on stale MLS state.

2. Copy: change 'You joined as <name>' to 'Reconnected' for the
   system message shown when the current user's installation is
   re-added to a group.

* Revert 'Reconnected' copy change

The summary in ConversationUpdate has no restore context — changing
it here would affect all self-join scenarios, not just post-restore.
The copy change belongs in Phase 4 (UI layer) where isActive is
available to distinguish reconnection from first join.

* Show 'Reconnected' for post-restore self-join messages

Thread isReconnection flag through the message pipeline:
- DBMessage.Update: add isReconnection (default false, backward
  compatible with existing JSON)
- StreamProcessor: when an inbound message arrives for an inactive
  conversation, set isReconnection=true on the stored GroupUpdated
  message before reactivating the conversation
- ConversationUpdate: surface isReconnection from DB hydration
- summary: return 'Reconnected' instead of 'You joined as X'
  when isReconnection is true

Normal first-time joins are unaffected.

* Fix isReconnection decoding for existing messages

Add custom Decodable init to DBMessage.Update that defaults
isReconnection to false when the key is missing. Existing
GroupUpdated messages in the DB were stored before this field
existed, causing keyNotFound decoding errors and blank message
history.
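The backward-compatible decode looks like this; the real type is DBMessage.Update, and fields other than isReconnection are illustrative.

```swift
import Foundation

struct Update: Codable {
    var summary: String
    var isReconnection: Bool

    enum CodingKeys: String, CodingKey { case summary, isReconnection }

    init(summary: String, isReconnection: Bool = false) {
        self.summary = summary
        self.isReconnection = isReconnection
    }

    init(from decoder: Decoder) throws {
        let c = try decoder.container(keyedBy: CodingKeys.self)
        summary = try c.decode(String.self, forKey: .summary)
        // Rows stored before this field existed have no key; default to
        // false instead of throwing keyNotFound.
        isReconnection = try c.decodeIfPresent(Bool.self, forKey: .isReconnection) ?? false
    }
}
```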

* Document quickname backup as deferred for v1

* Implement inactive conversation UI (Phases 3 & 4)

Phase 3 — conversations list:
- Show 'Awaiting' label with dot separator in subtitle when
  conversation.isActive is false (same pattern as 'Verifying')
- Suppress unread/muted indicators for inactive conversations

Phase 4 — conversation detail:
- InactiveConversationBanner: pill overlay above composer with
  clock.badge.checkmark.fill icon, 'History restored' title,
  subtext explaining the state, tappable to learn.convos.org
- Intercept send, reaction, and reply actions when inactive —
  show 'Awaiting reconnection' alert instead of executing
- Disable top bar trailing item and text field when inactive
- All UI clears reactively when isActive flips to true via
  GRDB ValueObservation (no restart needed)

* Fix InactiveConversationBanner layout to match Figma design

* Update inactive banner: cloud.fill icon, 'Restored from backup'

- Change icon from clock.badge.checkmark.fill to cloud.fill
- Change title from 'History restored' to 'Restored from backup'
- Move banner from overlay into bottomBarContent so it participates
  in bottom bar height measurement — messages get proper inset and
  don't render behind the banner

* Use colorLava for cloud icon in inactive banner

* Fix reactivation for messages arriving via sync path

Messages can arrive through two paths:
1. Message stream → processMessage → markReconnectionIfNeeded ✓
2. Sync/discovery → processConversation → storeWithLatestMessages

Only path 1 had reactivation. Add reactivateIfNeeded after
storeWithLatestMessages in processConversation so the first
message (typically the GroupUpdated 'You joined') properly
clears the inactive state.

* Mark reconnection messages in sync path + re-adopt profiles after restore

1. reactivateIfNeeded now marks recent GroupUpdated messages as
   isReconnection=true before flipping isActive, so the 'You joined'
   message shows 'Reconnected' even when arriving via sync path.

2. ConversationWriter: when conversation is inactive (post-restore),
   force-apply member profiles from XMTP group metadata to GRDB,
   bypassing the 'existing profile has data' skip. This re-adopts
   display names and avatars from the group's custom metadata after
   restore, preventing 'Somebody' fallback.

* Show spinner only on active action during restore

Track which action is running via activeAction string. Spinner
only shows next to the active button. Hide the Actions section
entirely during restore so only the restore row with its spinner
is visible.

* Fix first message after restore not appearing in history

After restore, conversationWriter.store() calls conversation.sync()
which can fail on stale MLS state (new installation not yet re-added
to the group). This threw an error that prevented messageWriter.store
from ever running, so the first inbound message was lost.

Fix: if conversationWriter.store() fails, fall back to the existing
DBConversation from GRDB (which exists from the backup) and continue
storing the message. The conversation will be properly synced on the
next successful sync cycle.

Also refactor reactivateIfNeeded to extract markRecentUpdatesAsReconnection
for clarity.

* Try all vault keys on restore + detailed debug view

RestoreManager: try decrypting the backup bundle with every
available vault key (local + iCloud) instead of just the first.
This handles the case where device B created its own vault key
before iCloud sync propagated device A's key.

VaultKeySyncDebugView: add detailed sections showing:
- Each vault key with inboxId, clientId, and which stores
  (local/iCloud) it exists in, with green checkmark for active
- iCloud backup files with device name, date, size, inbox count

Makes it easy to diagnose vault key divergence between devices.
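The try-every-key loop can be sketched with CryptoKit, assuming the bundle is an AES-GCM sealed box keyed by one of the candidate vault keys (on Linux, swift-crypto's `import Crypto` provides the same API).

```swift
import CryptoKit
import Foundation

// Attempt decryption with each candidate key (local + iCloud); the first key
// that authenticates wins. Returns nil if no key can open the bundle.
func decryptBundle(_ sealed: Data, candidateKeys: [SymmetricKey]) -> Data? {
    for key in candidateKeys {
        if let box = try? AES.GCM.SealedBox(combined: sealed),
           let plaintext = try? AES.GCM.open(box, using: key) {
            return plaintext
        }
    }
    return nil
}
```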

* Enable iCloud Keychain sync for vault keys

The iCloud vault keychain store was using
kSecAttrAccessibleAfterFirstUnlock but never set
kSecAttrSynchronizable — so items stayed local on each device
even under the same Apple ID.

Add synchronizable flag to KeychainIdentityStore and set it to
true for the iCloud vault store. This ensures vault keys
propagate between devices via iCloud Keychain, so device B can
decrypt backups created by device A.

Updated all iCloud vault store creation sites:
- ConvosClient+App.swift (production)
- BackupDebugView.swift
- VaultKeySyncDebugView.swift
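The attribute change amounts to one extra key in the keychain save query; an Apple-platform sketch (service/account names are illustrative, and this only builds the query dictionary, it does not call SecItemAdd):

```swift
import Foundation
import Security

// Build a generic-password save query; the iCloud store sets
// kSecAttrSynchronizable so the item syncs via iCloud Keychain.
func saveQuery(service: String, account: String, data: Data, synchronizable: Bool) -> [CFString: Any] {
    var query: [CFString: Any] = [
        kSecClass: kSecClassGenericPassword,
        kSecAttrService: service,
        kSecAttrAccount: account,
        kSecValueData: data,
        kSecAttrAccessible: kSecAttrAccessibleAfterFirstUnlock,
    ]
    if synchronizable {
        // Without this flag, items stay device-local even under one Apple ID.
        query[kSecAttrSynchronizable] = kCFBooleanTrue
    }
    return query
}
```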

* Add per-key actions in vault debug view

Each vault key now shows inline buttons:
- 'Delete local' — removes only the local copy for that inboxId
- 'Delete iCloud' — removes only the iCloud copy for that inboxId
- 'Adopt locally' — shown for iCloud-only keys, copies to local

This lets you surgically delete the wrong local vault key and
adopt the correct iCloud one without affecting other keys.

* Allow vault clientId update after restore or cross-device sync

After restore or iCloud key sync, the vault inboxId is the same
but the XMTP installation ID (clientId) is different because each
device gets its own installation. Previously this threw
clientIdMismatch and failed vault bootstrap.

Fix:
- InboxWriter: for vault inboxes, update clientId in GRDB instead
  of throwing when it changes
- VaultManager: re-save vault key to keychain with new clientId
  when it differs from the stored one (not just for vault-pending)

* Fix restore spinner not showing

runAction title was 'Restore' but actionLabel title was
'Restore from backup' — the mismatch meant activeAction
never matched and the spinner never appeared.

* Skip quickname onboarding for restored conversations

After restore, the conversation already has a profile from XMTP
group metadata. But the quickname state in UserDefaults isn't backed
up, so the onboarding coordinator thinks the user never set their
name and shows 'Add your name for this convo'.

Skip startOnboarding() entirely when conversation.isActive is
false — the profile is already restored from metadata.

* Update tests to include isActive and isReconnection parameters

MessagesListProcessorTests and related files need the new
parameters added in this branch:
- ConversationUpdate.isReconnection
- ConversationLocalState.isActive

* Fix delete all data leaving orphaned conversations (#628)

* Fix delete all data leaving orphaned conversations

InboxWriter.deleteAll() only deleted inbox rows — conversations,
messages, members, profiles, and all other tables were left behind.
No foreign key cascade from inbox to conversation exists, so the
data persisted and reappeared on next launch.

Now deletes all rows from every GRDB table in dependency order
within a single write transaction.

* Address code review: fix FK ordering, add test, add logging

- Fix foreign key ordering: conversation_members before invite
  (invite has FK referencing conversation_members)
- Add per-table error logging for debugging failed deletions
- Add testDeleteAllRemovesAllData: creates full data graph
  (inbox, conversation, member, profile, local state,
  conversation_members), calls deleteAll(), verifies all empty
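The wipe itself reduces to one write transaction over an ordered table list; a GRDB sketch (table names are illustrative, and the real list covers every table, children before parents, e.g. invite before conversation_members per the FK fix above):

```swift
import GRDB

// Delete every row from every table, child tables first, inside a single
// write transaction so the wipe is all-or-nothing.
func deleteAllData(_ writer: DatabaseWriter, tablesInDependencyOrder: [String]) throws {
    try writer.write { db in
        for table in tablesInDependencyOrder {
            try db.execute(sql: "DELETE FROM \(table)")
        }
    }
}
```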

* Address code review feedback on PR #626

Three fixes flagged by Macroscope and Claude review:

1. iCloud Keychain migration (critical)
   Add migrateToSynchronizableIfNeeded() that finds existing
   non-synchronizable items and re-saves them with
   kSecAttrSynchronizable: true. Without this, existing users
   would lose access to vault keys after enabling the
   synchronizable flag (queries with synchronizable:true don't
   match non-synchronizable items).
   Called at app launch from ConvosClient+App before the
   VaultKeyStore is created.

2. CancellationError propagation in StreamProcessor
   processMessage catch block for conversationWriter.store now
   rethrows CancellationError instead of falling through to the
   existing-DBConversation fallback. Prevents runaway work after
   task cancellation.

3. InboxWriter vault clientId update early-return
   When vault clientId changed, we were returning early and
   skipping the installationId update logic. Now we continue
   through to update installationId if the caller provided one.

* Add plan for vault re-creation on restore

After restore, the vault cannot be reactivated the same way
conversations can — the vault is the user's own group with no
third party to trigger MLS re-addition. This plan proposes
creating a fresh vault (new inboxId, new keys) on restore,
renaming the old vault DB as a read-only artifact, and
broadcasting the restored conversation keys to the new vault
so it becomes the source of truth for future device pairs.

* Add VaultManager.reCreate method for post-restore vault rebuild

Tears down the current vault and bootstraps a fresh one with new
keys. Used after restore to replace the inactive restored vault
with a working one.

Flow:
  1. Disconnect current vault client
  2. Delete old vault key from local + iCloud keychain
  3. Delete old DBInbox row for the vault
  4. Re-run bootstrapVault() which generates fresh keys and
     creates a new MLS group

The old vault XMTP database file is left on disk as an orphan
to avoid filesystem risk. A future cleanup pass can remove
orphaned vault DB files.

* Wire vault re-creation into RestoreManager

After marking conversations inactive, call VaultManager.reCreate
to tear down the restored (read-only) vault and bootstrap a new
one with fresh keys. Then broadcast the restored conversation
keys to the new vault via shareAllKeys so future device pairs
can sync through the new vault.

VaultManager is optional on RestoreManager to keep existing tests
working without a live vault. The backup debug view injects it
from the session.

If vault re-creation fails, the restore still completes — the
user has working conversations, just no functional vault. They
can retry or manually re-create later.

* Add comprehensive logging for vault re-creation flow

Tag-prefixed logs across the full re-creation path to make it
easy to verify on device:

- [Vault.reCreate] === START === / === DONE === boundaries
- Before/after snapshot of inboxId, installationId, bootstrap
  state, keychain key count, and GRDB vault inbox row count
- Step-by-step 1/4 through 4/4 markers
- Verification log comparing old vs new inboxId (warns loudly
  if they match — indicates failed re-creation)
- [Vault.bootstrap] logs XMTP connect, identity transition,
  keychain persist, and GRDB save
- [Vault.loadOrCreateVaultIdentity] logs whether existing key
  was found or fresh keys generated
- [VaultKeyCoordinator.shareAllKeys] logs row count, packaged
  key count, missing identities, and send success
- [Restore.reCreateVault] mirrors the same pattern on the
  restore side

* Update plans: revoke old vault in reCreate + stale device detection

- vault-re-creation: add step 8a to revoke device A's installation
  from the old vault before disconnecting, mirroring the per-
  conversation revocation that already happens. Steps re-lettered
  8a through 8g.
- stale-device-detection: new plan for detecting when the current
  device has been revoked (via client.inboxState refresh from
  network) and showing a 'device replaced' banner with a delete-
  data CTA. Covers detection triggers, storage, and UI.

* Revoke old vault installations before re-creating

Step 1/5 of VaultManager.reCreate now calls
revokeAllOtherInstallations on the old vault before disconnecting,
mirroring the per-conversation revocation that already happens
on restored conversation inboxes.

This kills device A's zombie vault installation so the old vault
is fully decommissioned (not just orphaned). Combined with the
conversation-level revocations, device A is now completely
replaced after restore.

Failure is non-fatal — logged as a warning, re-creation
continues. Worst case, device A keeps a zombie vault installation,
but the new vault on device B still works.

* Add isStale column to DBInbox with migration

Foundation for stale device detection: a per-inbox flag
indicating the current device's installation has been revoked
(e.g., because the user restored on a different device).
Defaults to false, nullable migration for existing rows.

* Add stale installation detection in InboxStateMachine

After inbox reaches ready state, call client.isInstallationActive
which queries XMTP inboxState(refreshFromNetwork: true) and checks
if our installationId is still in the active installations list.

If not → device was revoked (typically because user restored on
another device) → mark the inbox as stale in GRDB, emit QA event.

New XMTPClientProvider method isInstallationActive() wraps the
SDK call and returns Bool for clarity.

Added 'inbox' category to QAEvent for the stale_detected event.

* Also check stale installation on app foreground

When the app returns to foreground, inboxes might have been
revoked on another device (e.g., user restored on device B
while device A was backgrounded). Re-run isInstallationActive
check in handleEnterForeground to catch this on return.

* Surface isStale on Inbox model + reactive publisher

- Add isStale to Inbox domain model and DBInbox hydration
- Add InboxesRepository.anyInboxStalePublisher that emits true
  when any non-vault inbox has isStale = true
- Subscribe in ConversationsViewModel, expose isDeviceStale
  property for the view layer to bind to

* Add StaleDeviceBanner UI + hide stale conversations

- New StaleDeviceBanner view: exclamation icon, title, explanation,
  destructive 'Delete data and start fresh' button (wires into
  existing deleteAllData flow)
- Banner inserted above conversations list via safeAreaInset(edge:
  .top) when viewModel.isDeviceStale is true
- ConversationsViewModel: subscribe to staleInboxIdsPublisher and
  filter out any conversations belonging to stale inboxes, both
  on initial load and on reactive updates from GRDB
- Inline the isInstallationActive check in handleStart /
  handleEnterForeground to avoid Sendable issues passing
  any XMTPClientProvider across actor boundaries

* Fix synchronizable migration data loss risk

Macroscope flagged the migration as losing data if SecItemAdd
fails after the original item is deleted. Reverse the order:
add the synchronizable copy first, then delete the
non-synchronizable original — matching the pattern in
migrateItemToPlainAccessibility. This ensures the data is
never left unrecoverable.

If add fails: the original is intact; retry next launch.
If delete fails after a successful add: the synchronizable copy is
in place; retry next launch (idempotent — the second add returns
errSecDuplicateItem, which is treated as already migrated).

* Fail VaultManager.reCreate when old key deletion fails or inboxId matches

Two related fixes flagged by Macroscope:

1. If vaultKeyStore.delete(inboxId:) throws during step 3/5,
   we previously logged a warning and continued. The next
   bootstrap then found the old identity and reused it,
   producing the same inboxId — meaning no new vault was
   actually created. Now throws VaultReCreateError.bootstrapFailed.

2. If after bootstrap the new inboxId matches the old one
   (because deletion didn't actually remove the key), throw
   instead of just logging a warning. The function previously
   returned successfully despite re-creation having failed.

Both errors propagate up to RestoreManager.reCreateVault which
already catches and logs without aborting the restore — the
user can still see their conversations, just with no functional
vault, and can retry.

* decryptBundle: don't let staging dir reset failure escape the loop

If FileManager.createDirectory throws (e.g. disk full) between
key attempts, the error escaped the for loop and we surfaced
that confusing error instead of RestoreError.decryptionFailed.
Catch and log so the loop continues to the next key.

* shareAllKeys: only mark successfully shared inboxes

When identityStore.identity(for:) failed for an inbox, it was
skipped from the keys array but still marked sharedToVault=1
in the database. This caused inboxes whose keys were never
actually shared to be flagged as shared, preventing future
sync attempts.

Now we mark only the inboxIds that ended up in the keys array.
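
A minimal sketch of that fix, with simplified stand-in types (the real code goes through identityStore and the GRDB sharedToVault column): only inboxIds that actually end up in the packaged keys array get marked.

```swift
// Stand-in for a packaged vault key entry.
struct Key { let inboxId: String }

// Build the keys array and derive the shared-inbox list from it, so an
// inbox whose identity lookup fails is skipped AND left unmarked.
func packageKeys(
    inboxIds: [String],
    identity: (String) -> Key?   // stands in for identityStore.identity(for:)
) -> (keys: [Key], sharedInboxIds: [String]) {
    var keys: [Key] = []
    for id in inboxIds {
        guard let key = identity(id) else { continue }  // skipped, NOT marked shared
        keys.append(key)
    }
    // Mark sharedToVault only for ids that ended up in the keys array.
    return (keys, keys.map { $0.inboxId })
}
```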

* Clear conversation selection when its inbox becomes stale

When staleInboxIdsPublisher fires and removes conversations from
the visible list, clear _selectedConversationId if the previously
selected conversation is no longer present. Otherwise the detail
view shows a conversation that no longer appears in the list.

Also fix the existing conversationsPublisher selection check to
inspect the filtered list (self.conversations) rather than the
unfiltered parameter, so stale selections are properly cleared
on either trigger.

* Fix vault re-create orphan rows + add inactive mode tests

Macroscope flagged a real bug at VaultManager:317: when oldInboxId
is nil, step 3 deletes all vault keys from the keychain but step 4
skipped DBInbox cleanup. The next bootstrapVault would fail with
InboxWriterError.duplicateVault. Now step 4 deletes all vault rows
when oldInboxId is nil, matching the keychain cleanup behavior.
Also escalated step 4 errors to throw VaultReCreateError.

Test coverage for previously-untested inactive mode logic:
- ConversationLocalStateWriterTests (5 tests):
  setActive toggle, scoped to one conversation, throws on
  unknown conversation, markAllConversationsInactive bulk
  flip, no-op on empty database
- InboxWriterTests adds 2 tests for markStale (toggle + scoping)
- DBMessageUpdateDecodingTests (4 tests):
  legacy JSON without isReconnection defaults to false,
  decode with isReconnection=true preserves value, encode
  round-trip, memberwise init default

46/46 tests pass in the affected suites.

* Refine stale-device UX and update recovery policy plan

* Add StaleDeviceState enum + reactive publisher

Derives healthy/partialStale/fullStale from the set of used
non-vault inboxes and how many are flagged isStale=true:
- healthy: no stale inboxes (or no used inboxes)
- partialStale: some but not all used inboxes are stale
- fullStale: every used inbox is stale (device effectively dead)

Drives the partial-vs-full UX policy described in
docs/plans/stale-device-detection.md. Existing
anyInboxStalePublisher and staleInboxIdsPublisher are kept
for backward compatibility.
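
A minimal sketch of the healthy/partialStale/fullStale derivation, assuming the used (non-vault) inbox IDs and the stale flags arrive as plain sets rather than GRDB observations; only the set arithmetic mirrors the commit.

```swift
enum StaleDeviceState: Equatable {
    case healthy
    case partialStale
    case fullStale

    // Derive the state from the used (non-vault) inboxes and the subset
    // of them currently flagged isStale = true.
    static func derive(usedInboxIds: Set<String>,
                       staleInboxIds: Set<String>) -> StaleDeviceState {
        // Only stale flags on *used* inboxes count toward the state.
        let staleUsed = usedInboxIds.intersection(staleInboxIds)
        guard !usedInboxIds.isEmpty, !staleUsed.isEmpty else { return .healthy }
        return staleUsed == usedInboxIds ? .fullStale : .partialStale
    }
}
```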

* ConversationsViewModel: derive partial vs full stale state

Replace single isDeviceStale boolean with StaleDeviceState enum
driving:
- isDeviceStale: any inbox stale (banner visibility)
- isFullStale: every inbox stale
- canStartOrJoinConversations: false only when fullStale (was
  blocking new convos in any stale state previously — now allows
  them in healthy inboxes during partial stale)

Add handleStaleStateTransition that logs every transition and
schedules an auto-reset on healthy/partial → fullStale via
isPendingFullStaleAutoReset. The view layer binds to this flag
to render a cancellable countdown UI.

Add cancelFullStaleAutoReset / confirmFullStaleAutoReset for the
UI to call from the countdown.

* Stale device UI: partial vs full copy, countdown auto-reset

StaleDeviceBanner now takes a StaleDeviceState and renders
distinct copy + title for partial vs full:
- Partial: 'Some conversations moved to another device' / 'Some
  inboxes have been restored on another device. Resetting will
  clear local data...'
- Full: 'This device has been replaced' / 'Your account has been
  restored on another device. Resetting will clear local data...'

Action button renamed from 'Delete data and start fresh' to
'Reset device' so the destructive intent is in the verb itself
(per plan: 'Continue' was misleading).

StaleDeviceInfoView (Learn more sheet) now also varies copy by
state with two distinct title + paragraph sets.

New FullStaleAutoResetCountdown component shown via
fullScreenCover when ConversationsViewModel.isPendingFullStaleAutoReset
fires. 5-second countdown with Cancel and 'Reset now' actions.
The countdown task auto-fires onReset when remaining hits 0.

Wired into ConversationsView so when staleDeviceState transitions
to fullStale, the cover automatically presents.

* Add tests for StaleDeviceState derivation

8 tests covering the partial-vs-full state matrix:
- healthy when no inboxes
- healthy when inbox has no conversations
- healthy when used inbox is not stale
- partialStale when one of two used inboxes is stale
- fullStale when every used inbox is stale
- vault inbox is excluded from the count
- unused conversations don't make an inbox 'used'
- StaleDeviceState convenience flags (hasUsableInboxes,
  hasAnyStaleInboxes)

54/54 tests pass in affected suites.

* Vault bootstrap: log when installation is stale (diagnostic only)

After bootstrap, query inboxState(refreshFromNetwork: true) and
log a warning + QAEvent if our installationId isn't in the active
list. This means another device restored from backup and revoked
this vault installation.

The conversation-level stale detection (InboxStateMachine) is
what drives user-facing recovery — when a device is fully
replaced, all conversation inboxes go stale and the auto-reset
fires before vault state matters in practice.

This log is purely for diagnosing rare partial-revocation states
during QA. We don't add user-facing UI for vault staleness
because:
- in full-stale, conversation-level reset wipes the device anyway
- in partial-stale, the user isn't trying to use vault-dependent
  features in normal flows
- the only user-visible impact is 'pair new device' silently
  failing, which is rare and easily explained

* Don't dismiss compose flow on partial stale inbox updates

The staleInboxIdsPublisher sink was unconditionally nilling
newConversationViewModel whenever isDeviceStale was true, which
returns true for both partialStale and fullStale states. This
contradicted handleStaleStateTransition which explicitly keeps
the compose flow alive during partial stale so the user can
finish creating a conversation in a still-healthy inbox.

Remove the redundant check entirely — handleStaleStateTransition
is the single source of truth for closing in-flight flows on
transition to fullStale. The sink now only handles visible-list
filtering and selection clearing.

* Address Macroscope feedback on PR #618

Two low-severity findings on the restore flow:

1. VaultKeyStore.deleteLocal: the non-iCloud fallback called
   store.delete(inboxId:) which permanently deletes the key,
   contradicting the documented purpose of 'preserving the
   iCloud copy'. In production we always use ICloudIdentityStore
   so this only mattered in tests/mocks, but the contract was
   wrong. Now it's a no-op with a warning when the underlying
   store isn't iCloud-backed; tests that need to verify deletion
   should call delete(inboxId:) directly.

2. MockDatabaseManager.replaceDatabase didn't run migrations
   after restoring the database, unlike DatabaseManager which
   calls SharedDatabaseMigrator.shared.migrate after restore.
   This caused tests to pass while production could fail on
   schema mismatches. Now runs the migrator post-restore to
   match production behavior.

* BackupBundle: harden path traversal check against symlinks

Macroscope flagged that the existing standardizedFileURL check
doesn't resolve symlinks, but fileData.write(to:) follows them.
A pre-existing symlink under the staging dir could escape the
first-pass check.

Fix: after creating the parent directory, re-validate with
.resolvingSymlinksInPath() on the parent. If the resolved path
is no longer inside the staging dir (because a symlink redirected
it), throw 'path traversal via symlink'.

* RestoreManager: throw on missing vault archive

Macroscope flagged that importVaultArchive returned an empty
array when the vault archive was missing from the bundle, but
restoreFromBackup would still continue to wipe local XMTP state
and replace the database — silent data loss with no keys to
decrypt the resulting conversations.

The RestoreError.missingVaultArchive case was already defined
but never thrown. Throw it now before any destructive operation.

* RestoreManager: stop logging success after a try? swallowed an error

wipeLocalXMTPState was logging 'cleared conversation keychain
identities' unconditionally after try? await identityStore.deleteAll().
The success message appeared even when deleteAll threw and no
identities were actually cleared, hiding restore failures during
debugging.

Convert to do/catch and log a warning on failure (non-fatal —
restore continues either way).

* DatabaseManager: surface rollback failure as a dedicated error

Macroscope flagged that if the restore failed AND the rollback
also failed (e.g. migration threw on the rolled-back snapshot),
the catch block swallowed the rollback error and re-threw only
the original. Caller had no way to know the DB was now in a
potentially-inconsistent state.

Add DatabaseManagerError.rollbackFailed wrapping both the
original and rollback errors. Throw it instead of the original
when the rollback path itself throws — caller can distinguish
'restore failed cleanly' from 'restore failed and we couldn't
recover'.
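
The error-wrapping shape can be sketched as follows; the restore and rollback closures are stand-ins for the real file-copy and pool-reopen work, and only the control flow mirrors the commit.

```swift
enum DatabaseManagerError: Error {
    // Wraps both failures so the caller sees the inconsistent-state case.
    case rollbackFailed(original: Error, rollback: Error)
}

func replaceDatabase(restore: () throws -> Void,
                     rollback: () throws -> Void) throws {
    do {
        try restore()
    } catch let original {
        do {
            try rollback()
        } catch let rollbackError {
            // Caller can distinguish "restore failed cleanly" from
            // "restore failed and we couldn't recover".
            throw DatabaseManagerError.rollbackFailed(original: original,
                                                      rollback: rollbackError)
        }
        throw original   // rollback succeeded: restore failed cleanly
    }
}
```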

* Update test for missing-vault-archive: now expects abort

The 'missing vault archive is non-fatal' test was asserting the
old behavior. Updated to assert the restore aborts before any
destructive operation runs and the conversation archive importer
is never invoked.

* Inbox: decode isStale with backward-compatible fallback

Macroscope and Claude flagged that adding isStale as a non-optional
stored property broke Codable backwards compatibility — Swift's
synthesized Decodable ignores default init parameter values, so any
JSON payload missing 'isStale' would fail to decode with keyNotFound.

Inbox is currently only hydrated from GRDB columns (not JSON), so
there's no active bug, but the Codable conformance is a footgun for
any future JSON path (cache, push payload, etc).

Add a custom init(from:) that uses decodeIfPresent for isStale with
a 'false' fallback. No API change.
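
A minimal sketch of that custom init(from:); the real Inbox has more fields, and "id" here stands in for them.

```swift
import Foundation

struct Inbox: Codable {
    let id: String
    let isStale: Bool

    enum CodingKeys: String, CodingKey {
        case id, isStale
    }

    init(id: String, isStale: Bool = false) {
        self.id = id
        self.isStale = isStale
    }

    // Custom decode so legacy payloads without "isStale" still decode
    // instead of failing with keyNotFound.
    init(from decoder: Decoder) throws {
        let container = try decoder.container(keyedBy: CodingKeys.self)
        id = try container.decode(String.self, forKey: .id)
        isStale = try container.decodeIfPresent(Bool.self, forKey: .isStale) ?? false
    }
}
```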

* VaultManager.bootstrap: save-first-then-delete on keychain update

Macroscope and Claude flagged the delete-then-save-with-try? pattern
when the vault identity is being persisted/updated. If the delete
succeeded but the save failed (keychain access denied, etc), the
vault key was permanently lost — the next bootstrap would generate
a new inboxId and InboxWriter.save would throw duplicateVault
because the old GRDB row still existed.

Fix: save the new/updated entry first (which will throw on failure
and abort the bootstrap with the old key still intact), then delete
the old entry only if save succeeded. Delete failures are now
non-fatal because the vault is already recoverable from the new
entry — they just leave an orphan keychain item that gets cleaned
up on the next cycle.

Mirrors the same add-first-then-delete pattern we use for the
iCloud Keychain synchronizable migration.
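
The ordering can be sketched with a hypothetical in-memory store standing in for the keychain wrapper (KeyStore, KeyEntry, and updateKey are illustrative names, not the real API); only the save-before-delete ordering mirrors the commit.

```swift
struct KeyEntry { let account: String; let data: [UInt8] }

enum KeyStoreError: Error { case saveDenied }

final class KeyStore {
    var items: [String: [UInt8]] = [:]
    var failSaves = false   // simulates keychain access denied

    func save(_ entry: KeyEntry) throws {
        if failSaves { throw KeyStoreError.saveDenied }
        items[entry.account] = entry.data
    }

    func delete(account: String) { items[account] = nil }
}

// Persist an updated vault key with no window where neither copy exists:
// save the new entry first (throws on failure, leaving the old key
// intact), then delete the old one. In the real code a failed delete is
// non-fatal — it only leaves an orphan keychain item.
func updateKey(old: String, new entry: KeyEntry, in store: KeyStore) throws {
    try store.save(entry)       // abort here and the old key survives
    store.delete(account: old)
}
```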

* ConversationsViewModel: recompute visible list from unfiltered source

Macroscope and Claude flagged a real UX bug: when an inbox recovers
from stale back to healthy, its conversations remained permanently
hidden. The staleInboxIdsPublisher sink was filtering
self.conversations in-place with
  self.conversations = self.conversations.filter { ... }
which destroyed the source list. When a smaller set of stale IDs
later emitted, the filter ran against an already-filtered list and
the recovered conversations never reappeared. conversationsPublisher
observes a different table and only re-emits on DB change, so
nothing triggered re-hydration.

Fix: cache the publisher's output in unfilteredConversations and
derive the visible list via recomputeVisibleConversations(), called
from both the stale-ids sink and the conversations sink. Also call
it after hiddenConversationIds.remove so post-explosion rollback
behaves correctly.
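
A minimal sketch of the unfiltered-source recompute, with plain properties standing in for the GRDB publishers; the invariant is that the visible list is always derived from the unfiltered cache, so a shrinking stale set lets recovered conversations reappear.

```swift
struct Conversation: Equatable { let id: String; let inboxId: String }

final class ConversationListModel {
    private(set) var unfilteredConversations: [Conversation] = []
    private(set) var conversations: [Conversation] = []   // visible list
    var staleInboxIds: Set<String> = []
    var hiddenConversationIds: Set<String> = []

    // Filter the cached source, never the already-filtered visible list.
    func recomputeVisibleConversations() {
        conversations = unfilteredConversations.filter {
            !staleInboxIds.contains($0.inboxId)
                && !hiddenConversationIds.contains($0.id)
        }
    }

    // Called from the conversations sink: cache, then derive.
    func receive(_ all: [Conversation]) {
        unfilteredConversations = all
        recomputeVisibleConversations()
    }
}
```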

* StreamProcessor: handle CancellationError explicitly in processMessage

Macroscope flagged that 'throw CancellationError()' inside the inner
do/catch for conversationWriter.store was immediately caught by the
outer generic 'catch {}' and logged as an error rather than
propagated — defeating cooperative cancellation.

Since processMessage is declared 'async' (not 'async throws') and
changing the signature would ripple across the StreamProcessor
protocol and six call sites, handle it locally: add a dedicated
'catch is CancellationError' arm to both do/catch blocks that logs
at debug level and returns early.

The enclosing runMessageStream loop calls Task.checkCancellation()
at the top of every iteration, so cancellation still exits the
stream task — just one message later than ideal, which is
acceptable for cooperative cancellation semantics.
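
The dedicated catch arm looks roughly like this; the real processMessage is async and returns Void, but this stand-in is synchronous and returns an outcome only to keep the sketch self-contained and make the arms visible.

```swift
enum ProcessOutcome: Equatable { case stored, cancelled, failed }

func processMessage(_ store: () throws -> Void) -> ProcessOutcome {
    do {
        try store()
        return .stored
    } catch is CancellationError {
        // Cooperative cancellation: return early instead of logging an
        // error; the enclosing stream loop's Task.checkCancellation()
        // exits the task on its next iteration.
        return .cancelled
    } catch {
        // Genuine failures still hit the generic arm (logged in the
        // real code).
        return .failed
    }
}
```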

* ConversationsViewModel: seed unfilteredConversations and fix explode ordering

Two follow-up bugs from the unfilteredConversations refactor that
Macroscope caught:

1. (Medium) init populated self.conversations from fetchAll() but
   left unfilteredConversations empty. When observe() subscribed to
   staleInboxIdsPublisher, GRDB's ValueObservation fired immediately,
   triggering recomputeVisibleConversations against an empty source
   and wiping the initial list. Seed both lists from the fetch.

2. (Low) explodeConversation was mutating self.conversations in
   place with removeAll, bypassing the unfiltered cache, and on
   success the conversation would briefly resurface because
   recomputeVisibleConversations filtered an unfiltered list that
   still contained it.

   Rework to the optimistic-hide pattern: insert into
   hiddenConversationIds and recompute (hides immediately), then on
   success drop from unfilteredConversations BEFORE clearing the
   hide marker (so the recompute doesn't resurface it), or on
   failure just clear the hide marker to bring it back.
lourou added a commit that referenced this pull request Apr 13, 2026
* Add backup bundle creation with AES-GCM encryption

Implements the backup orchestrator (PR 3 from icloud-backup plan):

BackupManager: actor that coordinates the full backup flow
  - creates vault XMTP archive via BackupArchiveProvider protocol
  - creates per-conversation XMTP archives (failure per-conversation
    is non-fatal, logged as warning)
  - copies GRDB database via GRDB's backup API (consistent snapshot)
  - writes metadata.json (version, device, timestamp, inbox count)
  - packages everything into a tar, encrypts with AES-256-GCM using
    the vault key's databaseKey
  - writes to iCloud Drive (Documents/backups/<device-id>/) with
    local fallback when iCloud container is unavailable

BackupBundle: staging directory layout, tar/untar, pack/unpack
BackupBundleCrypto: AES-256-GCM encrypt/decrypt with 32-byte key
BackupBundleMetadata: versioned JSON metadata (v1)
BackupArchiveProvider: protocol for vault + conversation archiving,
  decoupled from VaultManager/XMTP for testability

16 tests across 4 suites:
  - metadata JSON round-trip
  - AES-GCM encrypt/decrypt, wrong key, invalid key, empty/large data
  - tar/untar round-trip, encrypted pack/unpack, wrong key rejection
  - full backup with mock archives, partial failure resilience,
    bundle decryptability, GRDB database integrity, empty inbox list
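
The metadata round-trip can be sketched as below; the field names are assumptions modeled on the commit message (version, device, timestamp, inbox count), not the real BackupBundleMetadata definition.

```swift
import Foundation

struct BackupMetadata: Codable, Equatable {
    let version: Int
    let deviceName: String
    let timestamp: Date
    let inboxCount: Int
}

// ISO 8601 dates keep the sidecar JSON stable and human-readable.
func encodeMetadata(_ metadata: BackupMetadata) throws -> Data {
    let encoder = JSONEncoder()
    encoder.dateEncodingStrategy = .iso8601
    return try encoder.encode(metadata)
}

func decodeMetadata(_ data: Data) throws -> BackupMetadata {
    let decoder = JSONDecoder()
    decoder.dateDecodingStrategy = .iso8601
    return try decoder.decode(BackupMetadata.self, from: data)
}
```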

* Harden archive format against path traversal and symlink escape

Code review fixes:
- untar: validate resolved path stays within destination directory,
  reject entries with .. traversal
- tar: validate resolved file stays within source directory, reject
  symlinks pointing outside
- crypto: package access control, preserve underlying error context
  instead of losing it via localizedDescription
- backup manager: log individual conversation errors with inboxId
  instead of just the count
- add test for path traversal rejection in untar

* Add concrete BackupArchiveProvider and wire XMTP archive APIs

The backup bundle was using a protocol with only a mock
implementation. This adds the real wiring:

- Add createArchive/importArchive to XMTPClientProvider protocol
  with explicit implementations in the XMTPiOS.Client extension
  (SDK method has an extra opts parameter with default value)
- Add ConvosBackupArchiveProvider: concrete implementation that
  delegates vault archiving to VaultServiceProtocol and builds
  temporary XMTP clients via Client.build() (local-only, no
  network) for per-conversation archiving
- Update MockXMTPClientProvider and SyncingManagerTests mock
  with the new protocol methods

* Write metadata.json alongside encrypted bundle for restore detection

PR 4's restore flow needs to inspect backup metadata (timestamp,
device name, inbox count) without decrypting the bundle first.
Write an unencrypted metadata.json next to backup-latest.encrypted
in the backup directory. The same metadata is also inside the
encrypted bundle for integrity.

Matches the plan's iCloud Drive layout:
  backups/<device-id>/metadata.json
  backups/<device-id>/backup-latest.encrypted

Add BackupBundleMetadata.exists(in:) helper for restore detection.
Test verifies sidecar metadata is written with correct values.

* Add iCloud container entitlements and dynamic container identifier

Register iCloud Documents capability per environment:
- Local:  iCloud.org.convos.ios-local
- Dev:    iCloud.org.convos.ios-preview
- Prod:   iCloud.org.convos.ios

Add ICLOUD_CONTAINER_IDENTIFIER to all xcconfig files, derived
from MAIN_BUNDLE_ID (same pattern as APP_GROUP_IDENTIFIER).

Add icloud-container-identifiers and ubiquity-container-identifiers
to Convos.entitlements using the build variable.

Add AppEnvironment.iCloudContainerIdentifier computed property
(derived from appGroupIdentifier) and pass it explicitly to
FileManager.url(forUbiquityContainerIdentifier:) instead of nil.

* Fix integer overflow crash and double error wrapping

- untar: validate UInt64 file length fits in Int before conversion,
  preventing a trap on malicious data with length >= 2^63
- crypto: move guard outside do-catch so the encryptionFailed case
  from the guard is not re-wrapped by the catch block
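
The overflow guard amounts to using Int(exactly:), which returns nil for UInt64 values that don't fit in Int, where a plain Int(_:) conversion would trap on length >= 2^63. The function name is illustrative.

```swift
// Convert a tar header length field without trapping on hostile input.
func entryLength(fromTarField raw: UInt64) -> Int? {
    guard let length = Int(exactly: raw) else { return nil }
    return length
}
```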

* Add backup debug panel for manual backup creation

New debug view next to Vault Key Sync in the debug settings:
- shows iCloud container availability, last backup metadata
  (timestamp, device name, inbox count, bundle version)
- 'Create backup now' button triggers a full backup
- wires BackupManager with ConvosBackupArchiveProvider,
  VaultKeyStore (iCloud dual-store), and identity store
- reads sidecar metadata.json for status without decrypting

* Fix debug view actor isolation and metadata access control

- move makeBackupManager() call outside the @Sendable closure
  to avoid main-actor isolation violation
- make BackupBundleMetadata.read(from:) public so the main app
  target can read sidecar metadata

* Remove deprecated shouldUseExtendedBackgroundIdleMode

Deprecated in iOS 18.4 and marked as not supported. The background
URLSession configuration already handles extended background
behavior via sessionSendsLaunchEvents and isDiscretionary.

* Fix debug view missing local fallback for backup directory

The debug view only checked iCloud and returned nil when
unavailable, missing backups saved locally. Now mirrors
BackupManager's fallback: iCloud first, then local directory.
Made defaultDatabasesDirectoryURL public for main app access.

* Separate iCloud availability from backup directory resolution

iCloud container status is now checked independently so the debug
view correctly shows 'Unavailable' even when a local backup exists.

* Add restore flow: decrypt bundle, import vault, replace DB (#618)

* Add restore flow: decrypt bundle, import vault, replace DB, import archives

RestoreManager actor orchestrates the full restore:
  1. Load vault key from VaultKeyStore (iCloud fallback)
  2. Decrypt and unpack backup bundle
  3. Import vault archive via VaultServiceProtocol → [VaultKeyEntry]
  4. Save each key entry to local keychain
  5. Replace GRDB database (close pool → swap file → reopen + migrate)
  6. Import per-conversation XMTP archives (failure per-inbox is non-fatal)

RestoreState enum tracks progress through each phase for UI binding.

Restore detection:
  - findAvailableBackup() scans iCloud Drive + local fallback for
    sidecar metadata.json, returns newest backup across all devices
  - Combined with ICloudIdentityStore.hasICloudOnlyKeys() for full
    restore detection (PR 6 UI)

DatabaseManager changes:
  - dbPool changed from let to private(set) var
  - replaceDatabase(with:) closes pool, removes DB + WAL/SHM files,
    copies backup, reopens with migrations
  - DatabaseManagerProtocol gains replaceDatabase requirement
  - MockDatabaseManager implements via erase + backup-to for tests

ConvosRestoreArchiveImporter: concrete implementation builds
temporary XMTP client via Client.build() per inbox and imports

7 tests: full flow, DB replacement, partial failure, missing vault
archive, restore detection, nil detection, state progression

* Make DatabaseManager.replaceDatabase safe with rollback

Validate backup opens before closing the active pool. Move the
current DB aside as a safety net. If copy or reopen fails, restore
the original database so the app is never left with no usable DB.

* Validate backup before erasing MockDatabaseManager

Stage backup into a temp queue first. Only erase the active
in-memory DB after confirming the backup is valid and copyable.

* Fix findAvailableBackup to check local when iCloud is empty

iCloud container available but with no backup was returning nil
instead of falling through to the local directory. Chain the
checks: try iCloud first, fall back to local if not found.
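
The chained lookup can be sketched as follows; the metadata closure stands in for reading the sidecar metadata.json, and the key point is that the local fallback runs when iCloud is unavailable OR merely empty.

```swift
import Foundation

func findAvailableBackup(
    iCloudDirectory: URL?,
    localDirectory: URL,
    metadata: (URL) -> Date?   // stands in for reading metadata.json
) -> Date? {
    // Try iCloud first when the container resolved AND holds a backup.
    if let iCloud = iCloudDirectory, let found = metadata(iCloud) {
        return found
    }
    // Fall through here on nil container or an empty iCloud directory.
    return metadata(localDirectory)
}
```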

* Track failed key saves in RestoreState.completed

Add failedKeyCount to the completed state so the UI can warn
the user about conversations that may not be decryptable.
saveKeysToKeychain now returns the failure count instead of
throwing.

* Add restore button to backup debug panel

Shows available backup metadata (timestamp, device, inbox count)
and a destructive 'Restore from backup' button with confirmation
dialog. Restore section only visible when databaseManager is
provided (optional param). Reports failedKeyCount on completion.

Thread databaseManager through DebugExportView → DebugViewSection
→ BackupDebugView as an optional. Expose ConvosClient.databaseManager
as public for debug wiring.

* Wire databaseManager through to backup debug panel for restore

Thread databaseManager from ConvosApp → ConversationsViewModel →
AppSettingsView → DebugExportView → DebugViewSection →
BackupDebugView so the restore button can call
DatabaseManager.replaceDatabase().

All params are optional so previews and mocks still compile
without it.

* Fix false path traversal rejection on iOS devices

The path traversal check compared resolved paths using hasPrefix
with a hardcoded '/' separator, but on iOS devices directory URLs
created with isDirectory:true can have a trailing slash in their
resolved path. This caused 'resolvedDirPath + "/"' to become
'.../uuid//' which never matched, rejecting all files including
legitimate ones like database.sqlite.

Extract a resolvedPath() helper that strips trailing slashes after
resolving symlinks. Use it consistently in both tar and untar.
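
The helper and the containment check can be sketched like this; resolvedPath and isContained are illustrative names mirroring the fix, not the exact source.

```swift
import Foundation

// Resolve symlinks, then strip any trailing slash so directory URLs
// created with isDirectory: true compare cleanly against file paths.
func resolvedPath(of url: URL) -> String {
    var path = url.resolvingSymlinksInPath().path
    while path.count > 1 && path.hasSuffix("/") {
        path.removeLast()
    }
    return path
}

// True when `file` stays inside `directory` after resolution; used the
// same way on both the tar and untar sides.
func isContained(_ file: URL, in directory: URL) -> Bool {
    let dir = resolvedPath(of: directory)
    let target = resolvedPath(of: file)
    return target == dir || target.hasPrefix(dir + "/")
}
```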

* Fix untar path resolution inconsistency on iOS devices

resolvingSymlinksInPath() behaves differently for existing
directories vs non-existent files on iOS: the directory resolves
/private/var → /var but the file keeps /private/var since it
doesn't exist on disk yet. This caused the prefix check to fail.

Build the file path from the already-resolved directory path
instead of resolving it independently. Use standardizedFileURL
(which handles .. without touching the filesystem) instead of
resolvingSymlinksInPath for the file component.

* Fix vault backup restore after local wipe and stabilize DB restore

Restore had two independent failure modes in the backup -> restore flow.

First, deleting all app data called VaultManager.unpairSelf(), which cleaned up local vault state by deleting the vault key through VaultKeyStore.delete(inboxId:). When the underlying store was ICloudIdentityStore, that removed both the local and the iCloud keychain copies. Because any existing backup bundle had been encrypted with the old vault database key, the next app launch generated a fresh vault identity, and restore then failed with AES-GCM authenticationFailure: the app was trying to decrypt the bundle with a key it no longer had.

Second, RestoreManager replaced the GRDB database while the app still held long-lived readers, writers, inbox services, vault activity, and background restore-adjacent tasks. DatabaseManager.replaceDatabase(with:) also recreated the DatabasePool instance entirely. That meant restore was vulnerable to crashing or destabilizing the app because live services and captured dbReader/dbWriter references could become stale during replacement.

This change addresses both problems.

Vault key handling
- add ICloudIdentityStore.deleteLocalOnly(inboxId:) so restore-related local cleanup can preserve the iCloud backup key
- add VaultKeyStore.deleteLocal(inboxId:) as the vault-facing API for that behavior
- keep iCloud key preservation scoped only to the local delete-all-data/unpair path
- keep remote self-removal strict by continuing to delete both local and iCloud copies when this device is removed from the vault by another device

Restore lifecycle coordination
- add RestoreLifecycleControlling to let RestoreManager explicitly quiesce and resume runtime state around destructive restore work
- wire BackupDebugView to pass the current session as the restore lifecycle controller
- make SessionManager conform by pausing the import drainer, sleeping inbox checker, inbox lifecycle, and vault activity before database replacement, then resuming restore-safe pieces and notifying GRDB observers afterward

Database replacement safety
- change DatabaseManager.replaceDatabase(with:) to preserve the existing DatabasePool instead of closing it and constructing a new pool
- copy the current database into an in-memory rollback queue before replacement
- restore the backup contents into the existing pool, then rerun migrations
- on failure, restore the rollback contents back into the same pool and rerun migrations
- add an explicit missing-file guard before attempting replacement

Tests
- add VaultKeyStore coverage proving deleteLocal preserves the iCloud vault copy and full delete still removes both copies
- add RestoreManager coverage proving restore lifecycle hooks are called around restore
- add DatabaseManager restore coverage proving captured readers remain valid after database replacement

Validation
- swift test --package-path ConvosCore --filter '(VaultKeyStoreTests|RestoreManagerTests|DatabaseManagerRestoreTests)'
- xcodebuild build -project Convos.xcodeproj -scheme 'Convos (Dev)' -destination 'id=00008110-001C51283AC2801E' -derivedDataPath .derivedData

Result
- backup decryption can continue to use the original vault key after delete-all-data / app relaunch
- destructive restore now pauses active app state before swapping database contents
- existing GRDB handles remain usable after restore instead of silently pointing at a retired pool

* Tighten restore barriers and exclude unused inboxes from backup counts

This follow-up hardens the restore path and fixes backup metadata/archive counts that were inflated by the pre-created unused conversation inbox.

Restore hardening
- wrap destructive database replacement work in DatabasePool.barrierWriteWithoutTransaction so the restore waits for in-flight GRDB accesses on the live pool to drain before copying rollback data and before copying restored data into place
- keep the existing in-place DatabasePool replacement strategy, but make the source->destination copy and rollback copy happen behind an explicit GRDB barrier
- keep migrations after replacement so schema is still reopened at the current app version
- keep rollback-on-failure behavior intact

Session restore follow-up
- when prepareForRestore runs, clear the cancelled unused inbox preparation task reference
- when finishRestore runs, schedule unused inbox preparation again so the post-restore app returns to the same steady-state behavior as a normal session

Backup count / archive fix
- add InboxesRepository.nonVaultUsedInboxes()
- this returns only non-vault inboxes that have at least one non-unused conversation row
- BackupManager now uses that query when deciding which per-conversation archives to create
- RestoreManager now uses the same query for its completed inbox count
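The shape of that query can be sketched as below. The table and column names (`inbox`, `conversation`, `isUnused`, a `'vault'` type discriminator) are guesses at the real schema; the point is the join condition — non-vault inboxes with at least one non-unused conversation row.

```swift
import GRDB

// Sketch of nonVaultUsedInboxes(); schema names are assumptions.
func nonVaultUsedInboxes(_ db: Database) throws -> [String] {
    try String.fetchAll(db, sql: """
        SELECT DISTINCT i.inboxId
        FROM inbox i
        JOIN conversation c ON c.inboxId = i.inboxId
        WHERE i.type != 'vault'
          AND c.isUnused = 0
        """)
}
```

Because both BackupManager and RestoreManager call the same query, the metadata `inboxCount` and the restore completed count stay in agreement by construction.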

This keeps the vault backup behavior unchanged and only excludes the synthetic unused conversation inbox from the non-vault conversation archive path.

Important: the vault is still always backed up.
- BackupManager still creates the vault archive first
- the overall backup bundle is still encrypted with the vault databaseKey
- restore still depends on the vault archive plus the vault key to recover conversation keys

What changes from the user-visible perspective
- backup metadata inboxCount now matches real visible/restorable conversations instead of counting the hidden unused placeholder inbox
- restore completed counts now match that same definition
- the extra placeholder inbox is no longer archived as if it were a real conversation
- the vault remains part of every backup exactly as before

Tests added/updated
- backup tests now seed real conversation rows for active inboxes
- add coverage that unused conversation inboxes are excluded from backup metadata and archive generation
- add coverage that restore completed counts exclude unused conversation inboxes
- existing restore/database tests continue to cover lifecycle hooks and stable DB handle behavior

Validation
- swift test --package-path ConvosCore --filter '(BackupManagerTests|RestoreManagerTests|VaultKeyStoreTests|DatabaseManagerRestoreTests)'
- xcodebuild build -project Convos.xcodeproj -scheme 'Convos (Dev)' -destination 'id=00008110-001C51283AC2801E' -derivedDataPath .derivedData

* Fix vault archive import crash by using temporary XMTP client

The live VaultClient's DB connection was in an inconsistent state
after pause() (dropLocalDatabaseConnection), causing importArchive
to crash in the XMTP FFI layer.

Replace the live vaultService.importArchive call with a temporary
XMTP client built from the vault key. This matches the real restore
scenario (fresh device, no running vault) and avoids conflicts with
the live vault's connection state.

- Extract VaultArchiveImporter protocol for testability
- ConvosVaultArchiveImporter: builds temp client, imports archive,
  extracts keys from vault messages, tears down
- Remove vaultService dependency from RestoreManager entirely
- Revert barrierWriteWithoutTransaction (deadlock) back to plain
  backup(to:) for DB replacement
- Update tests with MockVaultArchiveImporter

* Fall back to Client.create when Client.build fails during restore

On a fresh device (real restore), no local XMTP database exists,
so Client.build fails. Fall back to Client.create with the signing
key, same pattern as InboxStateMachine.handleAuthorize.
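A sketch of the fallback, with the codec registration folded in. The `Client.build`/`Client.create` pair is real XMTP API, but the parameter labels, `ClientOptions` shape, and the codec list here are assumptions standing in for the real ConvosCore code:

```swift
import XMTPiOS

// Sketch only: build first (local DB exists), fall back to create
// (fresh device — real restore scenario).
func makeImportClient(identity: PublicIdentity,
                      signingKey: SigningKey,
                      dbKey: Data) async throws -> Client {
    // Register codecs so vault messages (key bundles/shares) can be
    // decoded during key extraction; the real code adds vault codecs.
    let options = ClientOptions(api: .init(env: .production, isSecure: true),
                                codecs: [TextCodec()],
                                dbEncryptionKey: dbKey)
    do {
        return try await Client.build(publicIdentity: identity, options: options)
    } catch {
        // No local XMTP DB — register a fresh installation instead
        return try await Client.create(account: signingKey, options: options)
    }
}
```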

Also add vault-specific codecs to the temporary client options so
vault messages can be decoded for key extraction.

Applied to both ConvosVaultArchiveImporter and
ConvosRestoreArchiveImporter.

* Extract keys from restored GRDB instead of live vault import

The XMTP SDK's importArchive crashes when called on a client
whose local database already has state (same-device scenario) or
when called on a just-created client. The vault messages are
already in the backup's GRDB database, so we don't need the XMTP
archive import to extract keys.

New restore order:
1. Replace GRDB database with backup copy
2. Extract keys from vault messages in the restored DB
3. Save keys to keychain
4. Import vault XMTP archive (best-effort, non-fatal)
5. Import conversation XMTP archives

The vault XMTP archive import is now best-effort — if it fails
(e.g. on same-device test), the restore still succeeds because
key extraction comes from GRDB, not from the XMTP archive.

* Skip vault XMTP archive import to avoid FFI crash

The XMTP SDK's importArchive causes a Rust panic (uncatchable from
Swift) when importing into a client that already has state. Since
key extraction now reads from the restored GRDB database, the vault
XMTP archive import is unnecessary. The vault will rebuild its XMTP
state when it reconnects on next launch.

* Extract keys from live vault before teardown, not after

Vault messages (DeviceKeyBundle/DeviceKeyShare) live in the XMTP
database, not GRDB. Restoring the GRDB database doesn't restore
vault messages. A fresh temporary XMTP client can't access
historical vault group messages either (different installation).

Fix: extract keys from the LIVE connected vault client while it's
still running, BEFORE the lifecycle teardown. The vault already
has all the messages in its local XMTP database.

New restore order:
1. Decrypt bundle
2. Extract keys from LIVE vault (VaultManager.extractKeys)
3. Stop sessions (prepareForRestore)
4. Replace GRDB database
5. Save extracted keys to keychain
6. Import conversation XMTP archives
7. Resume sessions (finishRestore)

* Use XMTP archive import as designed — no DB swaps or live extraction

The XMTP SDK's importArchive is additive and designed for
backup/restore across installations. Use it correctly:

Restore flow:
1. Decrypt bundle
2. Stop all sessions
3. Import vault archive: Client.create(signingKey) then
   importArchive(path, key) — creates a fresh XMTP client and
   imports vault message history including all key bundles/shares
4. Extract keys from imported vault messages
5. Save keys to keychain
6. Replace GRDB database
7. Import conversation archives: Client.create(signingKey) then
   importArchive(path, key) per conversation
8. Resume sessions

Key fixes:
- Vault importer uses NO dbDirectory (matches VaultManager's
  default = app Documents), was wrongly using app group container
- Conversation importer uses defaultDatabasesDirectory (matches
  InboxStateMachine's app group container)
- Both use Client.create (not build) since we want fresh
  installations — importArchive is additive
- Removed vaultService dependency from RestoreManager entirely
- After import, conversations start inactive/read-only per MLS
  and reactivate automatically as other members come online

* Use isolated temp directory for vault archive import

importArchive crashes (Rust panic) when the vault's XMTP database
already exists on disk with the same data. Create a fresh temporary
directory for the import client so there's no conflict with the
live vault's database files.

This works for both same-device testing (existing DB) and real
restore (fresh device, no DB). The temp directory is cleaned up
after key extraction.

* Disable device sync workers for archive import clients

The device sync worker was running concurrently with importArchive,
causing a race condition in the Rust FFI layer that crashed the app.
Archive import clients are temporary — they only need to import
data and extract keys, not participate in device sync.

* Add import success log for vault archive

* Skip vault archive import when conversation keys already in keychain

Vault importArchive crashes when called on a device that already has
a live installation for the same inboxId. More importantly, it's
unnecessary: if conversation keys are already in the local keychain,
the vault archive contains nothing we don't have.

Before attempting vault archive import, check which conversation
inboxIds from the bundle are missing from the keychain:

  Same device (all keys present):
    → skip vault archive import entirely
    → restore GRDB + import conversation archives only

  Fresh device or empty vault (keys missing):
    → import vault archive to recover missing keys
    → save extracted keys to keychain
    → restore GRDB + import conversation archives

This correctly handles all scenarios including a device where the
vault bootstrapped from iCloud Keychain but its XMTP database is
empty (no DeviceKeyBundle/Share messages yet).

* Skip conversation archive import when XMTP DB already exists

importArchive is additive — it restores message history into a
fresh installation. If the XMTP DB already exists (same device),
Client.build() succeeds and there is nothing to import.

Same-device: build() succeeds → skip importArchive, history is there
Fresh device: build() fails → create() + importArchive()

This mirrors the vault import logic and avoids the FFI crash from
importing into an existing installation.
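The branch reads roughly as below. The `importArchive(path:encryptionKey:)` signature and the parameter labels are assumptions based on the commit text; the key difference from the earlier fallback is that a successful `build()` now *skips* the import entirely instead of proceeding with it.

```swift
import Foundation
import XMTPiOS

// Sketch: same-device → history already present, nothing to import;
// fresh device → create + importArchive.
func restoreConversationHistory(identity: PublicIdentity,
                                signingKey: SigningKey,
                                options: ClientOptions,
                                archivePath: String,
                                encryptionKey: Data) async throws {
    do {
        // Same device: local XMTP DB exists, build succeeds
        _ = try await Client.build(publicIdentity: identity, options: options)
    } catch {
        // Fresh device: register a new installation, then import history
        let client = try await Client.create(account: signingKey, options: options)
        try await client.importArchive(path: archivePath, encryptionKey: encryptionKey)
    }
}
```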

* Skip vault importArchive when vault XMTP DB already exists

Same pattern as conversation archives: try Client.build() first.
If vault DB exists on disk, extract keys directly from the
existing vault database instead of importing the archive.
Only import the archive on a fresh device where build() fails.

This fixes the intermittent crash that occurred when the GRDB
restore caused some conversation keys to appear missing from
the keychain, triggering vault importArchive on same-device.

* Wipe local XMTP state before restore for clean deterministic behavior

Restore is destructive: the device should look exactly like the
backup, nothing more, nothing less. Partial state from the current
device (deleted conversations, orphaned DBs, mismatched keys)
was causing intermittent crashes and inconsistent behavior.

Before importing vault archive:
- identityStore.deleteAll() removes all conversation keychain keys
- Delete all xmtp-* files from AppGroup + Documents directories

After wipe, Client.build() fails for all inboxes (no local DB),
so both vault and conversation importers always take the
create() + importArchive() path — consistent regardless of
device state.

The vault key (vaultKeyStore) is preserved since it lives in a
separate keychain service and is needed to decrypt the bundle.

* Extract vault keys before wipe to avoid broken state on failure

If vault archive import fails after the wipe, the device is left
with no keys and no restored data — unrecoverable. Extract keys
first while the vault is still intact, then wipe once we have
something to restore.

New order:
1. Stop sessions
2. Import vault archive + extract keys (vault DB still present)
3. Wipe XMTP DBs + conversation keychain identities
4. Save extracted keys to keychain
5. Replace GRDB
6. Import conversation archives (fresh DBs)
7. Resume sessions
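The seven steps above can be sketched as one sequence. Every dependency name here is a hypothetical stand-in for the real ConvosCore managers; what matters is the ordering — keys are extracted *before* the wipe, so a failed import never strands the device with no keys and no data.

```swift
import Foundation

// Sketch of the restore ordering; all closures are hypothetical
// stand-ins for RestoreManager's real dependencies.
func restoreFromBackup(
    stopSessions: () async throws -> Void,
    importVaultAndExtractKeys: () async throws -> [Data],
    wipeXMTPState: () throws -> Void,
    saveKeyToKeychain: (Data) throws -> Void,
    replaceGRDB: () async throws -> Void,
    importConversationArchives: () async throws -> Void,
    resumeSessions: () async throws -> Void
) async throws {
    try await stopSessions()                          // 1
    let keys = try await importVaultAndExtractKeys()  // 2 — vault DB still intact
    try wipeXMTPState()                               // 3 — safe: keys already in hand
    try keys.forEach(saveKeyToKeychain)               // 4
    try await replaceGRDB()                           // 5
    try await importConversationArchives()            // 6 — fresh DBs
    try await resumeSessions()                        // 7
}
```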

* Don't wipe vault XMTP DB during restore + add revoke debug button

Don't wipe vault XMTP DB (Documents/):
  Only conversation DBs (AppGroup/) need wiping. The vault DB is
  preserved since it was just used to extract keys, and wiping it
  forces Client.create() which registers a new installation —
  hitting the 10-installation limit during repeated testing.

Add revoke button to debug panel:
  'Revoke all other vault installations' clears accumulated test
  installations from the XMTP network, freeing up slots.
  VaultManager.revokeAllOtherInstallations() delegates to the
  live vault XMTP client.

* Use static XMTP revocation that works without a live client

Client.revokeInstallations is a static method that doesn't need
a live client — just the signing key, inboxId, and API config.
This works even when Client.create() fails due to the 10-install
limit, which is exactly when you need to revoke.

XMTPInstallationRevoker: static utility that fetches all
installations via Client.inboxStatesForInboxIds, filters out
the current one, and revokes the rest via the static method.
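The utility's shape might look like the sketch below. Both static calls exist on the XMTP client, but the exact parameter labels and the `Installation` field names here are assumptions; the essential property is that neither call needs `Client.create()` to have succeeded.

```swift
import XMTPiOS

// Sketch of XMTPInstallationRevoker; parameter labels are assumptions.
enum XMTPInstallationRevoker {
    static func revokeOthers(inboxId: String,
                             currentInstallationId: String,
                             signingKey: SigningKey,
                             api: ClientOptions.Api) async throws {
        // Fetch all installations for the inbox straight from the network
        let states = try await Client.inboxStatesForInboxIds(
            inboxIds: [inboxId], api: api)
        let others = states
            .flatMap { $0.installations }
            .filter { $0.id != currentInstallationId }
        guard !others.isEmpty else { return }
        // Revoke everything except the current installation
        try await Client.revokeInstallations(
            api: api, signingKey: signingKey,
            inboxId: inboxId, installationIds: others.map(\.id))
    }
}
```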

* Broadcast all conversation keys to vault before backup

On a single device, vault key-sharing messages (DeviceKeyBundle)
only exist if multi-device pairing has happened. Without them,
the vault archive is empty of keys and restore can't recover
conversation identities.

Fix: call VaultManager.shareAllKeys() before creating the vault
archive. This broadcasts all conversation keys as a DeviceKeyBundle
message to the vault group, ensuring the vault archive always
contains every key needed for restore — regardless of whether
multi-device pairing has occurred.

This is the intended design: the vault is the single source of
truth for conversation keys, and backups should capture that
truth completely.

* Make vault key broadcast and archive creation non-fatal during backup

After 'delete all data', the vault may not be connected yet.
The backup should still succeed — it just won't include the
vault archive or keys. The conversation archives and GRDB
snapshot are still created.

On the next backup after the vault reconnects, the keys will
be broadcast and the vault archive will be included.

* Add plan for inactive conversation mode post-restore

* Resolve design decisions for inactive conversation mode

* Inactive conversation mode, iCloud vault key sync, and restore reliability fixes (#626)

* Update copy: 'comes online' -> 'sends a message'

* Add inactive conversation mode (Phases 1 & 2)

Phase 1 — data model:
- Add isActive column to ConversationLocalState (default true)
- GRDB migration addIsActiveToConversationLocalState
- Surface isActive: Bool on Conversation model via local state join
- Add setActive(_:for:) and markAllConversationsInactive() to
  ConversationLocalStateWriter + protocol
- RestoreManager calls markAllConversationsInactive() after
  importing conversation archives

Phase 2 — reactivation detection:
- After each syncAllConversations call site in SyncingManager
  (initial start, resume, discovery), query for conversations
  where isActive == false
- For each, call conversation.isActive() on the XMTP SDK
  (local MLS state check, no network)
- If true, update GRDB via setActive(true, for:)
- GRDB ValueObservation propagates to UI reactively

Phases 3 & 4 (UI) are blocked on design sign-off.
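The Phase 1 migration can be sketched with GRDB's standard migrator API. The table name is a guess from the commit text; `.defaults(to: true)` gives existing rows the active state so nothing flips to inactive outside of an actual restore.

```swift
import GRDB

// Sketch of addIsActiveToConversationLocalState; table name assumed.
var migrator = DatabaseMigrator()
migrator.registerMigration("addIsActiveToConversationLocalState") { db in
    try db.alter(table: "conversation_local_state") { t in
        // Existing conversations stay active; only RestoreManager
        // flips rows to false after importing archives.
        t.add(column: "isActive", .boolean).notNull().defaults(to: true)
    }
}
```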

* Remove local settings from tracked files

* Stop tracking .claude/settings.local.json (already gitignored)

* Fix infinite recursion in createArchive protocol bridge

The protocol method createArchive(path:encryptionKey:) was calling
itself instead of the SDK's createArchive(path:encryptionKey:opts:)
because the compiler resolved the overload back to the protocol.
Explicitly cast to XMTPiOS.Client, same pattern as importArchive.

* Fix infinite recursion in importArchive protocol bridge

The cast to XMTPiOS.Client didn't help because the SDK method has
the exact same signature as the protocol method — Swift resolved
it back to the protocol bridge. Call through ffiClient directly
to bypass the protocol dispatch entirely.

* Remove archive protocol bridge methods causing infinite recursion

XMTPiOS.Client satisfies both createArchive and importArchive
protocol requirements directly via its SDK implementations.
No explicit bridge needed, no recursion risk.

* Remove createArchive/importArchive from XMTPClientProvider protocol

These methods don't belong on the messaging protocol. All callers
in the backup/restore layer use XMTPiOS.Client directly, not the
protocol abstraction. ConvosBackupArchiveProvider.buildClient now
returns Client instead of any XMTPClientProvider.

No wrapper, no recursion, no redundant protocol surface.

* Add diagnostic logging for inactive conversation reactivation

- Remove try? in RestoreManager markAllConversationsInactive to
  surface any errors
- Add per-conversation debug logging in reactivation check to
  trace isActive() return values
- Helps diagnose whether importArchive causes isActive() to
  return true immediately (defeating the inactive flag)

* Reactivate conversations on inbound message, change join copy

1. Reactivation: move from isActive() polling in SyncingManager to
   message-driven reactivation in StreamProcessor. When an inbound
   message arrives for a conversation with isActive=false, flip it
   to active immediately. This avoids the issue where importArchive
   causes isActive() to return true on stale MLS state.

2. Copy: change 'You joined as <name>' to 'Reconnected' for the
   system message shown when the current user's installation is
   re-added to a group.

* Revert 'Reconnected' copy change

The summary in ConversationUpdate has no restore context — changing
it here would affect all self-join scenarios, not just post-restore.
The copy change belongs in Phase 4 (UI layer) where isActive is
available to distinguish reconnection from first join.

* Show 'Reconnected' for post-restore self-join messages

Thread isReconnection flag through the message pipeline:
- DBMessage.Update: add isReconnection (default false, backward
  compatible with existing JSON)
- StreamProcessor: when an inbound message arrives for an inactive
  conversation, set isReconnection=true on the stored GroupUpdated
  message before reactivating the conversation
- ConversationUpdate: surface isReconnection from DB hydration
- summary: return 'Reconnected' instead of 'You joined as X'
  when isReconnection is true

Normal first-time joins are unaffected.

* Fix isReconnection decoding for existing messages

Add custom Decodable init to DBMessage.Update that defaults
isReconnection to false when the key is missing. Existing
GroupUpdated messages in the DB were stored before this field
existed, causing keyNotFound decoding errors and blank message
history.
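The backward-compatible decode is the standard `decodeIfPresent` pattern. The sibling field below is illustrative, not the real `DBMessage.Update` payload:

```swift
import Foundation

// Sketch of the custom init; `initiatedByInboxId` is an illustrative
// stand-in for the Update payload's other fields.
struct Update: Codable {
    var initiatedByInboxId: String
    var isReconnection: Bool

    init(from decoder: Decoder) throws {
        let c = try decoder.container(keyedBy: CodingKeys.self)
        initiatedByInboxId = try c.decode(String.self, forKey: .initiatedByInboxId)
        // Rows stored before this field existed have no key — default
        // to false instead of throwing keyNotFound.
        isReconnection = try c.decodeIfPresent(Bool.self, forKey: .isReconnection) ?? false
    }
}
```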

* Document quickname backup as deferred for v1

* Implement inactive conversation UI (Phases 3 & 4)

Phase 3 — conversations list:
- Show 'Awaiting' label with dot separator in subtitle when
  conversation.isActive is false (same pattern as 'Verifying')
- Suppress unread/muted indicators for inactive conversations

Phase 4 — conversation detail:
- InactiveConversationBanner: pill overlay above composer with
  clock.badge.checkmark.fill icon, 'History restored' title,
  subtext explaining the state, tappable to learn.convos.org
- Intercept send, reaction, and reply actions when inactive —
  show 'Awaiting reconnection' alert instead of executing
- Disable top bar trailing item and text field when inactive
- All UI clears reactively when isActive flips to true via
  GRDB ValueObservation (no restart needed)

* Fix InactiveConversationBanner layout to match Figma design

* Update inactive banner: cloud.fill icon, 'Restored from backup'

- Change icon from clock.badge.checkmark.fill to cloud.fill
- Change title from 'History restored' to 'Restored from backup'
- Move banner from overlay into bottomBarContent so it participates
  in bottom bar height measurement — messages get proper inset and
  don't render behind the banner

* Use colorLava for cloud icon in inactive banner

* Fix reactivation for messages arriving via sync path

Messages can arrive through two paths:
1. Message stream → processMessage → markReconnectionIfNeeded ✓
2. Sync/discovery → processConversation → storeWithLatestMessages

Only path 1 had reactivation. Add reactivateIfNeeded after
storeWithLatestMessages in processConversation so the first
message (typically the GroupUpdated 'You joined') properly
clears the inactive state.

* Mark reconnection messages in sync path + re-adopt profiles after restore

1. reactivateIfNeeded now marks recent GroupUpdated messages as
   isReconnection=true before flipping isActive, so the 'You joined'
   message shows 'Reconnected' even when arriving via sync path.

2. ConversationWriter: when conversation is inactive (post-restore),
   force-apply member profiles from XMTP group metadata to GRDB,
   bypassing the 'existing profile has data' skip. This re-adopts
   display names and avatars from the group's custom metadata after
   restore, preventing 'Somebody' fallback.

* Show spinner only on active action during restore

Track which action is running via activeAction string. Spinner
only shows next to the active button. Hide the Actions section
entirely during restore so only the restore row with its spinner
is visible.

* Fix first message after restore not appearing in history

After restore, conversationWriter.store() calls conversation.sync()
which can fail on stale MLS state (new installation not yet re-added
to the group). This threw an error that prevented messageWriter.store
from ever running, so the first inbound message was lost.

Fix: if conversationWriter.store() fails, fall back to the existing
DBConversation from GRDB (which exists from the backup) and continue
storing the message. The conversation will be properly synced on the
next successful sync cycle.

Also refactor reactivateIfNeeded to extract markRecentUpdatesAsReconnection
for clarity.

* Try all vault keys on restore + detailed debug view

RestoreManager: try decrypting the backup bundle with every
available vault key (local + iCloud) instead of just the first.
This handles the case where device B created its own vault key
before iCloud sync propagated device A's key.

VaultKeySyncDebugView: add detailed sections showing:
- Each vault key with inboxId, clientId, and which stores
  (local/iCloud) it exists in, with green checkmark for active
- iCloud backup files with device name, date, size, inbox count

Makes it easy to diagnose vault key divergence between devices.

* Enable iCloud Keychain sync for vault keys

The iCloud vault keychain store was using
kSecAttrAccessibleAfterFirstUnlock but never set
kSecAttrSynchronizable — so items stayed local on each device
even under the same Apple ID.

Add synchronizable flag to KeychainIdentityStore and set it to
true for the iCloud vault store. This ensures vault keys
propagate between devices via iCloud Keychain, so device B can
decrypt backups created by device A.

Updated all iCloud vault store creation sites:
- ConvosClient+App.swift (production)
- BackupDebugView.swift
- VaultKeySyncDebugView.swift
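The save path boils down to one extra attribute. This is a minimal generic-password sketch, not the real KeychainIdentityStore; the critical detail is that *queries* must also set `kSecAttrSynchronizable`, or lookups silently miss the synced item.

```swift
import Foundation
import Security

// Sketch: saving a synchronizable keychain item (iCloud Keychain sync).
func saveSynchronizable(service: String, account: String, data: Data) -> OSStatus {
    let query: [String: Any] = [
        kSecClass as String: kSecClassGenericPassword,
        kSecAttrService as String: service,
        kSecAttrAccount as String: account,
        kSecValueData as String: data,
        kSecAttrAccessible as String: kSecAttrAccessibleAfterFirstUnlock,
        kSecAttrSynchronizable as String: true,  // propagate via iCloud Keychain
    ]
    return SecItemAdd(query as CFDictionary, nil)
}
```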

* Add per-key actions in vault debug view

Each vault key now shows inline buttons:
- 'Delete local' — removes only the local copy for that inboxId
- 'Delete iCloud' — removes only the iCloud copy for that inboxId
- 'Adopt locally' — shown for iCloud-only keys, copies to local

This lets you surgically delete the wrong local vault key and
adopt the correct iCloud one without affecting other keys.

* Allow vault clientId update after restore or cross-device sync

After restore or iCloud key sync, the vault inboxId is the same
but the XMTP installation ID (clientId) is different because each
device gets its own installation. Previously this threw
clientIdMismatch and failed vault bootstrap.

Fix:
- InboxWriter: for vault inboxes, update clientId in GRDB instead
  of throwing when it changes
- VaultManager: re-save vault key to keychain with new clientId
  when it differs from the stored one (not just for vault-pending)

* Fix restore spinner not showing

runAction title was 'Restore' but actionLabel title was
'Restore from backup' — the mismatch meant activeAction
never matched and the spinner never appeared.

* Skip quickname onboarding for restored conversations

After restore, the conversation already has a profile from XMTP
group metadata. But the quickname state lives in UserDefaults, which
isn't backed up, so the onboarding coordinator thinks the user never
set their name and shows 'Add your name for this convo'.

Skip startOnboarding() entirely when conversation.isActive is
false — the profile is already restored from metadata.

* Update tests to include isActive and isReconnection parameters

MessagesListProcessorTests and related files need the new
parameters added in this branch:
- ConversationUpdate.isReconnection
- ConversationLocalState.isActive

* Fix delete all data leaving orphaned conversations (#628)

* Fix delete all data leaving orphaned conversations

InboxWriter.deleteAll() only deleted inbox rows — conversations,
messages, members, profiles, and all other tables were left behind.
No foreign key cascade from inbox to conversation exists, so the
data persisted and reappeared on next launch.

Now deletes all rows from every GRDB table in dependency order
within a single write transaction.

* Address code review: fix FK ordering, add test, add logging

- Fix foreign key ordering: conversation_members before invite
  (invite has FK referencing conversation_members)
- Add per-table error logging for debugging failed deletions
- Add testDeleteAllRemovesAllData: creates full data graph
  (inbox, conversation, member, profile, local state,
  conversation_members), calls deleteAll(), verifies all empty

* Address code review feedback on PR #626

Three fixes flagged by Macroscope and Claude review:

1. iCloud Keychain migration (critical)
   Add migrateToSynchronizableIfNeeded() that finds existing
   non-synchronizable items and re-saves them with
   kSecAttrSynchronizable: true. Without this, existing users
   would lose access to vault keys after enabling the
   synchronizable flag (queries with synchronizable:true don't
   match non-synchronizable items).
   Called at app launch from ConvosClient+App before the
   VaultKeyStore is created.

2. CancellationError propagation in StreamProcessor
   processMessage catch block for conversationWriter.store now
   rethrows CancellationError instead of falling through to the
   existing-DBConversation fallback. Prevents runaway work after
   task cancellation.

3. InboxWriter vault clientId update early-return
   When vault clientId changed, we were returning early and
   skipping the installationId update logic. Now we continue
   through to update installationId if the caller provided one.

* Add plan for vault re-creation on restore

After restore, the vault cannot be reactivated the same way
conversations can — the vault is the user's own group with no
third party to trigger MLS re-addition. This plan proposes
creating a fresh vault (new inboxId, new keys) on restore,
renaming the old vault DB as a read-only artifact, and
broadcasting the restored conversation keys to the new vault
so it becomes the source of truth for future device pairs.

* Add VaultManager.reCreate method for post-restore vault rebuild

Tears down the current vault and bootstraps a fresh one with new
keys. Used after restore to replace the inactive restored vault
with a working one.

Flow:
  1. Disconnect current vault client
  2. Delete old vault key from local + iCloud keychain
  3. Delete old DBInbox row for the vault
  4. Re-run bootstrapVault() which generates fresh keys and
     creates a new MLS group

The old vault XMTP database file is left on disk as an orphan
to avoid filesystem risk. A future cleanup pass can remove
orphaned vault DB files.

* Wire vault re-creation into RestoreManager

After marking conversations inactive, call VaultManager.reCreate
to tear down the restored (read-only) vault and bootstrap a new
one with fresh keys. Then broadcast the restored conversation
keys to the new vault via shareAllKeys so future device pairs
can sync through the new vault.

VaultManager is optional on RestoreManager to keep existing tests
working without a live vault. The backup debug view injects it
from the session.

If vault re-creation fails, the restore still completes — the
user has working conversations, just no functional vault. They
can retry or manually re-create later.

* Add comprehensive logging for vault re-creation flow

Tag-prefixed logs across the full re-creation path to make it
easy to verify on device:

- [Vault.reCreate] === START === / === DONE === boundaries
- Before/after snapshot of inboxId, installationId, bootstrap
  state, keychain key count, and GRDB vault inbox row count
- Step-by-step 1/4 through 4/4 markers
- Verification log comparing old vs new inboxId (warns loudly
  if they match — indicates failed re-creation)
- [Vault.bootstrap] logs XMTP connect, identity transition,
  keychain persist, and GRDB save
- [Vault.loadOrCreateVaultIdentity] logs whether existing key
  was found or fresh keys generated
- [VaultKeyCoordinator.shareAllKeys] logs row count, packaged
  key count, missing identities, and send success
- [Restore.reCreateVault] mirrors the same pattern on the
  restore side

* Update plans: revoke old vault in reCreate + stale device detection

- vault-re-creation: add step 8a to revoke device A's installation
  from the old vault before disconnecting, mirroring the
  per-conversation revocation that already happens. Steps
  re-lettered 8a through 8g.
- stale-device-detection: new plan for detecting when the current
  device has been revoked (via client.inboxState refresh from
  network) and showing a 'device replaced' banner with a
  delete-data CTA. Covers detection triggers, storage, and UI.

* Revoke old vault installations before re-creating

Step 1/5 of VaultManager.reCreate now calls
revokeAllOtherInstallations on the old vault before disconnecting,
mirroring the per-conversation revocation that already happens
on restored conversation inboxes.

This kills device A's zombie vault installation so the old vault
is fully decommissioned (not just orphaned). Combined with the
conversation-level revocations, device A is now completely
replaced after restore.

Failure is non-fatal — logged as a warning, re-creation
continues. Worst case, device A keeps a zombie vault installation,
but the new vault on device B still works.

* Add isStale column to DBInbox with migration

Foundation for stale device detection: a per-inbox flag
indicating the current device's installation has been revoked
(e.g., because the user restored on a different device).
Defaults to false, nullable migration for existing rows.

* Add stale installation detection in InboxStateMachine

After inbox reaches ready state, call client.isInstallationActive
which queries XMTP inboxState(refreshFromNetwork: true) and checks
if our installationId is still in the active installations list.

If not → device was revoked (typically because user restored on
another device) → mark the inbox as stale in GRDB, emit QA event.

New XMTPClientProvider method isInstallationActive() wraps the
SDK call and returns Bool for clarity.

Added 'inbox' category to QAEvent for the stale_detected event.
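The check itself is small. `inboxState(refreshFromNetwork:)` is real XMTP API; the surrounding shape and the `installations`/`installationID` field access are assumptions matching the commit description:

```swift
import XMTPiOS

// Sketch of isInstallationActive(): a revoked installation simply
// disappears from the inbox's active installations list.
func isInstallationActive(client: Client) async throws -> Bool {
    let state = try await client.inboxState(refreshFromNetwork: true)
    return state.installations.contains { $0.id == client.installationID }
}
```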

* Also check stale installation on app foreground

When the app returns to foreground, inboxes might have been
revoked on another device (e.g., user restored on device B
while device A was backgrounded). Re-run isInstallationActive
check in handleEnterForeground to catch this on return.

* Surface isStale on Inbox model + reactive publisher

- Add isStale to Inbox domain model and DBInbox hydration
- Add InboxesRepository.anyInboxStalePublisher that emits true
  when any non-vault inbox has isStale = true
- Subscribe in ConversationsViewModel, expose isDeviceStale
  property for the view layer to bind to

* Add StaleDeviceBanner UI + hide stale conversations

- New StaleDeviceBanner view: exclamation icon, title, explanation,
  destructive 'Delete data and start fresh' button (wires into
  existing deleteAllData flow)
- Banner inserted above conversations list via safeAreaInset(edge:
  .top) when viewModel.isDeviceStale is true
- ConversationsViewModel: subscribe to staleInboxIdsPublisher and
  filter out any conversations belonging to stale inboxes, both
  on initial load and on reactive updates from GRDB
- Inline the isInstallationActive check in handleStart /
  handleEnterForeground to avoid Sendable issues passing
  any XMTPClientProvider across actor boundaries

* Fix synchronizable migration data loss risk

Macroscope flagged the migration as losing data if SecItemAdd
fails after the original item is deleted. Reverse the order:
add the synchronizable copy first, then delete the
non-synchronizable original — matching the pattern in
migrateItemToPlainAccessibility. This ensures the data is
never left unrecoverable.

If add fails: original is intact, retry next launch.
If delete fails after successful add: synchronizable copy is
in place, retry next launch (idempotent — the second add will
succeed with errSecDuplicateItem).
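The reversed order can be sketched directly with Keychain Services calls. This is a minimal generic-password version, not the real KeychainIdentityStore; the invariant is that one valid copy exists at every point, so any failure is retryable.

```swift
import Foundation
import Security

// Sketch: add the synchronizable copy FIRST, delete the original second.
func migrateToSynchronizable(service: String, account: String, data: Data) {
    let add: [String: Any] = [
        kSecClass as String: kSecClassGenericPassword,
        kSecAttrService as String: service,
        kSecAttrAccount as String: account,
        kSecValueData as String: data,
        kSecAttrSynchronizable as String: true,
    ]
    let addStatus = SecItemAdd(add as CFDictionary, nil)
    // errSecDuplicateItem means a prior run already added the copy —
    // safe to proceed to the delete (idempotent retry).
    guard addStatus == errSecSuccess || addStatus == errSecDuplicateItem else {
        return // original untouched; retry next launch
    }
    let delete: [String: Any] = [
        kSecClass as String: kSecClassGenericPassword,
        kSecAttrService as String: service,
        kSecAttrAccount as String: account,
        kSecAttrSynchronizable as String: false, // target the old local-only item
    ]
    SecItemDelete(delete as CFDictionary)
}
```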

* Fail VaultManager.reCreate when old key deletion fails or inboxId matches

Two related fixes flagged by Macroscope:

1. If vaultKeyStore.delete(inboxId:) throws during step 3/5,
   we previously logged a warning and continued. The next
   bootstrap then found the old identity and reused it,
   producing the same inboxId — meaning no new vault was
   actually created. Now throws VaultReCreateError.bootstrapFailed.

2. If after bootstrap the new inboxId matches the old one
   (because deletion didn't actually remove the key), throw
   instead of just logging a warning. The function previously
   returned successfully despite re-creation having failed.

Both errors propagate up to RestoreManager.reCreateVault which
already catches and logs without aborting the restore — the
user can still see their conversations, just with no functional
vault, and can retry.

* decryptBundle: don't let staging dir reset failure escape the loop

If FileManager.createDirectory throws (e.g. disk full) between
key attempts, the error escaped the for loop and we surfaced
that confusing error instead of RestoreError.decryptionFailed.
Catch and log so the loop continues to the next key.
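The loop shape can be sketched as follows; names (`RestoreError`, `attempt`, `resetStagingDir`) are illustrative stand-ins, not the real signatures. The point is that a staging-dir reset failure is swallowed per-iteration so the caller sees `decryptionFailed` rather than a filesystem error.

```swift
import Foundation

enum RestoreError: Error { case decryptionFailed }   // stand-in for the app's type

// Try each candidate key in turn; a staging-dir reset failure skips to the
// next key instead of escaping the loop and masking decryptionFailed.
func decryptBundle(candidates: [Data],
                   attempt: (Data) throws -> URL,
                   resetStagingDir: () throws -> Void) throws -> URL {
    for key in candidates {
        do {
            try resetStagingDir()
        } catch {
            continue   // e.g. disk full: log and move on to the next key
        }
        if let unpacked = try? attempt(key) { return unpacked }
    }
    throw RestoreError.decryptionFailed
}
```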

* shareAllKeys: only mark successfully shared inboxes

When identityStore.identity(for:) failed for an inbox, it was
skipped from the keys array but still marked sharedToVault=1
in the database. This caused inboxes whose keys were never
actually shared to be flagged as shared, preventing future
sync attempts.

Now we mark only the inboxIds that ended up in the keys array.

* Clear conversation selection when its inbox becomes stale

When staleInboxIdsPublisher fires and removes conversations from
the visible list, clear _selectedConversationId if the previously
selected conversation is no longer present. Otherwise the detail
view shows a conversation that no longer appears in the list.

Also fix the existing conversationsPublisher selection check to
inspect the filtered list (self.conversations) rather than the
unfiltered parameter, so stale selections are properly cleared
on either trigger.

* Fix vault re-create orphan rows + add inactive mode tests

Macroscope flagged a real bug at VaultManager:317: when oldInboxId
is nil, step 3 deletes all vault keys from the keychain but step 4
skipped DBInbox cleanup. The next bootstrapVault would fail with
InboxWriterError.duplicateVault. Now step 4 deletes all vault rows
when oldInboxId is nil, matching the keychain cleanup behavior.
Also escalated step 4 errors to throw VaultReCreateError.

Test coverage for previously-untested inactive mode logic:
- ConversationLocalStateWriterTests (5 tests):
  setActive toggle, scoped to one conversation, throws on
  unknown conversation, markAllConversationsInactive bulk
  flip, no-op on empty database
- InboxWriterTests adds 2 tests for markStale (toggle + scoping)
- DBMessageUpdateDecodingTests (4 tests):
  legacy JSON without isReconnection defaults to false,
  decode with isReconnection=true preserves value, encode
  round-trip, memberwise init default

46/46 tests pass in the affected suites.

* Refine stale-device UX and update recovery policy plan

* Add StaleDeviceState enum + reactive publisher

Derives healthy/partialStale/fullStale from the set of used
non-vault inboxes and how many are flagged isStale=true:
- healthy: no stale inboxes (or no used inboxes)
- partialStale: some but not all used inboxes are stale
- fullStale: every used inbox is stale (device effectively dead)

Drives the partial-vs-full UX policy described in
docs/plans/stale-device-detection.md. Existing
anyInboxStalePublisher and staleInboxIdsPublisher are kept
for backward compatibility.
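The derivation described above is a simple partition over the used non-vault inboxes. A minimal sketch, with a stand-in `Inbox` struct (the real model carries more fields):

```swift
enum StaleDeviceState { case healthy, partialStale, fullStale }

struct Inbox { let isVault: Bool; let isUsed: Bool; let isStale: Bool }

// healthy: no stale used inboxes (or none used at all)
// partialStale: some but not all used inboxes are stale
// fullStale: every used inbox is stale
func staleDeviceState(for inboxes: [Inbox]) -> StaleDeviceState {
    let used = inboxes.filter { !$0.isVault && $0.isUsed }
    let staleCount = used.filter(\.isStale).count
    if used.isEmpty || staleCount == 0 { return .healthy }
    return staleCount == used.count ? .fullStale : .partialStale
}
```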

* ConversationsViewModel: derive partial vs full stale state

Replace single isDeviceStale boolean with StaleDeviceState enum
driving:
- isDeviceStale: any inbox stale (banner visibility)
- isFullStale: every inbox stale
- canStartOrJoinConversations: false only when fullStale (was
  blocking new convos in any stale state previously — now allows
  them in healthy inboxes during partial stale)

Add handleStaleStateTransition that logs every transition and
schedules an auto-reset on healthy/partial → fullStale via
isPendingFullStaleAutoReset. The view layer binds to this flag
to render a cancellable countdown UI.

Add cancelFullStaleAutoReset / confirmFullStaleAutoReset for the
UI to call from the countdown.

* Stale device UI: partial vs full copy, countdown auto-reset

StaleDeviceBanner now takes a StaleDeviceState and renders
distinct copy + title for partial vs full:
- Partial: 'Some conversations moved to another device' / 'Some
  inboxes have been restored on another device. Resetting will
  clear local data...'
- Full: 'This device has been replaced' / 'Your account has been
  restored on another device. Resetting will clear local data...'

Action button renamed from 'Delete data and start fresh' to
'Reset device' so the destructive intent is in the verb itself
(per plan: 'Continue' was misleading).

StaleDeviceInfoView (Learn more sheet) now also varies copy by
state with two distinct title + paragraph sets.

New FullStaleAutoResetCountdown component shown via
fullScreenCover when ConversationsViewModel.isPendingFullStaleAutoReset
fires. 5-second countdown with Cancel and 'Reset now' actions.
The countdown task auto-fires onReset when remaining hits 0.

Wired into ConversationsView so when staleDeviceState transitions
to fullStale, the cover automatically presents.

* Add tests for StaleDeviceState derivation

8 tests covering the partial-vs-full state matrix:
- healthy when no inboxes
- healthy when inbox has no conversations
- healthy when used inbox is not stale
- partialStale when one of two used inboxes is stale
- fullStale when every used inbox is stale
- vault inbox is excluded from the count
- unused conversations don't make an inbox 'used'
- StaleDeviceState convenience flags (hasUsableInboxes,
  hasAnyStaleInboxes)

54/54 tests pass in affected suites.

* Vault bootstrap: log when installation is stale (diagnostic only)

After bootstrap, query inboxState(refreshFromNetwork: true) and
log a warning + QAEvent if our installationId isn't in the active
list. This means another device restored from backup and revoked
this vault installation.

The conversation-level stale detection (InboxStateMachine) is
what drives user-facing recovery — when a device is fully
replaced, all conversation inboxes go stale and the auto-reset
fires before vault state matters in practice.

This log is purely for diagnosing rare partial-revocation states
during QA. We don't add user-facing UI for vault staleness
because:
- in full-stale, conversation-level reset wipes the device anyway
- in partial-stale, the user isn't trying to use vault-dependent
  features in normal flows
- the only user-visible impact is 'pair new device' silently
  failing, which is rare and easily explained

* Don't dismiss compose flow on partial stale inbox updates

The staleInboxIdsPublisher sink was unconditionally nilling
newConversationViewModel whenever isDeviceStale was true, which
returns true for both partialStale and fullStale states. This
contradicted handleStaleStateTransition which explicitly keeps
the compose flow alive during partial stale so the user can
finish creating a conversation in a still-healthy inbox.

Remove the redundant check entirely — handleStaleStateTransition
is the single source of truth for closing in-flight flows on
transition to fullStale. The sink now only handles visible-list
filtering and selection clearing.

* Address Macroscope feedback on PR #618

Two low-severity findings on the restore flow:

1. VaultKeyStore.deleteLocal: the non-iCloud fallback called
   store.delete(inboxId:) which permanently deletes the key,
   contradicting the documented purpose of 'preserving the
   iCloud copy'. In production we always use ICloudIdentityStore
   so this only mattered in tests/mocks, but the contract was
   wrong. Now it's a no-op with a warning when the underlying
   store isn't iCloud-backed; tests that need to verify deletion
   should call delete(inboxId:) directly.

2. MockDatabaseManager.replaceDatabase didn't run migrations
   after restoring the database, unlike DatabaseManager which
   calls SharedDatabaseMigrator.shared.migrate after restore.
   This caused tests to pass while production could fail on
   schema mismatches. Now runs the migrator post-restore to
   match production behavior.

* BackupBundle: harden path traversal check against symlinks

Macroscope flagged that the existing standardizedFileURL check
doesn't resolve symlinks, but fileData.write(to:) follows them.
A pre-existing symlink under the staging dir could escape the
first-pass check.

Fix: after creating the parent directory, re-validate with
.resolvingSymlinksInPath() on the parent. If the resolved path
is no longer inside the staging dir (because a symlink redirected
it), throw 'path traversal via symlink'.

* RestoreManager: throw on missing vault archive

Macroscope flagged that importVaultArchive returned an empty
array when the vault archive was missing from the bundle, but
restoreFromBackup would still continue to wipe local XMTP state
and replace the database — silent data loss with no keys to
decrypt the resulting conversations.

The RestoreError.missingVaultArchive case was already defined
but never thrown. Throw it now before any destructive operation.

* RestoreManager: stop logging success after a try? swallowed an error

wipeLocalXMTPState was logging 'cleared conversation keychain
identities' unconditionally after try? await identityStore.deleteAll().
The success message appeared even when deleteAll threw and no
identities were actually cleared, hiding restore failures during
debugging.

Convert to do/catch and log a warning on failure (non-fatal —
restore continues either way).

* DatabaseManager: surface rollback failure as a dedicated error

Macroscope flagged that if the restore failed AND the rollback
also failed (e.g. migration threw on the rolled-back snapshot),
the catch block swallowed the rollback error and re-threw only
the original. Caller had no way to know the DB was now in a
potentially-inconsistent state.

Add DatabaseManagerError.rollbackFailed wrapping both the
original and rollback errors. Throw it instead of the original
when the rollback path itself throws — caller can distinguish
'restore failed cleanly' from 'restore failed and we couldn't
recover'.
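The error shape can be sketched like this (an assumed shape; the real `DatabaseManagerError` has other cases, and `failRestore` is a hypothetical helper standing in for the catch path):

```swift
enum DatabaseManagerError: Error {
    case rollbackFailed(original: Error, rollback: Error)
}

// Catch-path sketch: surface rollbackFailed when the rollback itself throws,
// otherwise re-throw the original restore error alone.
func failRestore(original: Error, rollBack: () throws -> Void) -> Error {
    do {
        try rollBack()
        return original              // "restore failed cleanly"
    } catch let rollbackError {
        return DatabaseManagerError.rollbackFailed(original: original,
                                                   rollback: rollbackError)
    }
}
```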

* Update test for missing-vault-archive: now expects abort

The 'missing vault archive is non-fatal' test was asserting the
old behavior. Updated to assert the restore aborts before any
destructive operation runs and the conversation archive importer
is never invoked.

* Inbox: decode isStale with backward-compatible fallback

Macroscope and Claude flagged that adding isStale as a non-optional
stored property broke Codable backwards compatibility — Swift's
synthesized Decodable ignores default init parameter values, so any
JSON payload missing 'isStale' would fail to decode with keyNotFound.

Inbox is currently only hydrated from GRDB columns (not JSON), so
there's no active bug, but the Codable conformance is a footgun for
any future JSON path (cache, push payload, etc).

Add a custom init(from:) that uses decodeIfPresent for isStale with
a 'false' fallback. No API change.
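The custom decoder looks roughly like this (other stored properties of the real `Inbox` are elided; `id` is illustrative):

```swift
import Foundation

struct Inbox: Codable {
    let id: String
    let isStale: Bool

    init(from decoder: Decoder) throws {
        let container = try decoder.container(keyedBy: CodingKeys.self)
        id = try container.decode(String.self, forKey: .id)
        // decodeIfPresent keeps legacy payloads without "isStale" decodable
        isStale = try container.decodeIfPresent(Bool.self, forKey: .isStale) ?? false
    }
}
```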

* VaultManager.bootstrap: save-first-then-delete on keychain update

Macroscope and Claude flagged the delete-then-save-with-try? pattern
when the vault identity is being persisted/updated. If the delete
succeeded but the save failed (keychain access denied, etc), the
vault key was permanently lost — the next bootstrap would generate
a new inboxId and InboxWriter.save would throw duplicateVault
because the old GRDB row still existed.

Fix: save the new/updated entry first (which will throw on failure
and abort the bootstrap with the old key still intact), then delete
the old entry only if save succeeded. Delete failures are now
non-fatal because the vault is already recoverable from the new
entry — they just leave an orphan keychain item that gets cleaned
up on the next cycle.

Mirrors the same add-first-then-delete pattern we use for the
iCloud Keychain synchronizable migration.

* ConversationsViewModel: recompute visible list from unfiltered source

Macroscope and Claude flagged a real UX bug: when an inbox recovers
from stale back to healthy, its conversations remained permanently
hidden. The staleInboxIdsPublisher sink was filtering
self.conversations in-place with
  self.conversations = self.conversations.filter { ... }
which destroyed the source list. When a smaller set of stale IDs
later emitted, the filter ran against an already-filtered list and
the recovered conversations never reappeared. conversationsPublisher
observes a different table and only re-emits on DB change, so
nothing triggered re-hydration.

Fix: cache the publisher's output in unfilteredConversations and
derive the visible list via recomputeVisibleConversations(), called
from both the stale-ids sink and the conversations sink. Also call
it after hiddenConversationIds.remove so post-explosion rollback
behaves correctly.
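The cache-and-derive pattern can be sketched with a toy model (names and types are illustrative; the real view model works on full conversation objects):

```swift
// Visible list is always derived from the cached unfiltered source, so a
// smaller stale-ID set later re-surfaces recovered conversations.
final class ConversationsList {
    private(set) var visible: [String] = []
    var staleInboxOwner: [String: String] = [:]   // conversationId -> inboxId
    var unfiltered: [String] = [] { didSet { recompute() } }
    var staleInboxIds: Set<String> = [] { didSet { recompute() } }
    var hiddenConversationIds: Set<String> = [] { didSet { recompute() } }

    private func recompute() {
        visible = unfiltered.filter { id in
            !hiddenConversationIds.contains(id)
                && !staleInboxIds.contains(staleInboxOwner[id] ?? "")
        }
    }
}
```

Filtering `visible` in place (as the old code did) destroys the source list; deriving from `unfiltered` makes the filter idempotent across repeated emissions.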

* StreamProcessor: handle CancellationError explicitly in processMessage

Macroscope flagged that 'throw CancellationError()' inside the inner
do/catch for conversationWriter.store was immediately caught by the
outer generic 'catch {}' and logged as an error rather than
propagated — defeating cooperative cancellation.

Since processMessage is declared 'async' (not 'async throws') and
changing the signature would ripple across the StreamProcessor
protocol and six call sites, handle it locally: add a dedicated
'catch is CancellationError' arm to both do/catch blocks that logs
at debug level and returns early.

The enclosing runMessageStream loop calls Task.checkCancellation()
at the top of every iteration, so cancellation still exits the
stream task — just one message later than ideal, which is
acceptable for cooperative cancellation semantics.
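The dedicated catch arm looks roughly like this (a sketch with assumed names; the real method takes a decoded message and a writer, and logs instead of printing):

```swift
// processMessage stays `async` (not `async throws`), so cancellation is
// absorbed locally with a dedicated catch arm instead of the generic error log.
func processMessage(_ message: String,
                    store: (String) async throws -> Void) async {
    do {
        try await store(message)
    } catch is CancellationError {
        // debug-level log only; return early so the enclosing loop's
        // Task.checkCancellation() exits the stream task
        return
    } catch {
        print("store failed: \(error)")   // genuine failure path
    }
}
```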

* ConversationsViewModel: seed unfilteredConversations and fix explode ordering

Two follow-up bugs from the unfilteredConversations refactor that
Macroscope caught:

1. (Medium) init populated self.conversations from fetchAll() but
   left unfilteredConversations empty. When observe() subscribed to
   staleInboxIdsPublisher, GRDB's ValueObservation fired immediately,
   triggering recomputeVisibleConversations against an empty source
   and wiping the initial list. Seed both lists from the fetch.

2. (Low) explodeConversation was mutating self.conversations in
   place with removeAll, bypassing the unfiltered cache, and on
   success the conversation would briefly resurface because
   recomputeVisibleConversations filtered an unfiltered list that
   still contained it.

   Rework to the optimistic-hide pattern: insert into
   hiddenConversationIds and recompute (hides immediately), then on
   success drop from unfilteredConversations BEFORE clearing the
   hide marker (so the recompute doesn't resurface it), or on
   failure just clear the hide marker to bring it back.

* Restore reliability: rollback, revocation, and safety guards (#694)

* docs: add backup/restore follow-ups plan for deferred items

* BackupManager: make vault archive creation fatal

createVaultArchive was swallowing errors and allowing backups to
complete without a vault archive. Since the vault archive is the
only source of conversation keys during restore, a backup without
it is silently unrestorable — the user thinks they have a backup
but it contains nothing to decrypt the restored conversations.

broadcastKeysToVault was already fatal (throws BackupError), but
createVaultArchive (which exports the XMTP DB containing those
keys) was non-fatal. This mismatch meant: keys get sent to the
vault, but if the archive export fails, the backup proceeds
without them. On restore, either missingVaultArchive is thrown
(after the recent fix) or the old code would silently wipe all
local data.

Make createVaultArchive throw on failure so the backup fails
visibly and is not saved as 'latest'.

* Reactivate restored conversations when XMTP sync confirms them

After restore, all conversations are marked inactive ('Awaiting
reconnection'). Reactivation was only triggered by incoming
messages (processMessage → markReconnectionIfNeeded) or by
processConversation during sync. But discoverNewConversations
skips every group already in the DB — and after restore, ALL
groups are already in the DB. So quiet conversations with no new
messages would stay 'Awaiting reconnection' indefinitely.

Fix: after discoverNewConversations lists all groups from the
XMTP client, pass the full set of confirmed group IDs to a new
reactivateRestoredConversations method on StreamProcessor. This
flips inactive conversations to active for any group the XMTP
sync confirmed is present — proving the restored XMTP client can
participate in those conversations.

The method also marks recent update messages as reconnection
(existing markRecentUpdatesAsReconnection logic) so the UI shows
'Reconnected' feedback where appropriate.

* BackupManager: add BackupError and make key broadcast fatal

Add a dedicated BackupError.broadcastKeysFailed error type.
broadcastKeysToVault now throws on failure instead of logging a
warning and continuing. If keys can't be broadcast to the vault,
the archive won't contain them and the backup would be silently
unrestorable.

* VaultManager.reCreate: use stateless XMTPInstallationRevoker

The local vault client has been paused by prepareForRestore when
reCreate runs, so its connection pool is disconnected. Using the
instance method revokeAllOtherInstallations hits 'Pool needs to
reconnect before use'. Switch to the static
XMTPInstallationRevoker.revokeOtherInstallations which creates
its own short-lived client. Also revoke ALL installations
(keepInstallationId: nil) since reCreate abandons the whole vault
identity.

* Restore: add rollback, revocation, and pre-destructive safety guards

Major restore reliability overhaul:

- XMTP file staging: move files aside instead of deleting, so they
  can be restored if the restore fails before the commit point
- Keychain snapshot + rollback: snapshot existing identities before
  clearing, restore them on pre-commit failure
- Commit point: after DB replace + keychain save, staged files are
  discarded and further failures are non-fatal
- Pre-destructive guards: abort if vault yields 0 keys but backup
  had N conversations (incompleteBackup), or if every keychain save
  failed (keychainRestoreFailed)
- Post-import installation revocation: importConversationArchive
  now returns the new installation ID; RestoreManager revokes all
  other installations per inbox so device A detects stale state
- ConvosVaultArchiveImporter: add defer for cleanup on failure
- Debug view: wire installationRevoker into RestoreManager
- Tests: update mock to return installation IDs

* BackupManager: add log lines between vault and conversation archive steps

The missing logs made XMTP SDK output from conversation archive
creation (Client.build) look like it was happening during vault
archive creation, causing confusion during log analysis.

* BackupManager: add [Backup] prefix to all log lines

Consistent prefix on every log line so the entire backup flow
can be filtered with a single …
lourou added a commit that referenced this pull request Apr 13, 2026
* Refactor KeychainIdentityStore for iCloud-ready accessibility

* Add ICloudIdentityStore dual-write keychain coordinator

* Wire vault key store to iCloud-backed identity store

* Expand keychain and iCloud identity store test coverage

* Add guided physical-device vault key sync QA script

Adds qa/scripts/physical-vault-key-sync-check.sh to run a one-device, USB-connected manual flow with automated log capture and report generation.

Usage:
./qa/scripts/physical-vault-key-sync-check.sh "10u15-mini"

What it does:
- resolves the connected device by name/UDID
- captures idevicesyslog output
- prompts the tester through initial run and relaunch steps
- validates vault bootstrap and iCloud sync signals in logs
- writes report + raw logs to qa/reports/

* Fix physical QA script bootstrap counter and app preflight

Fixes a shell bug where grep count returned '0\n0' and broke arithmetic checks.

Also improves reliability by:
- rendering multiline instructions correctly
- checking whether org.convos.ios-preview is installed on the device
- attempting install from .derivedData device build when available
- showing clear guidance when no installed/built app is found

* Fix physical QA script step timing and wait visibility

- capture bootstrap baseline before each guided step
- wait for a *new* vault bootstrap signal after user actions
- show progress output every 10s so it no longer appears stuck
- fall back to the existing first-run bootstrap log when already present

* Add Vault Key Sync debug panel with one-device recovery actions

Adds App Settings > Debug > Vault key sync with:
- iCloud availability, vault bootstrap state, inbox id, local/iCloud key counts
- hasICloudOnlyKeys restore signal
- actions: refresh, sync local->iCloud, simulate restore (delete local only), recover local from iCloud, delete iCloud copy, re-sync
- confirmations for destructive actions and accessibility identifiers for QA automation

* Clarify iCloud account vs iCloud Keychain availability

Renames the primary availability API to isICloudAccountAvailable and documents that ubiquityIdentityToken only indicates iCloud account presence.

Keeps isICloudAvailable as a deprecated compatibility alias, updates debug UI copy, and renames the corresponding test.

* Harden keychain format migration rename failure handling

Prevents migration completion from being marked when any item fails, so failed migrations retry on next launch.

Also derives the migration target account from the decoded identity inboxId, allowing recovery of previously stranded *.migrated entries, and adds tests for stranded-account recovery and completion-flag behavior on partial failures.

* Fail fast when stdbuf is unavailable in physical QA script

Adds an explicit stdbuf dependency check before starting idevicesyslog capture so missing tooling fails early with a clear error instead of producing empty logs and timeout-based CHECK_MANUALLY results.

* Prevent repeated .migrated suffix growth during keychain migration

Migration now treats existing *.migrated entries as temporary sources instead of creating nested *.migrated.migrated accounts.

Also adds duplicate-target reconciliation (cleanup when the payload matches) and errSecItemNotFound reconciliation for stale snapshot iterations, while keeping retry behavior for unresolved conflicts. Tests now cover stranded-account recovery with no nested suffix artifacts and partial-failure completion-flag behavior with conflicting payloads.

* Add backup bundle creation with AES-GCM encryption (#603)

* Add backup bundle creation with AES-GCM encryption

Implements the backup orchestrator (PR 3 from icloud-backup plan):

BackupManager: actor that coordinates the full backup flow
  - creates vault XMTP archive via BackupArchiveProvider protocol
  - creates per-conversation XMTP archives (failure per-conversation
    is non-fatal, logged as warning)
  - copies GRDB database via GRDB's backup API (consistent snapshot)
  - writes metadata.json (version, device, timestamp, inbox count)
  - packages everything into a tar, encrypts with AES-256-GCM using
    the vault key's databaseKey
  - writes to iCloud Drive (Documents/backups/<device-id>/) with
    local fallback when iCloud container is unavailable

BackupBundle: staging directory layout, tar/untar, pack/unpack
BackupBundleCrypto: AES-256-GCM encrypt/decrypt with 32-byte key
BackupBundleMetadata: versioned JSON metadata (v1)
BackupArchiveProvider: protocol for vault + conversation archiving,
  decoupled from VaultManager/XMTP for testability

16 tests across 4 suites:
  - metadata JSON round-trip
  - AES-GCM encrypt/decrypt, wrong key, invalid key, empty/large data
  - tar/untar round-trip, encrypted pack/unpack, wrong key rejection
  - full backup with mock archives, partial failure resilience,
    bundle decryptability, GRDB database integrity, empty inbox list
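The AES-256-GCM layer maps naturally onto CryptoKit (swift-crypto's `Crypto` module on Linux). A minimal round-trip sketch in the spirit of BackupBundleCrypto, not the app's actual implementation; the 32-byte vault databaseKey corresponds to a `SymmetricKey`:

```swift
import CryptoKit
import Foundation

func encryptBundle(_ plaintext: Data, key: SymmetricKey) throws -> Data {
    let sealed = try AES.GCM.seal(plaintext, using: key)
    return sealed.combined!   // nonce + ciphertext + tag in one blob
}

func decryptBundle(_ blob: Data, key: SymmetricKey) throws -> Data {
    let box = try AES.GCM.SealedBox(combined: blob)
    return try AES.GCM.open(box, using: key)   // throws on wrong key (auth failure)
}
```

Decryption with the wrong key fails the GCM authentication tag check, which is the "wrong key rejection" behavior the tests above exercise.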

* Harden archive format against path traversal and symlink escape

Code review fixes:
- untar: validate resolved path stays within destination directory,
  reject entries with .. traversal
- tar: validate resolved file stays within source directory, reject
  symlinks pointing outside
- crypto: package access control, preserve underlying error context
  instead of losing it via localizedDescription
- backup manager: log individual conversation errors with inboxId
  instead of just the count
- add test for path traversal rejection in untar

* Add concrete BackupArchiveProvider and wire XMTP archive APIs

The backup bundle was using a protocol with only a mock
implementation. This adds the real wiring:

- Add createArchive/importArchive to XMTPClientProvider protocol
  with explicit implementations in the XMTPiOS.Client extension
  (SDK method has an extra opts parameter with default value)
- Add ConvosBackupArchiveProvider: concrete implementation that
  delegates vault archiving to VaultServiceProtocol and builds
  temporary XMTP clients via Client.build() (local-only, no
  network) for per-conversation archiving
- Update MockXMTPClientProvider and SyncingManagerTests mock
  with the new protocol methods

* Write metadata.json alongside encrypted bundle for restore detection

PR 4's restore flow needs to inspect backup metadata (timestamp,
device name, inbox count) without decrypting the bundle first.
Write an unencrypted metadata.json next to backup-latest.encrypted
in the backup directory. The same metadata is also inside the
encrypted bundle for integrity.

Matches the plan's iCloud Drive layout:
  backups/<device-id>/metadata.json
  backups/<device-id>/backup-latest.encrypted

Add BackupBundleMetadata.exists(in:) helper for restore detection.
Test verifies sidecar metadata is written with correct values.

* Add iCloud container entitlements and dynamic container identifier

Register iCloud Documents capability per environment:
- Local:  iCloud.org.convos.ios-local
- Dev:    iCloud.org.convos.ios-preview
- Prod:   iCloud.org.convos.ios

Add ICLOUD_CONTAINER_IDENTIFIER to all xcconfig files, derived
from MAIN_BUNDLE_ID (same pattern as APP_GROUP_IDENTIFIER).

Add icloud-container-identifiers and ubiquity-container-identifiers
to Convos.entitlements using the build variable.

Add AppEnvironment.iCloudContainerIdentifier computed property
(derived from appGroupIdentifier) and pass it explicitly to
FileManager.url(forUbiquityContainerIdentifier:) instead of nil.

* Fix integer overflow crash and double error wrapping

- untar: validate UInt64 file length fits in Int before conversion,
  preventing a trap on malicious data with length >= 2^63
- crypto: move guard outside do-catch so the encryptionFailed case
  from the guard is not re-wrapped by the catch block
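The overflow guard relies on `Int(exactly:)`, which returns nil instead of trapping when the value does not fit. A sketch with a hypothetical `TarError` case:

```swift
enum TarError: Error { case corruptEntry }   // hypothetical error case

// Reject tar entry lengths that don't fit in Int (>= 2^63 on 64-bit
// platforms) instead of trapping on the conversion.
func fileLength(from raw: UInt64) throws -> Int {
    guard let length = Int(exactly: raw) else { throw TarError.corruptEntry }
    return length
}
```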

* Add backup debug panel for manual backup creation

New debug view next to Vault Key Sync in the debug settings:
- shows iCloud container availability, last backup metadata
  (timestamp, device name, inbox count, bundle version)
- 'Create backup now' button triggers a full backup
- wires BackupManager with ConvosBackupArchiveProvider,
  VaultKeyStore (iCloud dual-store), and identity store
- reads sidecar metadata.json for status without decrypting

* Fix debug view actor isolation and metadata access control

- move makeBackupManager() call outside the @Sendable closure
  to avoid main-actor isolation violation
- make BackupBundleMetadata.read(from:) public so the main app
  target can read sidecar metadata

* Remove deprecated shouldUseExtendedBackgroundIdleMode

Deprecated in iOS 18.4 and marked as not supported. The background
URLSession configuration already handles extended background
behavior via sessionSendsLaunchEvents and isDiscretionary.

* Fix debug view missing local fallback for backup directory

The debug view only checked iCloud and returned nil when
unavailable, missing backups saved locally. Now mirrors
BackupManager's fallback: iCloud first, then local directory.
Made defaultDatabasesDirectoryURL public for main app access.

* Separate iCloud availability from backup directory resolution

iCloud container status is now checked independently so the debug
view correctly shows 'Unavailable' even when a local backup exists.

* Add restore flow: decrypt bundle, import vault, replace DB (#618)

* Add restore flow: decrypt bundle, import vault, replace DB, import archives

RestoreManager actor orchestrates the full restore:
  1. Load vault key from VaultKeyStore (iCloud fallback)
  2. Decrypt and unpack backup bundle
  3. Import vault archive via VaultServiceProtocol → [VaultKeyEntry]
  4. Save each key entry to local keychain
  5. Replace GRDB database (close pool → swap file → reopen + migrate)
  6. Import per-conversation XMTP archives (failure per-inbox is non-fatal)

RestoreState enum tracks progress through each phase for UI binding.

Restore detection:
  - findAvailableBackup() scans iCloud Drive + local fallback for
    sidecar metadata.json, returns newest backup across all devices
  - Combined with ICloudIdentityStore.hasICloudOnlyKeys() for full
    restore detection (PR 6 UI)

DatabaseManager changes:
  - dbPool changed from let to private(set) var
  - replaceDatabase(with:) closes pool, removes DB + WAL/SHM files,
    copies backup, reopens with migrations
  - DatabaseManagerProtocol gains replaceDatabase requirement
  - MockDatabaseManager implements via erase + backup-to for tests

ConvosRestoreArchiveImporter: concrete implementation builds
temporary XMTP client via Client.build() per inbox and imports

7 tests: full flow, DB replacement, partial failure, missing vault
archive, restore detection, nil detection, state progression

* Make DatabaseManager.replaceDatabase safe with rollback

Validate backup opens before closing the active pool. Move the
current DB aside as a safety net. If copy or reopen fails, restore
the original database so the app is never left with no usable DB.

* Validate backup before erasing MockDatabaseManager

Stage backup into a temp queue first. Only erase the active
in-memory DB after confirming the backup is valid and copyable.

* Fix findAvailableBackup to check local when iCloud is empty

iCloud container available but with no backup was returning nil
instead of falling through to the local directory. Chain the
checks: try iCloud first, fall back to local if not found.

* Track failed key saves in RestoreState.completed

Add failedKeyCount to the completed state so the UI can warn
the user about conversations that may not be decryptable.
saveKeysToKeychain now returns the failure count instead of
throwing.

* Add restore button to backup debug panel

Shows available backup metadata (timestamp, device, inbox count)
and a destructive 'Restore from backup' button with confirmation
dialog. Restore section only visible when databaseManager is
provided (optional param). Reports failedKeyCount on completion.

Thread databaseManager through DebugExportView → DebugViewSection
→ BackupDebugView as an optional. Expose ConvosClient.databaseManager
as public for debug wiring.

* Wire databaseManager through to backup debug panel for restore

Thread databaseManager from ConvosApp → ConversationsViewModel →
AppSettingsView → DebugExportView → DebugViewSection →
BackupDebugView so the restore button can call
DatabaseManager.replaceDatabase().

All params are optional so previews and mocks still compile
without it.

* Fix false path traversal rejection on iOS devices

The path traversal check compared resolved paths using hasPrefix
with a hardcoded '/' separator, but on iOS devices directory URLs
created with isDirectory:true can have a trailing slash in their
resolved path. This caused 'resolvedDirPath + "/"' to become
'.../uuid//' which never matched, rejecting all files including
legitimate ones like database.sqlite.

Extract a resolvedPath() helper that strips trailing slashes after
resolving symlinks. Use it consistently in both tar and untar.

* Fix untar path resolution inconsistency on iOS devices

resolvingSymlinksInPath() behaves differently for existing
directories vs non-existent files on iOS: the directory resolves
/private/var → /var but the file keeps /private/var since it
doesn't exist on disk yet. This caused the prefix check to fail.

Build the file path from the already-resolved directory path
instead of resolving it independently. Use standardizedFileURL
(which handles .. without touching the filesystem) instead of
resolvingSymlinksInPath for the file component.
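The approach can be sketched as follows, assuming the extraction directory has already been resolved once up front (function name and shape are illustrative):

```swift
import Foundation

// Sketch of the fix: build the entry path from the already-resolved base.
// standardizedFileURL collapses ".." lexically without touching the
// filesystem, so not-yet-existing files are normalized the same way as
// existing directories.
func destination(for entryName: String, in resolvedDir: String) -> URL? {
    let candidate = URL(fileURLWithPath: resolvedDir)
        .appendingPathComponent(entryName)
        .standardizedFileURL
    // Reject entries that would escape the extraction directory.
    guard candidate.path == resolvedDir
        || candidate.path.hasPrefix(resolvedDir + "/") else {
        return nil
    }
    return candidate
}
```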

* Fix vault backup restore after local wipe and stabilize DB restore

Restore had two independent failure modes in the backup -> restore flow.

First, deleting all app data called VaultManager.unpairSelf(), which cleaned up the local vault state by deleting the vault key through VaultKeyStore.delete(inboxId:). When the underlying store was ICloudIdentityStore, that removed both the local and iCloud keychain copies. Any existing backup bundle had been encrypted with the old vault database key, so a subsequent app relaunch generated a fresh vault identity and restore then failed with AES-GCM authenticationFailure because the app was attempting to decrypt the bundle with a different key.

Second, RestoreManager replaced the GRDB database while the app still held long-lived readers, writers, inbox services, vault activity, and background restore-adjacent tasks. DatabaseManager.replaceDatabase(with:) also recreated the DatabasePool instance entirely. That meant restore was vulnerable to crashing or destabilizing the app because live services and captured dbReader/dbWriter references could become stale during replacement.

This change addresses both problems.

Vault key handling
- add ICloudIdentityStore.deleteLocalOnly(inboxId:) so restore-related local cleanup can preserve the iCloud backup key
- add VaultKeyStore.deleteLocal(inboxId:) as the vault-facing API for that behavior
- keep iCloud key preservation scoped only to the local delete-all-data/unpair path
- keep remote self-removal strict by continuing to delete both local and iCloud copies when this device is removed from the vault by another device

Restore lifecycle coordination
- add RestoreLifecycleControlling to let RestoreManager explicitly quiesce and resume runtime state around destructive restore work
- wire BackupDebugView to pass the current session as the restore lifecycle controller
- make SessionManager conform by pausing the import drainer, sleeping inbox checker, inbox lifecycle, and vault activity before database replacement, then resuming restore-safe pieces and notifying GRDB observers afterward

Database replacement safety
- change DatabaseManager.replaceDatabase(with:) to preserve the existing DatabasePool instead of closing it and constructing a new pool
- copy the current database into an in-memory rollback queue before replacement
- restore the backup contents into the existing pool, then rerun migrations
- on failure, restore the rollback contents back into the same pool and rerun migrations
- add an explicit missing-file guard before attempting replacement
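The replacement strategy above can be sketched roughly as below; names and the migrator parameter are illustrative, and GRDB's `backup(to:)` is used for both the rollback copy and the restore copy:

```swift
import GRDB

// Hedged sketch of in-place replacement that preserves the pool instance.
func replaceDatabase(pool: DatabasePool,
                     backupPath: String,
                     migrator: DatabaseMigrator) throws {
    // Explicit missing-file guard before any destructive work.
    guard FileManager.default.fileExists(atPath: backupPath) else {
        throw CocoaError(.fileNoSuchFile)
    }
    // 1. Copy current contents into an in-memory rollback queue.
    let rollback = try DatabaseQueue() // DatabaseQueue() is in-memory
    try pool.backup(to: rollback)
    do {
        // 2. Copy the backup's contents into the EXISTING pool, keeping
        //    captured dbReader/dbWriter references valid.
        let source = try DatabaseQueue(path: backupPath)
        try source.backup(to: pool)
        // 3. Re-run migrations so the schema matches the app version.
        try migrator.migrate(pool)
    } catch {
        // 4. On failure, restore the rollback copy into the same pool.
        try rollback.backup(to: pool)
        try migrator.migrate(pool)
        throw error
    }
}
```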

Tests
- add VaultKeyStore coverage proving deleteLocal preserves the iCloud vault copy and full delete still removes both copies
- add RestoreManager coverage proving restore lifecycle hooks are called around restore
- add DatabaseManager restore coverage proving captured readers remain valid after database replacement

Validation
- swift test --package-path ConvosCore --filter '(VaultKeyStoreTests|RestoreManagerTests|DatabaseManagerRestoreTests)'
- xcodebuild build -project Convos.xcodeproj -scheme 'Convos (Dev)' -destination 'id=00008110-001C51283AC2801E' -derivedDataPath .derivedData

Result
- backup decryption can continue to use the original vault key after delete-all-data / app relaunch
- destructive restore now pauses active app state before swapping database contents
- existing GRDB handles remain usable after restore instead of silently pointing at a retired pool

* Tighten restore barriers and exclude unused inboxes from backup counts

This follow-up hardens the restore path and fixes backup metadata/archive counts that were inflated by the pre-created unused conversation inbox.

Restore hardening
- wrap destructive database replacement work in DatabasePool.barrierWriteWithoutTransaction so the restore waits for in-flight GRDB accesses on the live pool to drain before copying rollback data and before copying restored data into place
- keep the existing in-place DatabasePool replacement strategy, but make the source->destination copy and rollback copy happen behind an explicit GRDB barrier
- keep migrations after replacement so schema is still reopened at the current app version
- keep rollback-on-failure behavior intact

Session restore follow-up
- when prepareForRestore runs, clear the cancelled unused inbox preparation task reference
- when finishRestore runs, schedule unused inbox preparation again so the post-restore app returns to the same steady-state behavior as a normal session

Backup count / archive fix
- add InboxesRepository.nonVaultUsedInboxes()
- this returns only non-vault inboxes that have at least one non-unused conversation row
- BackupManager now uses that query when deciding which per-conversation archives to create
- RestoreManager now uses the same query for its completed inbox count

This keeps the vault backup behavior unchanged and only excludes the synthetic unused conversation inbox from the non-vault conversation archive path.

Important: the vault is still always backed up.
- BackupManager still creates the vault archive first
- the overall backup bundle is still encrypted with the vault databaseKey
- restore still depends on the vault archive plus the vault key to recover conversation keys

What changes from the user-visible perspective
- backup metadata inboxCount now matches real visible/restorable conversations instead of counting the hidden unused placeholder inbox
- restore completed counts now match that same definition
- the extra placeholder inbox is no longer archived as if it were a real conversation
- the vault remains part of every backup exactly as before

Tests added/updated
- backup tests now seed real conversation rows for active inboxes
- add coverage that unused conversation inboxes are excluded from backup metadata and archive generation
- add coverage that restore completed counts exclude unused conversation inboxes
- existing restore/database tests continue to cover lifecycle hooks and stable DB handle behavior

Validation
- swift test --package-path ConvosCore --filter '(BackupManagerTests|RestoreManagerTests|VaultKeyStoreTests|DatabaseManagerRestoreTests)'
- xcodebuild build -project Convos.xcodeproj -scheme 'Convos (Dev)' -destination 'id=00008110-001C51283AC2801E' -derivedDataPath .derivedData

* Fix vault archive import crash by using temporary XMTP client

The live VaultClient's DB connection was in an inconsistent state
after pause() (dropLocalDatabaseConnection), causing importArchive
to crash in the XMTP FFI layer.

Replace the live vaultService.importArchive call with a temporary
XMTP client built from the vault key. This matches the real restore
scenario (fresh device, no running vault) and avoids conflicts with
the live vault's connection state.

- Extract VaultArchiveImporter protocol for testability
- ConvosVaultArchiveImporter: builds temp client, imports archive,
  extracts keys from vault messages, tears down
- Remove vaultService dependency from RestoreManager entirely
- Revert barrierWriteWithoutTransaction (deadlock) back to plain
  backup(to:) for DB replacement
- Update tests with MockVaultArchiveImporter

* Fall back to Client.create when Client.build fails during restore

On a fresh device (real restore), no local XMTP database exists,
so Client.build fails. Fall back to Client.create with the signing
key, same pattern as InboxStateMachine.handleAuthorize.

Also add vault-specific codecs to the temporary client options so
vault messages can be decoded for key extraction.

Applied to both ConvosVaultArchiveImporter and
ConvosRestoreArchiveImporter.
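The fallback pattern can be sketched like this; exact XMTPiOS signatures vary by SDK version, so both calls should be read as illustrative:

```swift
import XMTPiOS

// Hedged sketch of the build-then-create fallback; parameter lists are
// assumptions, not the SDK's exact API.
func makeImportClient(signingKey: SigningKey,
                      options: ClientOptions) async throws -> Client {
    do {
        // Same-device: a local XMTP database already exists on disk.
        return try await Client.build(publicIdentity: signingKey.identity,
                                      options: options)
    } catch {
        // Fresh device: no local database, so register with the signing
        // key — same pattern as InboxStateMachine.handleAuthorize.
        return try await Client.create(account: signingKey, options: options)
    }
}
```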

* Extract keys from restored GRDB instead of live vault import

The XMTP SDK's importArchive crashes when called on a client
whose local database already has state (same-device scenario) or
when called on a just-created client. The vault messages are
already in the backup's GRDB database, so we don't need the XMTP
archive import to extract keys.

New restore order:
1. Replace GRDB database with backup copy
2. Extract keys from vault messages in the restored DB
3. Save keys to keychain
4. Import vault XMTP archive (best-effort, non-fatal)
5. Import conversation XMTP archives

The vault XMTP archive import is now best-effort — if it fails
(e.g. on same-device test), the restore still succeeds because
key extraction comes from GRDB, not from the XMTP archive.

* Skip vault XMTP archive import to avoid FFI crash

The XMTP SDK's importArchive causes a Rust panic (uncatchable from
Swift) when importing into a client that already has state. Since
key extraction now reads from the restored GRDB database, the vault
XMTP archive import is unnecessary. The vault will rebuild its XMTP
state when it reconnects on next launch.

* Extract keys from live vault before teardown, not after

Vault messages (DeviceKeyBundle/DeviceKeyShare) live in the XMTP
database, not GRDB. Restoring the GRDB database doesn't restore
vault messages. A fresh temporary XMTP client can't access
historical vault group messages either (different installation).

Fix: extract keys from the LIVE connected vault client while it's
still running, BEFORE the lifecycle teardown. The vault already
has all the messages in its local XMTP database.

New restore order:
1. Decrypt bundle
2. Extract keys from LIVE vault (VaultManager.extractKeys)
3. Stop sessions (prepareForRestore)
4. Replace GRDB database
5. Save extracted keys to keychain
6. Import conversation XMTP archives
7. Resume sessions (finishRestore)

* Use XMTP archive import as designed — no DB swaps or live extraction

The XMTP SDK's importArchive is additive and designed for
backup/restore across installations. Use it correctly:

Restore flow:
1. Decrypt bundle
2. Stop all sessions
3. Import vault archive: Client.create(signingKey) then
   importArchive(path, key) — creates a fresh XMTP client and
   imports vault message history including all key bundles/shares
4. Extract keys from imported vault messages
5. Save keys to keychain
6. Replace GRDB database
7. Import conversation archives: Client.create(signingKey) then
   importArchive(path, key) per conversation
8. Resume sessions

Key fixes:
- Vault importer uses NO dbDirectory (matches VaultManager's
  default = app Documents), was wrongly using app group container
- Conversation importer uses defaultDatabasesDirectory (matches
  InboxStateMachine's app group container)
- Both use Client.create (not build) since we want fresh
  installations — importArchive is additive
- Removed vaultService dependency from RestoreManager entirely
- After import, conversations start inactive/read-only per MLS
  and reactivate automatically as other members come online

* Use isolated temp directory for vault archive import

importArchive crashes (Rust panic) when the vault's XMTP database
already exists on disk with the same data. Create a fresh temporary
directory for the import client so there's no conflict with the
live vault's database files.

This works for both same-device testing (existing DB) and real
restore (fresh device, no DB). The temp directory is cleaned up
after key extraction.

* Disable device sync workers for archive import clients

The device sync worker was running concurrently with importArchive,
causing a race condition in the Rust FFI layer that crashed the app.
Archive import clients are temporary — they only need to import
data and extract keys, not participate in device sync.

* Add import success log for vault archive

* Skip vault archive import when conversation keys already in keychain

Vault importArchive crashes when called on a device that already has
a live installation for the same inboxId. More importantly, it's
unnecessary: if conversation keys are already in the local keychain,
the vault archive contains nothing we don't have.

Before attempting vault archive import, check which conversation
inboxIds from the bundle are missing from the keychain:

  Same device (all keys present):
    → skip vault archive import entirely
    → restore GRDB + import conversation archives only

  Fresh device or empty vault (keys missing):
    → import vault archive to recover missing keys
    → save extracted keys to keychain
    → restore GRDB + import conversation archives

This correctly handles all scenarios including a device where the
vault bootstrapped from iCloud Keychain but its XMTP database is
empty (no DeviceKeyBundle/Share messages yet).
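The gate above reduces to a simple missing-keys filter; the keychain lookup is abstracted as a closure here since the real store type isn't shown:

```swift
import Foundation

// Illustrative sketch of the pre-import check. Only inboxes whose
// conversation key is missing locally require the (crash-prone) vault
// archive import; same-device restores skip it entirely.
func inboxIdsNeedingVaultImport(
    bundleInboxIds: [String],
    hasLocalKey: (String) -> Bool
) -> [String] {
    bundleInboxIds.filter { !hasLocalKey($0) }
}
```

An empty result means the vault archive import is skipped and restore proceeds straight to the GRDB swap and conversation archives.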

* Skip conversation archive import when XMTP DB already exists

importArchive is additive — it restores message history into a
fresh installation. If the XMTP DB already exists (same device),
Client.build() succeeds and there is nothing to import.

Same-device: build() succeeds → skip importArchive, history is there
Fresh device: build() fails → create() + importArchive()

This mirrors the vault import logic and avoids the FFI crash from
importing into an existing installation.

* Skip vault importArchive when vault XMTP DB already exists

Same pattern as conversation archives: try Client.build() first.
If vault DB exists on disk, extract keys directly from the
existing vault database instead of importing the archive.
Only import the archive on a fresh device where build() fails.

This fixes the intermittent crash that occurred when the GRDB
restore caused some conversation keys to appear missing from
the keychain, triggering vault importArchive on same-device.

* Wipe local XMTP state before restore for clean deterministic behavior

Restore is destructive: the device should look exactly like the
backup, nothing more, nothing less. Partial state from the current
device (deleted conversations, orphaned DBs, mismatched keys)
was causing intermittent crashes and inconsistent behavior.

Before importing vault archive:
- identityStore.deleteAll() removes all conversation keychain keys
- Delete all xmtp-* files from AppGroup + Documents directories

After wipe, Client.build() fails for all inboxes (no local DB),
so both vault and conversation importers always take the
create() + importArchive() path — consistent regardless of
device state.

The vault key (vaultKeyStore) is preserved since it lives in a
separate keychain service and is needed to decrypt the bundle.

* Extract vault keys before wipe to avoid broken state on failure

If vault archive import fails after the wipe, the device is left
with no keys and no restored data — unrecoverable. Extract keys
first while the vault is still intact, then wipe once we have
something to restore.

New order:
1. Stop sessions
2. Import vault archive + extract keys (vault DB still present)
3. Wipe XMTP DBs + conversation keychain identities
4. Save extracted keys to keychain
5. Replace GRDB
6. Import conversation archives (fresh DBs)
7. Resume sessions

* Don't wipe vault XMTP DB during restore + add revoke debug button

Don't wipe vault XMTP DB (Documents/):
  Only conversation DBs (AppGroup/) need wiping. The vault DB is
  preserved since it was just used to extract keys, and wiping it
  forces Client.create() which registers a new installation —
  hitting the 10-installation limit during repeated testing.

Add revoke button to debug panel:
  'Revoke all other vault installations' clears accumulated test
  installations from the XMTP network, freeing up slots.
  VaultManager.revokeAllOtherInstallations() delegates to the
  live vault XMTP client.

* Use static XMTP revocation that works without a live client

Client.revokeInstallations is a static method that doesn't need
a live client — just the signing key, inboxId, and API config.
This works even when Client.create() fails due to the 10-install
limit, which is exactly when you need to revoke.

XMTPInstallationRevoker: static utility that fetches all
installations via Client.inboxStatesForInboxIds, filters out
the current one, and revokes the rest via the static method.
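The utility can be sketched as below. Both method names come from the commit message, but the parameter lists shown here are assumptions about the SDK surface:

```swift
import XMTPiOS

// Hedged sketch of the static revocation utility.
enum XMTPInstallationRevoker {
    static func revokeOthers(
        signingKey: SigningKey,
        inboxId: String,
        currentInstallationId: String,
        api: ClientOptions.Api
    ) async throws {
        // Fetch all registered installations without building a client.
        let states = try await Client.inboxStatesForInboxIds(
            inboxIds: [inboxId], api: api)
        let others = states
            .flatMap { $0.installations }
            .map(\.id)
            .filter { $0 != currentInstallationId }
        guard !others.isEmpty else { return }
        // Static revocation works even when Client.create fails because
        // the 10-installation limit is already hit.
        try await Client.revokeInstallations(
            api: api, signingKey: signingKey,
            inboxId: inboxId, installationIds: others)
    }
}
```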

* Broadcast all conversation keys to vault before backup

On a single device, vault key-sharing messages (DeviceKeyBundle)
only exist if multi-device pairing has happened. Without them,
the vault archive is empty of keys and restore can't recover
conversation identities.

Fix: call VaultManager.shareAllKeys() before creating the vault
archive. This broadcasts all conversation keys as a DeviceKeyBundle
message to the vault group, ensuring the vault archive always
contains every key needed for restore — regardless of whether
multi-device pairing has occurred.

This is the intended design: the vault is the single source of
truth for conversation keys, and backups should capture that
truth completely.

* Make vault key broadcast and archive creation non-fatal during backup

After 'delete all data', the vault may not be connected yet.
The backup should still succeed — it just won't include the
vault archive or keys. The conversation archives and GRDB
snapshot are still created.

On the next backup after the vault reconnects, the keys will
be broadcast and the vault archive will be included.

* Add plan for inactive conversation mode post-restore

* Resolve design decisions for inactive conversation mode

* Inactive conversation mode, iCloud vault key sync, and restore reliability fixes (#626)

* Update copy: 'comes online' -> 'sends a message'

* Add inactive conversation mode (Phases 1 & 2)

Phase 1 — data model:
- Add isActive column to ConversationLocalState (default true)
- GRDB migration addIsActiveToConversationLocalState
- Surface isActive: Bool on Conversation model via local state join
- Add setActive(_:for:) and markAllConversationsInactive() to
  ConversationLocalStateWriter + protocol
- RestoreManager calls markAllConversationsInactive() after
  importing conversation archives

Phase 2 — reactivation detection:
- After each syncAllConversations call site in SyncingManager
  (initial start, resume, discovery), query for conversations
  where isActive == false
- For each, call conversation.isActive() on the XMTP SDK
  (local MLS state check, no network)
- If true, update GRDB via setActive(true, for:)
- GRDB ValueObservation propagates to UI reactively

Phases 3 & 4 (UI) are blocked on design sign-off.
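The Phase 1 migration can be sketched roughly as follows; the table name is an assumption inferred from the model name, and the real migration may differ in detail:

```swift
import GRDB

// Hedged sketch of the addIsActiveToConversationLocalState migration.
var migrator = DatabaseMigrator()
migrator.registerMigration("addIsActiveToConversationLocalState") { db in
    try db.alter(table: "conversation_local_state") { t in
        // Default true so every pre-existing conversation stays active;
        // RestoreManager flips rows to false after archive import.
        t.add(column: "isActive", .boolean).notNull().defaults(to: true)
    }
}
```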

* Remove local settings from tracked files

* Stop tracking .claude/settings.local.json (already gitignored)

* Fix infinite recursion in createArchive protocol bridge

The protocol method createArchive(path:encryptionKey:) was calling
itself instead of the SDK's createArchive(path:encryptionKey:opts:)
because the compiler resolved the overload back to the protocol.
Explicitly cast to XMTPiOS.Client, same pattern as importArchive.

* Fix infinite recursion in importArchive protocol bridge

The cast to XMTPiOS.Client didn't help because the SDK method has
the exact same signature as the protocol method — Swift resolved
it back to the protocol bridge. Call through ffiClient directly
to bypass the protocol dispatch entirely.

* Remove archive protocol bridge methods causing infinite recursion

XMTPiOS.Client satisfies both createArchive and importArchive
protocol requirements directly via its SDK implementations.
No explicit bridge needed, no recursion risk.

* Remove createArchive/importArchive from XMTPClientProvider protocol

These methods don't belong on the messaging protocol. All callers
in the backup/restore layer use XMTPiOS.Client directly, not the
protocol abstraction. ConvosBackupArchiveProvider.buildClient now
returns Client instead of any XMTPClientProvider.

No wrapper, no recursion, no redundant protocol surface.

* Add diagnostic logging for inactive conversation reactivation

- Remove try? in RestoreManager markAllConversationsInactive to
  surface any errors
- Add per-conversation debug logging in reactivation check to
  trace isActive() return values
- Helps diagnose whether importArchive causes isActive() to
  return true immediately (defeating the inactive flag)

* Reactivate conversations on inbound message, change join copy

1. Reactivation: move from isActive() polling in SyncingManager to
   message-driven reactivation in StreamProcessor. When an inbound
   message arrives for a conversation with isActive=false, flip it
   to active immediately. This avoids the issue where importArchive
   causes isActive() to return true on stale MLS state.

2. Copy: change 'You joined as <name>' to 'Reconnected' for the
   system message shown when the current user's installation is
   re-added to a group.

* Revert 'Reconnected' copy change

The summary in ConversationUpdate has no restore context — changing
it here would affect all self-join scenarios, not just post-restore.
The copy change belongs in Phase 4 (UI layer) where isActive is
available to distinguish reconnection from first join.

* Show 'Reconnected' for post-restore self-join messages

Thread isReconnection flag through the message pipeline:
- DBMessage.Update: add isReconnection (default false, backward
  compatible with existing JSON)
- StreamProcessor: when an inbound message arrives for an inactive
  conversation, set isReconnection=true on the stored GroupUpdated
  message before reactivating the conversation
- ConversationUpdate: surface isReconnection from DB hydration
- summary: return 'Reconnected' instead of 'You joined as X'
  when isReconnection is true

Normal first-time joins are unaffected.

* Fix isReconnection decoding for existing messages

Add custom Decodable init to DBMessage.Update that defaults
isReconnection to false when the key is missing. Existing
GroupUpdated messages in the DB were stored before this field
existed, causing keyNotFound decoding errors and blank message
history.
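The backward-compatible decoding looks roughly like this; the type and key names follow the commit message, while the other field is a placeholder:

```swift
import Foundation

// Sketch of a Decodable init that tolerates a missing key.
struct Update: Codable {
    var initiatedByInboxId: String   // placeholder existing field
    var isReconnection: Bool

    enum CodingKeys: String, CodingKey {
        case initiatedByInboxId, isReconnection
    }

    init(from decoder: Decoder) throws {
        let c = try decoder.container(keyedBy: CodingKeys.self)
        initiatedByInboxId = try c.decode(String.self,
                                          forKey: .initiatedByInboxId)
        // Rows stored before this field existed carry no key; default to
        // false instead of throwing keyNotFound.
        isReconnection = try c.decodeIfPresent(Bool.self,
                                               forKey: .isReconnection) ?? false
    }
}
```

Decoding legacy JSON without the key now yields `isReconnection == false` instead of failing the whole message hydration.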

* Document quickname backup as deferred for v1

* Implement inactive conversation UI (Phases 3 & 4)

Phase 3 — conversations list:
- Show 'Awaiting' label with dot separator in subtitle when
  conversation.isActive is false (same pattern as 'Verifying')
- Suppress unread/muted indicators for inactive conversations

Phase 4 — conversation detail:
- InactiveConversationBanner: pill overlay above composer with
  clock.badge.checkmark.fill icon, 'History restored' title,
  subtext explaining the state, tappable to learn.convos.org
- Intercept send, reaction, and reply actions when inactive —
  show 'Awaiting reconnection' alert instead of executing
- Disable top bar trailing item and text field when inactive
- All UI clears reactively when isActive flips to true via
  GRDB ValueObservation (no restart needed)

* Fix InactiveConversationBanner layout to match Figma design

* Update inactive banner: cloud.fill icon, 'Restored from backup'

- Change icon from clock.badge.checkmark.fill to cloud.fill
- Change title from 'History restored' to 'Restored from backup'
- Move banner from overlay into bottomBarContent so it participates
  in bottom bar height measurement — messages get proper inset and
  don't render behind the banner

* Use colorLava for cloud icon in inactive banner

* Fix reactivation for messages arriving via sync path

Messages can arrive through two paths:
1. Message stream → processMessage → markReconnectionIfNeeded ✓
2. Sync/discovery → processConversation → storeWithLatestMessages

Only path 1 had reactivation. Add reactivateIfNeeded after
storeWithLatestMessages in processConversation so the first
message (typically the GroupUpdated 'You joined') properly
clears the inactive state.

* Mark reconnection messages in sync path + re-adopt profiles after restore

1. reactivateIfNeeded now marks recent GroupUpdated messages as
   isReconnection=true before flipping isActive, so the 'You joined'
   message shows 'Reconnected' even when arriving via sync path.

2. ConversationWriter: when conversation is inactive (post-restore),
   force-apply member profiles from XMTP group metadata to GRDB,
   bypassing the 'existing profile has data' skip. This re-adopts
   display names and avatars from the group's custom metadata after
   restore, preventing 'Somebody' fallback.

* Show spinner only on active action during restore

Track which action is running via activeAction string. Spinner
only shows next to the active button. Hide the Actions section
entirely during restore so only the restore row with its spinner
is visible.

* Fix first message after restore not appearing in history

After restore, conversationWriter.store() calls conversation.sync()
which can fail on stale MLS state (new installation not yet re-added
to the group). This threw an error that prevented messageWriter.store
from ever running, so the first inbound message was lost.

Fix: if conversationWriter.store() fails, fall back to the existing
DBConversation from GRDB (which exists from the backup) and continue
storing the message. The conversation will be properly synced on the
next successful sync cycle.

Also refactor reactivateIfNeeded to extract markRecentUpdatesAsReconnection
for clarity.

* Try all vault keys on restore + detailed debug view

RestoreManager: try decrypting the backup bundle with every
available vault key (local + iCloud) instead of just the first.
This handles the case where device B created its own vault key
before iCloud sync propagated device A's key.

VaultKeySyncDebugView: add detailed sections showing:
- Each vault key with inboxId, clientId, and which stores
  (local/iCloud) it exists in, with green checkmark for active
- iCloud backup files with device name, date, size, inbox count

Makes it easy to diagnose vault key divergence between devices.

* Enable iCloud Keychain sync for vault keys

The iCloud vault keychain store was using
kSecAttrAccessibleAfterFirstUnlock but never set
kSecAttrSynchronizable — so items stayed local on each device
even under the same Apple ID.

Add synchronizable flag to KeychainIdentityStore and set it to
true for the iCloud vault store. This ensures vault keys
propagate between devices via iCloud Keychain, so device B can
decrypt backups created by device A.

Updated all iCloud vault store creation sites:
- ConvosClient+App.swift (production)
- BackupDebugView.swift
- VaultKeySyncDebugView.swift
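The save path with the flag can be sketched as below; the service name is a placeholder:

```swift
import Foundation
import Security

// Hedged sketch of saving a vault key with the synchronizable attribute.
func saveVaultKey(_ keyData: Data,
                  account: String,
                  synchronizable: Bool) -> OSStatus {
    var query: [CFString: Any] = [
        kSecClass: kSecClassGenericPassword,
        kSecAttrService: "vault-key-store",   // placeholder service name
        kSecAttrAccount: account,
        kSecValueData: keyData,
        kSecAttrAccessible: kSecAttrAccessibleAfterFirstUnlock,
    ]
    if synchronizable {
        // Without this attribute, items stay device-local even under the
        // same Apple ID; with it, iCloud Keychain propagates them.
        query[kSecAttrSynchronizable] = kCFBooleanTrue
    }
    return SecItemAdd(query as CFDictionary, nil)
}
```

Note that read queries must pass the same `kSecAttrSynchronizable` value (or `kSecAttrSynchronizableAny`) to match synchronizable items.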

* Add per-key actions in vault debug view

Each vault key now shows inline buttons:
- 'Delete local' — removes only the local copy for that inboxId
- 'Delete iCloud' — removes only the iCloud copy for that inboxId
- 'Adopt locally' — shown for iCloud-only keys, copies to local

This lets you surgically delete the wrong local vault key and
adopt the correct iCloud one without affecting other keys.

* Allow vault clientId update after restore or cross-device sync

After restore or iCloud key sync, the vault inboxId is the same
but the XMTP installation ID (clientId) is different because each
device gets its own installation. Previously this threw
clientIdMismatch and failed vault bootstrap.

Fix:
- InboxWriter: for vault inboxes, update clientId in GRDB instead
  of throwing when it changes
- VaultManager: re-save vault key to keychain with new clientId
  when it differs from the stored one (not just for vault-pending)

* Fix restore spinner not showing

runAction title was 'Restore' but actionLabel title was
'Restore from backup' — the mismatch meant activeAction
never matched and the spinner never appeared.

* Skip quickname onboarding for restored conversations

After restore, the conversation already has a profile from XMTP
group metadata. But UserDefaults (quickname state) aren't backed
up, so the onboarding coordinator thinks the user never set their
name and shows 'Add your name for this convo'.

Skip startOnboarding() entirely when conversation.isActive is
false — the profile is already restored from metadata.

* Update tests to include isActive and isReconnection parameters

MessagesListProcessorTests and related files need the new
parameters added in this branch:
- ConversationUpdate.isReconnection
- ConversationLocalState.isActive

* Fix delete all data leaving orphaned conversations (#628)

* Fix delete all data leaving orphaned conversations

InboxWriter.deleteAll() only deleted inbox rows — conversations,
messages, members, profiles, and all other tables were left behind.
No foreign key cascade from inbox to conversation exists, so the
data persisted and reappeared on next launch.

Now deletes all rows from every GRDB table in dependency order
within a single write transaction.
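The wipe can be sketched as a single transaction; the table list and its ordering here are assumptions (children before parents, since no FK cascade exists):

```swift
import GRDB

// Hedged sketch of deleteAll across every table in dependency order.
func deleteAll(_ writer: DatabaseWriter) throws {
    try writer.write { db in
        // One transaction: either everything is wiped or nothing is.
        for table in [
            "message", "conversation_local_state", "member_profile",
            "conversation_member", "conversation", "inbox",
        ] {
            do {
                try db.execute(sql: "DELETE FROM \(table)")
            } catch {
                // Per-table logging pinpoints FK-ordering mistakes.
                print("deleteAll failed for \(table): \(error)")
                throw error
            }
        }
    }
}
```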

* Address code review: fix FK ordering, add test, add logging

- Fix foreign key ordering: conversation_members before invite
  (invite has FK referencing conversation_members)
- Add per-table error logging for debugging failed deletions
- Add testDeleteAllRemovesAllData: creates full data graph
  (inbox, conversation, member, profile, local state,
  conversation_members), calls deleteAll(), verifies all empty

* Address code review feedback on PR #626

Three fixes flagged by Macroscope and Claude review:

1. iCloud Keychain migration (critical)
   Add migrateToSynchronizableIfNeeded() that finds existing
   non-synchronizable items and re-saves them with
   kSecAttrSynchronizable: true. Without this, existing users
   would lose access to vault keys after enabling the
   synchronizable flag (queries with synchronizable:true don't
   match non-synchronizable items).
   Called at app launch from ConvosClient+App before the
   VaultKeyStore is created.

2. CancellationError propagation in StreamProcessor
   processMessage catch block for conversationWriter.store now
   rethrows CancellationError instead of falling through to the
   existing-DBConversation fallback. Prevents runaway work after
   task cancellation.

3. InboxWriter vault clientId update early-return
   When vault clientId changed, we were returning early and
   skipping the installationId update logic. Now we continue
   through to update installationId if the caller provided one.

* Add plan for vault re-creation on restore

After restore, the vault cannot be reactivated the same way
conversations can — the vault is the user's own group with no
third party to trigger MLS re-addition. This plan proposes
creating a fresh vault (new inboxId, new keys) on restore,
renaming the old vault DB as a read-only artifact, and
broadcasting the restored conversation keys to the new vault
so it becomes the source of truth for future device pairs.

* Add VaultManager.reCreate method for post-restore vault rebuild

Tears down the current vault and bootstraps a fresh one with new
keys. Used after restore to replace the inactive restored vault
with a working one.

Flow:
  1. Disconnect current vault client
  2. Delete old vault key from local + iCloud keychain
  3. Delete old DBInbox row for the vault
  4. Re-run bootstrapVault() which generates fresh keys and
     creates a new MLS group

The old vault XMTP database file is left on disk as an orphan
to avoid filesystem risk. A future cleanup pass can remove
orphaned vault DB files.

* Wire vault re-creation into RestoreManager

After marking conversations inactive, call VaultManager.reCreate
to tear down the restored (read-only) vault and bootstrap a new
one with fresh keys. Then broadcast the restored conversation
keys to the new vault via shareAllKeys so future device pairs
can sync through the new vault.

VaultManager is optional on RestoreManager to keep existing tests
working without a live vault. The backup debug view injects it
from the session.

If vault re-creation fails, the restore still completes — the
user has working conversations, just no functional vault. They
can retry or manually re-create later.

* Add comprehensive logging for vault re-creation flow

Tag-prefixed logs across the full re-creation path to make it
easy to verify on device:

- [Vault.reCreate] === START === / === DONE === boundaries
- Before/after snapshot of inboxId, installationId, bootstrap
  state, keychain key count, and GRDB vault inbox row count
- Step-by-step 1/4 through 4/4 markers
- Verification log comparing old vs new inboxId (warns loudly
  if they match — indicates failed re-creation)
- [Vault.bootstrap] logs XMTP connect, identity transition,
  keychain persist, and GRDB save
- [Vault.loadOrCreateVaultIdentity] logs whether existing key
  was found or fresh keys generated
- [VaultKeyCoordinator.shareAllKeys] logs row count, packaged
  key count, missing identities, and send success
- [Restore.reCreateVault] mirrors the same pattern on the
  restore side

* Update plans: revoke old vault in reCreate + stale device detection

- vault-re-creation: add step 8a to revoke device A's installation
  from the old vault before disconnecting, mirroring the
  per-conversation revocation that already happens. Steps
  re-lettered 8a through 8g.
- stale-device-detection: new plan for detecting when the current
  device has been revoked (via client.inboxState refresh from
  network) and showing a 'device replaced' banner with a
  delete-data CTA. Covers detection triggers, storage, and UI.

* Revoke old vault installations before re-creating

Step 1/5 of VaultManager.reCreate now calls
revokeAllOtherInstallations on the old vault before disconnecting,
mirroring the per-conversation revocation that already happens
on restored conversation inboxes.

This kills device A's zombie vault installation so the old vault
is fully decommissioned (not just orphaned). Combined with the
conversation-level revocations, device A is now completely
replaced after restore.

Failure is non-fatal — logged as a warning, re-creation
continues. Worst case, device A keeps a zombie vault installation,
but the new vault on device B still works.

* Add isStale column to DBInbox with migration

Foundation for stale device detection: a per-inbox flag
indicating the current device's installation has been revoked
(e.g., because the user restored on a different device).
Defaults to false, nullable migration for existing rows.

* Add stale installation detection in InboxStateMachine

After inbox reaches ready state, call client.isInstallationActive
which queries XMTP inboxState(refreshFromNetwork: true) and checks
if our installationId is still in the active installations list.

If not → device was revoked (typically because user restored on
another device) → mark the inbox as stale in GRDB, emit QA event.

New XMTPClientProvider method isInstallationActive() wraps the
SDK call and returns Bool for clarity.

Added 'inbox' category to QAEvent for the stale_detected event.

* Also check stale installation on app foreground

When the app returns to foreground, inboxes might have been
revoked on another device (e.g., user restored on device B
while device A was backgrounded). Re-run isInstallationActive
check in handleEnterForeground to catch this on return.

* Surface isStale on Inbox model + reactive publisher

- Add isStale to Inbox domain model and DBInbox hydration
- Add InboxesRepository.anyInboxStalePublisher that emits true
  when any non-vault inbox has isStale = true
- Subscribe in ConversationsViewModel, expose isDeviceStale
  property for the view layer to bind to

* Add StaleDeviceBanner UI + hide stale conversations

- New StaleDeviceBanner view: exclamation icon, title, explanation,
  destructive 'Delete data and start fresh' button (wires into
  existing deleteAllData flow)
- Banner inserted above conversations list via
  safeAreaInset(edge: .top) when viewModel.isDeviceStale is true
- ConversationsViewModel: subscribe to staleInboxIdsPublisher and
  filter out any conversations belonging to stale inboxes, both
  on initial load and on reactive updates from GRDB
- Inline the isInstallationActive check in handleStart /
  handleEnterForeground to avoid Sendable issues passing
  any XMTPClientProvider across actor boundaries

* Fix synchronizable migration data loss risk

Macroscope flagged the migration as losing data if SecItemAdd
fails after the original item is deleted. Reverse the order:
add the synchronizable copy first, then delete the
non-synchronizable original — matching the pattern in
migrateItemToPlainAccessibility. This ensures the data is
never left unrecoverable.

If add fails: original is intact, retry next launch.
If delete fails after successful add: synchronizable copy is
in place, retry next launch (idempotent — the second add will
succeed with errSecDuplicateItem).
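The add-first-then-delete ordering can be sketched with an in-memory dictionary standing in for the keychain; `FakeKeychain` and `migrateToSynchronizable` are hypothetical names, and errSecDuplicateItem semantics are modeled by letting a duplicate add simply overwrite.

```swift
enum StoreError: Error { case addFailed, deleteFailed }

struct FakeKeychain {
    var items: [String: String] = [:]   // key -> payload
    var failNextAdd = false
    var failNextDelete = false

    mutating func add(_ key: String, _ payload: String) throws {
        if failNextAdd { failNextAdd = false; throw StoreError.addFailed }
        items[key] = payload            // duplicate add just overwrites
    }
    mutating func delete(_ key: String) throws {
        if failNextDelete { failNextDelete = false; throw StoreError.deleteFailed }
        items.removeValue(forKey: key)
    }
}

// Migrate a non-synchronizable item to its synchronizable twin.
// Add the copy first, delete the original second, so a failure at
// either step leaves at least one recoverable copy on disk.
func migrateToSynchronizable(_ kc: inout FakeKeychain, key: String) throws {
    guard let payload = kc.items[key] else { return }  // nothing to migrate
    try kc.add("sync." + key, payload)  // step 1: add synchronizable copy
    try kc.delete(key)                  // step 2: delete the original
}
```

Retrying on the next launch is idempotent in both failure modes: a failed add leaves the original intact, and a failed delete leaves both copies, which the next run cleans up.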

* Fail VaultManager.reCreate when old key deletion fails or inboxId matches

Two related fixes flagged by Macroscope:

1. If vaultKeyStore.delete(inboxId:) throws during step 3/5,
   we previously logged a warning and continued. The next
   bootstrap then found the old identity and reused it,
   producing the same inboxId — meaning no new vault was
   actually created. Now throws VaultReCreateError.bootstrapFailed.

2. If after bootstrap the new inboxId matches the old one
   (because deletion didn't actually remove the key), throw
   instead of just logging a warning. The function previously
   returned successfully despite re-creation having failed.

Both errors propagate up to RestoreManager.reCreateVault which
already catches and logs without aborting the restore — the
user can still see their conversations, just with no functional
vault, and can retry.

* decryptBundle: don't let staging dir reset failure escape the loop

If FileManager.createDirectory threw (e.g. disk full) between
key attempts, the error escaped the for loop and we surfaced
that confusing error instead of RestoreError.decryptionFailed.
Catch and log it so the loop continues to the next key.
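A minimal sketch of the key-trial loop, assuming simplified closures in place of the real staging-dir reset and decryption calls (the function shape is illustrative, not the actual RestoreManager signature):

```swift
enum RestoreError: Error { case decryptionFailed }
enum StagingError: Error { case diskFull }

func decryptBundle(
    keys: [String],
    resetStagingDir: () throws -> Void,
    tryDecrypt: (String) -> Bool
) throws -> String {
    for key in keys {
        do {
            try resetStagingDir()
        } catch {
            // e.g. disk full: log and try the next key instead of
            // letting this error escape the loop.
            print("staging reset failed: \(error)")
            continue
        }
        if tryDecrypt(key) { return key }
    }
    // No key worked: the caller always sees the expected error.
    throw RestoreError.decryptionFailed
}
```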

* shareAllKeys: only mark successfully shared inboxes

When identityStore.identity(for:) failed for an inbox, it was
skipped from the keys array but still marked sharedToVault=1
in the database. This caused inboxes whose keys were never
actually shared to be flagged as shared, preventing future
sync attempts.

Now we mark only the inboxIds that ended up in the keys array.
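The fix amounts to collecting the packaged keys and the successfully shared inboxIds in the same pass; `packageKeys` and `identityFor` are hypothetical stand-ins for the shareAllKeys internals and identityStore.identity(for:).

```swift
struct KeyEntry { let inboxId: String; let key: String }

func packageKeys(
    inboxIds: [String],
    identityFor: (String) -> String?
) -> (keys: [KeyEntry], sharedInboxIds: [String]) {
    var keys: [KeyEntry] = []
    var shared: [String] = []
    for inboxId in inboxIds {
        guard let identity = identityFor(inboxId) else {
            print("no identity for \(inboxId), skipping")
            continue  // not appended to `shared` either
        }
        keys.append(KeyEntry(inboxId: inboxId, key: identity))
        shared.append(inboxId)  // only inboxes whose key made it in
    }
    return (keys, shared)
}
```

Only `sharedInboxIds` is then flagged sharedToVault=1, so a skipped inbox stays eligible for the next sync attempt.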

* Clear conversation selection when its inbox becomes stale

When staleInboxIdsPublisher fires and removes conversations from
the visible list, clear _selectedConversationId if the previously
selected conversation is no longer present. Otherwise the detail
view shows a conversation that no longer appears in the list.

Also fix the existing conversationsPublisher selection check to
inspect the filtered list (self.conversations) rather than the
unfiltered parameter, so stale selections are properly cleared
on either trigger.
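The rule both triggers now follow can be sketched as a single helper that filters and then checks the selection against the filtered result; `Convo` and `applyStaleFilter` are simplified stand-ins for the view model's state.

```swift
struct Convo { let id: String; let inboxId: String }

func applyStaleFilter(
    conversations: [Convo],
    staleInboxIds: Set<String>,
    selectedId: inout String?
) -> [Convo] {
    let visible = conversations.filter { !staleInboxIds.contains($0.inboxId) }
    // Inspect the *filtered* list: if the selected conversation is no
    // longer visible, clear the selection so the detail view cannot
    // show a row the list no longer contains.
    if let selected = selectedId,
       !visible.contains(where: { $0.id == selected }) {
        selectedId = nil
    }
    return visible
}
```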

* Fix vault re-create orphan rows + add inactive mode tests

Macroscope flagged a real bug at VaultManager:317: when oldInboxId
is nil, step 3 deletes all vault keys from the keychain but step 4
skipped DBInbox cleanup. The next bootstrapVault would fail with
InboxWriterError.duplicateVault. Now step 4 deletes all vault rows
when oldInboxId is nil, matching the keychain cleanup behavior.
Also escalated step 4 errors to throw VaultReCreateError.

Test coverage for previously-untested inactive mode logic:
- ConversationLocalStateWriterTests (5 tests):
  setActive toggle, scoped to one conversation, throws on
  unknown conversation, markAllConversationsInactive bulk
  flip, no-op on empty database
- InboxWriterTests adds 2 tests for markStale (toggle + scoping)
- DBMessageUpdateDecodingTests (4 tests):
  legacy JSON without isReconnection defaults to false,
  decode with isReconnection=true preserves value, encode
  round-trip, memberwise init default

46/46 tests pass in the affected suites.

* Refine stale-device UX and update recovery policy plan

* Add StaleDeviceState enum + reactive publisher

Derives healthy/partialStale/fullStale from the set of used
non-vault inboxes and how many are flagged isStale=true:
- healthy: no stale inboxes (or no used inboxes)
- partialStale: some but not all used inboxes are stale
- fullStale: every used inbox is stale (device effectively dead)

Drives the partial-vs-full UX policy described in
docs/plans/stale-device-detection.md. Existing
anyInboxStalePublisher and staleInboxIdsPublisher are kept
for backward compatibility.
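The derivation above can be sketched as a pure function; `InboxSnapshot` is a hypothetical shape, and the vault inbox is assumed to be excluded before calling it. An inbox counts as "used" when it has at least one conversation.

```swift
enum StaleDeviceState: Equatable {
    case healthy, partialStale, fullStale
}

struct InboxSnapshot {
    let isStale: Bool
    let conversationCount: Int
}

func deriveStaleDeviceState(_ inboxes: [InboxSnapshot]) -> StaleDeviceState {
    let used = inboxes.filter { $0.conversationCount > 0 }
    guard !used.isEmpty else { return .healthy }   // no used inboxes
    let staleCount = used.filter(\.isStale).count
    switch staleCount {
    case 0:          return .healthy
    case used.count: return .fullStale             // every used inbox stale
    default:         return .partialStale
    }
}
```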

* ConversationsViewModel: derive partial vs full stale state

Replace single isDeviceStale boolean with StaleDeviceState enum
driving:
- isDeviceStale: any inbox stale (banner visibility)
- isFullStale: every inbox stale
- canStartOrJoinConversations: false only when fullStale (was
  blocking new convos in any stale state previously — now allows
  them in healthy inboxes during partial stale)

Add handleStaleStateTransition that logs every transition and
schedules an auto-reset on healthy/partial → fullStale via
isPendingFullStaleAutoReset. The view layer binds to this flag
to render a cancellable countdown UI.

Add cancelFullStaleAutoReset / confirmFullStaleAutoReset for the
UI to call from the countdown.

* Stale device UI: partial vs full copy, countdown auto-reset

StaleDeviceBanner now takes a StaleDeviceState and renders
distinct copy + title for partial vs full:
- Partial: 'Some conversations moved to another device' / 'Some
  inboxes have been restored on another device. Resetting will
  clear local data...'
- Full: 'This device has been replaced' / 'Your account has been
  restored on another device. Resetting will clear local data...'

Action button renamed from 'Delete data and start fresh' to
'Reset device' so the destructive intent is in the verb itself
(per plan: 'Continue' was misleading).

StaleDeviceInfoView (Learn more sheet) now also varies copy by
state with two distinct title + paragraph sets.

New FullStaleAutoResetCountdown component shown via
fullScreenCover when ConversationsViewModel.isPendingFullStaleAutoReset
fires. 5-second countdown with Cancel and 'Reset now' actions.
The countdown task auto-fires onReset when remaining hits 0.

Wired into ConversationsView so when staleDeviceState transitions
to fullStale, the cover automatically presents.

* Add tests for StaleDeviceState derivation

8 tests covering the partial-vs-full state matrix:
- healthy when no inboxes
- healthy when inbox has no conversations
- healthy when used inbox is not stale
- partialStale when one of two used inboxes is stale
- fullStale when every used inbox is stale
- vault inbox is excluded from the count
- unused conversations don't make an inbox 'used'
- StaleDeviceState convenience flags (hasUsableInboxes,
  hasAnyStaleInboxes)

54/54 tests pass in affected suites.

* Vault bootstrap: log when installation is stale (diagnostic only)

After bootstrap, query inboxState(refreshFromNetwork: true) and
log a warning + QAEvent if our installationId isn't in the active
list. This means another device restored from backup and revoked
this vault installation.

The conversation-level stale detection (InboxStateMachine) is
what drives user-facing recovery — when a device is fully
replaced, all conversation inboxes go stale and the auto-reset
fires before vault state matters in practice.

This log is purely for diagnosing rare partial-revocation states
during QA. We don't add user-facing UI for vault staleness
because:
- in full-stale, conversation-level reset wipes the device anyway
- in partial-stale, the user isn't trying to use vault-dependent
  features in normal flows
- the only user-visible impact is 'pair new device' silently
  failing, which is rare and easily explained

* Don't dismiss compose flow on partial stale inbox updates

The staleInboxIdsPublisher sink was unconditionally nilling
newConversationViewModel whenever isDeviceStale was true, which
returns true for both partialStale and fullStale states. This
contradicted handleStaleStateTransition which explicitly keeps
the compose flow alive during partial stale so the user can
finish creating a conversation in a still-healthy inbox.

Remove the redundant check entirely — handleStaleStateTransition
is the single source of truth for closing in-flight flows on
transition to fullStale. The sink now only handles visible-list
filtering and selection clearing.

* Address Macroscope feedback on PR #618

Two low-severity findings on the restore flow:

1. VaultKeyStore.deleteLocal: the non-iCloud fallback called
   store.delete(inboxId:) which permanently deletes the key,
   contradicting the documented purpose of 'preserving the
   iCloud copy'. In production we always use ICloudIdentityStore
   so this only mattered in tests/mocks, but the contract was
   wrong. Now it's a no-op with a warning when the underlying
   store isn't iCloud-backed; tests that need to verify deletion
   should call delete(inboxId:) directly.

2. MockDatabaseManager.replaceDatabase didn't run migrations
   after restoring the database, unlike DatabaseManager which
   calls SharedDatabaseMigrator.shared.migrate after restore.
   This caused tests to pass while production could fail on
   schema mismatches. Now runs the migrator post-restore to
   match production behavior.

* BackupBundle: harden path traversal check against symlinks

Macroscope flagged that the existing standardizedFileURL check
doesn't resolve symlinks, but fileData.write(to:) follows them.
A pre-existing symlink under the staging dir could escape the
first-pass check.

Fix: after creating the parent directory, re-validate with
.resolvingSymlinksInPath() on the parent. If the resolved path
is no longer inside the staging dir (because a symlink redirected
it), throw 'path traversal via symlink'.
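A sketch of the second-pass check, assuming a standalone validator (the real code inlines this in BackupBundle): resolve symlinks on both the parent and the staging dir, then require the resolved parent to stay inside the resolved staging dir.

```swift
import Foundation

enum BundleError: Error { case pathTraversalViaSymlink }

func validateParentInsideStaging(parent: URL, stagingDir: URL) throws {
    // Resolve symlinks on BOTH paths: the staging dir itself may sit
    // under a symlinked temp root, and the parent may be a symlink
    // planted by a malicious bundle.
    let resolvedParent = parent.resolvingSymlinksInPath().standardizedFileURL
    let resolvedStaging = stagingDir.resolvingSymlinksInPath().standardizedFileURL
    let parentPath = resolvedParent.path + "/"
    let stagingPath = resolvedStaging.path + "/"
    guard parentPath.hasPrefix(stagingPath) else {
        throw BundleError.pathTraversalViaSymlink
    }
}
```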

* RestoreManager: throw on missing vault archive

Macroscope flagged that importVaultArchive returned an empty
array when the vault archive was missing from the bundle, but
restoreFromBackup would still continue to wipe local XMTP state
and replace the database — silent data loss with no keys to
decrypt the resulting conversations.

The RestoreError.missingVaultArchive case was already defined
but never thrown. Throw it now before any destructive operation.
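The ordering matters more than the check itself: the guard must run before any destructive step. A minimal sketch, with `RestoreSteps` as a hypothetical stand-in tracking the destructive operations:

```swift
enum RestoreError: Error { case missingVaultArchive }

struct RestoreSteps {
    var wipedLocalState = false
    var replacedDatabase = false
}

func restoreFromBackup(
    vaultArchivePresent: Bool,
    steps: inout RestoreSteps
) throws {
    // Guard first: without the vault archive there are no keys to
    // decrypt the restored conversations, so abort before anything
    // destructive runs.
    guard vaultArchivePresent else { throw RestoreError.missingVaultArchive }
    steps.wipedLocalState = true    // wipe local XMTP state
    steps.replacedDatabase = true   // replace the GRDB database
}
```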

* RestoreManager: stop logging success after a try? swallowed an error

wipeLocalXMTPState was logging 'cleared conversation keychain
identities' unconditionally after try? await identityStore.deleteAll().
The success message appeared even when deleteAll threw and no
identities were actually cleared, hiding restore failures during
debugging.

Convert to do/catch and log a warning on failure (non-fatal —
restore continues either way).

* DatabaseManager: surface rollback failure as a dedicated error

Macroscope flagged that if the restore failed AND the rollback
also failed (e.g. migration threw on the rolled-back snapshot),
the catch block swallowed the rollback error and re-threw only
the original. Caller had no way to know the DB was now in a
potentially-inconsistent state.

Add DatabaseManagerError.rollbackFailed wrapping both the
original and rollback errors. Throw it instead of the original
when the rollback path itself throws — caller can distinguish
'restore failed cleanly' from 'restore failed and we couldn't
recover'.
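The error shape and control flow can be sketched like this; the `restore`/`rollback` closures are stand-ins for the real DatabaseManager operations, while the error case name follows the text above:

```swift
enum DatabaseManagerError: Error {
    case rollbackFailed(original: Error, rollback: Error)
}

func replaceDatabase(
    restore: () throws -> Void,
    rollback: () throws -> Void
) throws {
    do {
        try restore()
    } catch let original {
        do {
            try rollback()
        } catch let rollbackError {
            // Don't swallow the rollback error: the DB may now be in
            // an inconsistent state and the caller must know.
            throw DatabaseManagerError.rollbackFailed(
                original: original, rollback: rollbackError)
        }
        throw original  // rollback succeeded: rethrow the original
    }
}
```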

* Update test for missing-vault-archive: now expects abort

The 'missing vault archive is non-fatal' test was asserting the
old behavior. Updated to assert the restore aborts before any
destructive operation runs and the conversation archive importer
is never invoked.

* Inbox: decode isStale with backward-compatible fallback

Macroscope and Claude flagged that adding isStale as a non-optional
stored property broke Codable backwards compatibility — Swift's
synthesized Decodable ignores default init parameter values, so any
JSON payload missing 'isStale' would fail to decode with keyNotFound.

Inbox is currently only hydrated from GRDB columns (not JSON), so
there's no active bug, but the Codable conformance is a footgun for
any future JSON path (cache, push payload, etc).

Add a custom init(from:) that uses decodeIfPresent for isStale with
a 'false' fallback. No API change.

* VaultManager.bootstrap: save-first-then-delete on keychain update

Macroscope and Claude flagged the delete-then-save-with-try? pattern
when the vault identity is being persisted/updated. If the delete
succeeded but the save failed (keychain access denied, etc), the
vault key was permanently lost — the next bootstrap would generate
a new inboxId and InboxWriter.save would throw duplicateVault
because the old GRDB row still existed.

Fix: save the new/updated entry first (which will throw on failure
and abort the bootstrap with the old key still intact), then delete
the old entry only if save succeeded. Delete failures are now
non-fatal because the vault is already recoverable from the new
entry — they just leave an orphan keychain item that gets cleaned
up on the next cycle.

Mirrors the same add-first-then-delete pattern we use for the
iCloud Keychain synchronizable migration.

* ConversationsViewModel: recompute visible list from unfiltered source

Macroscope and Claude flagged a real UX bug: when an inbox recovers
from stale back to healthy, its conversations remained permanently
hidden. The staleInboxIdsPublisher sink was filtering
self.conversations in-place with
  self.conversations = self.conversations.filter { ... }
which destroyed the source list. When a smaller set of stale IDs
later emitted, the filter ran against an already-filtered list and
the recovered conversations never reappeared. conversationsPublisher
observes a different table and only re-emits on DB change, so
nothing triggered re-hydration.

Fix: cache the publisher's output in unfilteredConversations and
derive the visible list via recomputeVisibleConversations(), called
from both the stale-ids sink and the conversations sink. Also call
it after hiddenConversationIds.remove so post-explosion rollback
behaves correctly.
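The unfiltered-source pattern can be sketched with a plain class standing in for the view model (names simplified): both sinks write to their own state and recompute the visible list from the cached full list, so a shrinking stale set lets recovered conversations reappear.

```swift
struct Convo: Equatable { let id: String; let inboxId: String }

final class ConversationList {
    private(set) var unfilteredConversations: [Convo] = []
    private(set) var visible: [Convo] = []
    private var staleInboxIds: Set<String> = []
    private var hiddenConversationIds: Set<String> = []

    func receive(conversations: [Convo]) {
        unfilteredConversations = conversations   // cache the full list
        recomputeVisibleConversations()
    }
    func receive(staleInboxIds: Set<String>) {
        self.staleInboxIds = staleInboxIds
        recomputeVisibleConversations()           // never filter in place
    }
    private func recomputeVisibleConversations() {
        visible = unfilteredConversations.filter {
            !staleInboxIds.contains($0.inboxId)
                && !hiddenConversationIds.contains($0.id)
        }
    }
}
```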

* StreamProcessor: handle CancellationError explicitly in processMessage

Macroscope flagged that 'throw CancellationError()' inside the inner
do/catch for conversationWriter.store was immediately caught by the
outer generic 'catch {}' and logged as an error rather than
propagated — defeating cooperative cancellation.

Since processMessage is declared 'async' (not 'async throws') and
changing the signature would ripple across the StreamProcessor
protocol and six call sites, handle it locally: add a dedicated
'catch is CancellationError' arm to both do/catch blocks that logs
at debug level and returns early.

The enclosing runMessageStream loop calls Task.checkCancellation()
at the top of every iteration, so cancellation still exits the
stream task — just one message later than ideal, which is
acceptable for cooperative cancellation semantics.
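The dedicated catch arm can be sketched like this, with a closure standing in for conversationWriter.store and a return value standing in for the log level (a generic `catch` alone would report the cancellation as an error):

```swift
func processMessage(store: () throws -> Void) -> String {
    do {
        try store()
        return "stored"
    } catch is CancellationError {
        // Cooperative cancellation: return early quietly; the stream
        // loop's Task.checkCancellation() exits the task on the next
        // iteration.
        return "cancelled"
    } catch {
        // Genuine failures still get the error path.
        return "error: \(error)"
    }
}
```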

* ConversationsViewModel: seed unfilteredConversations and fix explode ordering

Two follow-up bugs from the unfilteredConversations refactor that
Macroscope caught:

1. (Medium) init populated self.conversations from fetchAll() but
   left unfilteredConversations empty. When observe() subscribed to
   staleInboxIdsPublisher, GRDB's ValueObservation fired immediately,
   triggering recomputeVisibleConversations against an empty source
   and wiping the initial list. Seed both lists from the fetch.

2. (Low) explodeConversation was mutating self.conversations in
   place with removeAll, bypassing the unfiltered cache, and on
   success the conversation would briefly resurface because
   recomputeVisibleConversations filtered an unfiltered list that
   still contained it.

   Rework to the optimistic-hide pattern: insert into
   hiddenConversationIds and recompute (hides immediately), then on
   success drop from unfilteredConversations BEFORE clearing the
   hide marker (so the recompute doesn't resurface it), or on
   failure just clear the hide marker to bring it back.

* Restore reliability: rollback, revocation, and safety guards (#694)

* docs: add backup/restore follow-ups plan for deferred items

* BackupManager: make vault archive creation fatal

createVaultArchive was swallowing errors and allowing backups to
complete without a vault archive. Since the vault archive is the
only source of conversation keys during restore, a backup without
it is silently unrestorable — the user thinks they have a backup
but it contains nothing to decrypt the restored conversations.

broadcastKeysToVault was already fatal (throws BackupError), but
createVaultArchive (which exports the XMTP DB containing those
keys) was non-fatal. This mismatch meant: keys get sent to the
vau…