Skip to content

Allow empty custom-dictionary replacements to delete text#513

Open
bruncanepa wants to merge 1 commit into
altic-dev:mainfrom
bruncanepa:fix/dictionary-empty-replacements
Open

Allow empty custom-dictionary replacements to delete text#513
bruncanepa wants to merge 1 commit into
altic-dev:mainfrom
bruncanepa:fix/dictionary-empty-replacements

Conversation

@bruncanepa

Copy link
Copy Markdown

What & why

The Custom Dictionary can currently only rewrite text — every add/edit path requires a non-empty replacement. This PR lets a user map a trigger to an empty replacement so the matched text is simply removed from the transcript.

Motivating use case: removing the Spanish inverted marks ¿ / ¡ (map ¿ → empty) so they never appear in output.

The two blockers

  1. Validation. The Manual Add composer (canAddManualReplacement) and EditDictionaryEntrySheet.canSave both required a non-empty replacement. They now require only at least one (non-duplicate) trigger.

  2. A latent matching bug. ASRService.rebuildDictionaryCache compiled every trigger as \b<trigger>\b. ICU \b word boundaries never match around punctuation, so a punctuation-only trigger such as ¿ matched nothing — meaning even after relaxing validation, ¿ → (empty) would silently do nothing. Boundaries are now applied conditionally: a leading \b only when the trigger's first character is a word character, a trailing \b only when its last is.

    • Word triggers (cant, fluid voice) keep the exact current \b…\b behavior — no regression.
    • Punctuation triggers (¿, ¡, ...) now match and can be replaced/removed.
    • Only previously-dead punctuation entries change behavior; nothing that worked before is affected.

An empty replacement already produces an empty escapedTemplate, i.e. deletion, so no change to the apply loop was needed.

Scope

Deliberately minimal — three focused edits. Left untouched: the whitespace around a removed standalone word (literal delete), the LocalAPI replacement validation, and the voice-training merge path (training teaches a spelling, so an empty target stays disallowed there).

Verification

Validated the matching logic against the real NSRegularExpression/ICU engine:

Input Rule Output
¿Cómo estás? ¿ → ∅ Cómo estás?
Hola ¿cómo? ¡genial! ¿,¡ → ∅ Hola cómo? genial!
I use fluid voice daily fluid voiceFluidVoice I use FluidVoice daily
I cant cantina cant → ∅ I cantina (boundary respected, cantina untouched)

Note: I couldn't run a full app build in my environment (local Xcode toolchain issue unrelated to the change), so a maintainer build/CI check is welcome.

Users can now map a trigger to an empty replacement to remove it from
the transcript (e.g. the Spanish inverted marks "¿"/"¡").

Two things blocked this:

- Validation in the Manual Add composer and the Edit sheet required a
  non-empty replacement. Both now only require at least one trigger.
- More subtly, rebuildDictionaryCache wrapped every trigger in
  `\b…\b`. ICU word boundaries never match around punctuation, so a
  punctuation-only trigger like "¿" matched nothing even once empty
  replacements were allowed. Boundaries are now applied conditionally,
  only where the trigger edge is a word character, so punctuation
  triggers match while word triggers keep their existing behavior.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>

@chatgpt-codex-connector chatgpt-codex-connector Bot left a comment

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: 802072ca93

ℹ️ About Codex in GitHub

Codex has been enabled to automatically review pull requests in this repo. Reviews are triggered when you

  • Open a pull request for review
  • Mark a draft as ready
  • Comment "@codex review".

If Codex has suggestions, it will comment; otherwise it will react with 👍.

When you sign up for Codex through ChatGPT, Codex can also answer questions or update the PR, like "@codex address that feedback".


private var canAddManualReplacement: Bool {
// An empty replacement is allowed: it deletes the trigger from the transcript.
!self.manualTriggers.isEmpty &&

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

P2 Badge Preserve deletion rules in dictionary exports

When this new path allows a user to save trigger -> "", the entry works locally but is silently dropped by the dictionary transfer path: makeExportDocument() uses compactMap(Self.exportReplacement(from:)), and DictionaryTransferService.exportReplacement returns nil when to.isEmpty (lines 240-329). A user who exports/imports or migrates their dictionary after adding a deletion rule like ¿ -> ∅ will lose that rule, so the new feature is not reliably portable.

Useful? React with 👍 / 👎.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant