
fix(tts): normalize markdown before speech synthesis#5

Merged
rishiskhare merged 2 commits into main from vk/37cd-markdown-handlin
Apr 11, 2026

Conversation

@rishiskhare
Owner

Summary

  • add parser-backed markdown normalization before TTS chunking
  • convert common markdown constructs into readable speech-oriented plain text
  • add regression tests for headings, emphasis, links, lists, code fences, and real repository markdown

Problem

Markdown selections were being sent directly to TTS, so users heard source syntax such as "hash hash" and "asterisk asterisk", along with raw link and code markers, instead of natural prose.

Implementation

  • use pulldown-cmark for deterministic offline parsing
  • normalize headings, emphasis, links, blockquotes, task lists, tables, images, and section breaks for spoken output
  • simplify inline code for speech and intentionally replace fenced code blocks with the spoken placeholder "Code example omitted."
  • preserve punctuation and spacing needed for natural prosody
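
As a rough illustration of the fenced-code behaviour described above, here is a minimal std-only sketch. It is not the PR's implementation: the PR parses with pulldown-cmark, whereas this sketch swaps in a plain line scan so it runs without dependencies. The names omit_fenced_code and CODE_PLACEHOLDER are hypothetical, and only backtick fences are handled (tilde fences and indented code blocks are ignored).

```rust
/// Placeholder the PR description says replaces fenced code blocks.
const CODE_PLACEHOLDER: &str = "Code example omitted.";

/// Hypothetical sketch: replace each backtick-fenced block with the placeholder.
/// This is a plain line scan, NOT the pulldown-cmark parser the PR uses.
fn omit_fenced_code(markdown: &str) -> String {
    // Build the three-backtick fence marker at runtime.
    let fence = "`".repeat(3);
    let mut out: Vec<&str> = Vec::new();
    let mut in_fence = false;

    for line in markdown.lines() {
        if line.trim_start().starts_with(fence.as_str()) {
            // An opening fence emits the placeholder once; a closing fence emits nothing.
            if !in_fence {
                out.push(CODE_PLACEHOLDER);
            }
            in_fence = !in_fence;
        } else if !in_fence {
            // Lines outside a fence pass through unchanged.
            out.push(line);
        }
    }
    out.join("\n")
}
```

A parser-backed approach, as the PR uses, avoids the edge cases a line scan gets wrong, such as fences nested inside blockquotes or list items.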

Validation

  • cargo test text_normalization -- --nocapture
  • cargo check

Add parser-backed markdown normalization before TTS, sync locale schemas required by translation validation, and remove the stale Rust test workflow mock step so the PR remains green.
@rishiskhare rishiskhare force-pushed the vk/37cd-markdown-handlin branch from 45ef2b6 to debbf51 Compare April 10, 2026 22:33

@gemini-code-assist gemini-code-assist Bot left a comment

Code Review

This pull request introduces a text normalization module for the TTS system, utilizing the pulldown-cmark library to convert Markdown content into speech-friendly text. Key changes include the integration of this normalization step into the TTSManager and the addition of logic to handle various Markdown elements like headings, lists, and links while omitting complex code blocks. Feedback focuses on improving the robustness of HTML entity decoding and simplifying whitespace normalization logic.

Comment on lines +325 to +342
fn normalize_inline_whitespace(text: &str) -> String {
    let mut normalized = String::new();
    let mut last_was_space = false;

    for ch in text.chars() {
        if ch.is_whitespace() {
            if !last_was_space {
                normalized.push(' ');
                last_was_space = true;
            }
        } else {
            normalized.push(ch);
            last_was_space = false;
        }
    }

    normalized.trim().to_string()
}

medium

This function can be simplified with the idiomatic split_whitespace() followed by join(" ") pattern, which collapses whitespace runs and trims both ends in a more concise and readable way.

fn normalize_inline_whitespace(text: &str) -> String {
    text.split_whitespace().collect::<Vec<_>>().join(" ")
}
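
To check that the suggested one-liner preserves behaviour, the two versions can be compared side by side. The sketch below copies both bodies under the hypothetical names normalize_loop and normalize_split and adds a hypothetical versions_agree helper. Both rely on the same Unicode whitespace definition (char::is_whitespace and str::split_whitespace), so they should agree on any input.

```rust
/// Loop-based version from the diff: collapse each whitespace run to one space, then trim.
fn normalize_loop(text: &str) -> String {
    let mut normalized = String::new();
    let mut last_was_space = false;

    for ch in text.chars() {
        if ch.is_whitespace() {
            if !last_was_space {
                normalized.push(' ');
                last_was_space = true;
            }
        } else {
            normalized.push(ch);
            last_was_space = false;
        }
    }

    normalized.trim().to_string()
}

/// Suggested version: split on Unicode whitespace and rejoin with single spaces.
fn normalize_split(text: &str) -> String {
    text.split_whitespace().collect::<Vec<_>>().join(" ")
}

/// Hypothetical helper: true when both versions produce the same output for every sample.
fn versions_agree(samples: &[&str]) -> bool {
    samples.iter().all(|s| normalize_loop(s) == normalize_split(s))
}
```

For example, both functions turn "  a \t\n b  " into "a b", and both return an empty string for whitespace-only input.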

Comment thread src-tauri/src/text_normalization.rs Outdated
Comment on lines +421 to +429
out.push_str(match entity.as_str() {
    "amp" => "&",
    "lt" => "<",
    "gt" => ">",
    "quot" => "\"",
    "apos" | "#39" => "'",
    "nbsp" => " ",
    _ => "",
});

medium

The current HTML entity decoding is limited, only handling a few named entities and one specific numeric entity (#39). This can be made more robust by handling all numeric entities (both decimal and hexadecimal), which are common in HTML. This would improve the accuracy of the text normalization for a wider range of inputs.

                let decoded = if let Some(code) = entity.strip_prefix('#') {
                    let (radix, code_str) = if let Some(hex_code) = code.strip_prefix('x') {
                        (16, hex_code)
                    } else {
                        (10, code)
                    };
                    u32::from_str_radix(code_str, radix)
                        .ok()
                        .and_then(std::char::from_u32)
                        .map(|c| c.to_string())
                        .unwrap_or_default()
                } else {
                    match entity.as_str() {
                        "amp" => "&".to_string(),
                        "lt" => "<".to_string(),
                        "gt" => ">".to_string(),
                        "quot" => "\"".to_string(),
                        "apos" => "'".to_string(),
                        "nbsp" => " ".to_string(),
                        _ => String::new(),
                    }
                };
                out.push_str(&decoded);
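
The suggestion above is a fragment that depends on the surrounding entity and out variables. A self-contained sketch of the same idea follows, wrapped in a hypothetical decode_entity function that takes the text between & and ;. One addition beyond the suggestion, labeled as an assumption: it also accepts an uppercase X hex prefix, which HTML permits. Unrecognized or invalid entities decode to an empty string, matching the fallback in the diff.

```rust
/// Hypothetical helper: decode the body of one HTML entity (the text between `&` and `;`).
/// Returns an empty string for anything unrecognized, matching the diff's fallback.
fn decode_entity(entity: &str) -> String {
    if let Some(code) = entity.strip_prefix('#') {
        // Numeric reference: `#x..` or `#X..` is hexadecimal, anything else is decimal.
        // Accepting uppercase `X` is an assumption added here, not part of the suggestion.
        let (radix, digits) = match code.strip_prefix(|c: char| c == 'x' || c == 'X') {
            Some(hex) => (16, hex),
            None => (10, code),
        };
        u32::from_str_radix(digits, radix)
            .ok()
            // Rejects values that are not valid Unicode scalar values (e.g. surrogates).
            .and_then(char::from_u32)
            .map(|c| c.to_string())
            .unwrap_or_default()
    } else {
        // Named entities: the same small table as the original diff.
        match entity {
            "amp" => "&",
            "lt" => "<",
            "gt" => ">",
            "quot" => "\"",
            "apos" => "'",
            "nbsp" => " ",
            _ => "",
        }
        .to_string()
    }
}
```

With this shape, the previously special-cased #39 falls out of the decimal branch, and #x27 yields the same apostrophe through the hexadecimal branch.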

@rishiskhare rishiskhare merged commit 528afa0 into main Apr 11, 2026
4 checks passed
@rishiskhare rishiskhare deleted the vk/37cd-markdown-handlin branch April 11, 2026 06:50