Sound Glitches During Realtime Synthesis – Audio Stops and Starts Abruptly

Hello Community,


When using the CoquiEngine in RealtimeTTS for realtime text-to-speech, the synthesized audio is choppy. The sound intermittently stops between words or segments, resulting in a disruptive playback experience.

Steps to Reproduce:

1. Use a realtime TTS script (see below) that feeds text either in small chunks or via a generator.
2. Run the script on a system with GPU support (e.g., RTX 4060 with CUDA enabled).
3. Observe that during playback, the audio “comes and goes” with noticeable gaps between words or sentences.

Expected Behavior:
The synthesized speech should be continuous and smooth, without abrupt pauses or intermittent glitches.

Actual Behavior:
The playback is discontinuous: audio frequently stops and then resumes, causing a choppy, glitchy experience.

This is the code I am using:
```python
import os
import time
import torch
from RealtimeTTS import TextToAudioStream, CoquiEngine

def realtime_text_generator():
    texts = [
        "Hello, this is real-time TTS speaking. ",
        "Every sentence is synthesized as soon as it is ready. ",
        "The voice is generated using a local, neural cloned model. "
    ]
    for text in texts:
        yield text
        time.sleep(0.1)  # simulate continuous input with a short delay

if __name__ == "__main__":
    device = "cuda" if torch.cuda.is_available() else "cpu"
    print(f"Using device: {device}")

    # Optionally, specify custom model parameters via environment variables.
    custom_model_path = os.getenv("CUSTOM_COQUI_MODEL_PATH", None)
    custom_model_name = os.getenv("CUSTOM_COQUI_MODEL_NAME", None)

    if custom_model_path:
        print(f"Using custom model from: {custom_model_path}")
        engine = CoquiEngine(
            local_models_path=custom_model_path,
            specific_model=custom_model_name,
            full_sentences=True
        )
    else:
        print("Using default model settings.")
        engine = CoquiEngine()

    stream = TextToAudioStream(engine)
    print("Starting realtime TTS streaming...")
    stream.feed(realtime_text_generator()).play(log_synthesized_text=True)

    while stream.is_playing():
        time.sleep(0.05)

    print("Playback finished.")
    engine.shutdown()
```
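
To put numbers on the gaps, here is a minimal sketch of how the dropouts could be measured. It is not part of RealtimeTTS: `Chunk`, `find_underruns`, and `chunk_duration` are hypothetical helpers, and the 24 kHz, 16-bit mono format is an assumption about the engine output that should be checked against the real stream. Given the time each audio chunk arrived and how much audio it holds, it reports every stretch where the player would have run out of buffered audio.

```python
from dataclasses import dataclass


@dataclass
class Chunk:
    arrival: float   # seconds (e.g. from time.monotonic()) when the chunk was delivered
    duration: float  # seconds of audio contained in the chunk


def chunk_duration(num_bytes, sample_rate=24000, channels=1, bytes_per_sample=2):
    """Convert a raw PCM chunk size to seconds of audio.

    The defaults (24 kHz, mono, 16-bit) are assumptions, not confirmed values.
    """
    return num_bytes / (sample_rate * channels * bytes_per_sample)


def find_underruns(chunks, tolerance=0.005):
    """Return (start, length) for each stretch where playback would run dry."""
    gaps = []
    playhead = None  # time at which the audio buffered so far runs out
    for chunk in chunks:
        if playhead is None:
            # Playback starts when the first chunk arrives.
            playhead = chunk.arrival
        elif chunk.arrival > playhead + tolerance:
            # The next chunk arrived after the buffer was already empty.
            gaps.append((playhead, chunk.arrival - playhead))
            playhead = chunk.arrival
        playhead += chunk.duration
    return gaps


if __name__ == "__main__":
    # Made-up timings for illustration only, not measurements from my system.
    example = [Chunk(0.0, 0.5), Chunk(0.4, 0.5), Chunk(1.5, 0.5)]
    for start, length in find_underruns(example):
        print(f"underrun at {start:.2f}s lasting {length:.2f}s")
```

The arrival times would need to be recorded from whatever per-chunk hook the installed RealtimeTTS version exposes (for example an `on_audio_chunk` callback on `play()`, if it is available). If the reported underruns line up with sentence boundaries, synthesis is probably falling behind playback rather than the audio device misbehaving.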

Could someone please help me solve this issue?

