How to lengthen the pause between words with text-to-speech (pyTTS or SAPI5)

Berry Tsakala · Nov 13, 2010 · Viewed 11.8k times

Is it possible to extend the gap between spoken words when using text-to-speech with SAPI5?

The problem is that, especially with some voices, the words almost run together, which makes the speech harder to understand.

I'm using Python and the pyTTS module (on Windows, since it uses SAPI).

I tried hooking the OnWord event and adding a time.sleep() or tts.Pause(), but although all the events are caught, they are apparently processed only at the end of the spoken text, whether I use the sync or the async flag.

In this NON WORKING example, the sleep() method is executed only after the sentence is spoken:

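import pyTTS
from time import sleep

tts = pyTTS.Create()

def f(x):
    # intended: pause briefly at every word boundary
    tts.Pause()
    sleep(0.5)
    tts.Resume()

tts.OnWord = f
tts.Speak(text)  # text holds the sentence to speak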

Edit: -- accepted solutions

The actual answers for me were either

  • saying each word in its own "speak" command (suggested by @Lennart Regebro), or
  • replacing each space with a comma (as mentioned by @Dawson), e.g.

    text = text.replace(" ", ",")

This sets a reasonable pause. I didn't investigate the Pause method any further than described above, since I'm happy with the accepted solutions.
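A slightly more careful variant of the comma trick could look like the sketch below. The helper name add_word_pauses and its pause_mark parameter are made up for illustration and are not part of pyTTS or SAPI. Unlike the plain replace(), it keeps the space after each comma and skips words that already end in punctuation; whether a given voice pauses the same way for "," and ", " is an assumption worth checking by ear.

```python
def add_word_pauses(text, pause_mark=","):
    """Return text with a pause mark appended to each word that lacks punctuation.

    Hypothetical helper: the engine is expected to pause at each mark.
    """
    words = text.split()  # split() also collapses runs of whitespace
    out = []
    for i, word in enumerate(words):
        is_last = i == len(words) - 1
        # Leave the final word alone, and any word that already ends
        # in punctuation, so no doubled marks such as ",," appear.
        if is_last or word[-1] in ".,;:!?":
            out.append(word)
        else:
            out.append(word + pause_mark)
    return " ".join(out)
```

The result can then be passed to the speak call in place of the original text, e.g. tts.Speak(add_word_pauses(text)).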

Answer

Lennart Regebro · Feb 2, 2011

I don't have any great solutions here. But:

PyTTS's last release was in 2007, and there seems to be no documentation. The same people now maintain a cross-platform library called pyttsx, which also supports SAPI. It has a words-per-minute setting, but no setting to increase the pause between words. This is most likely because there is no pause between the words at all.

You can insert a long pause by making each word its own "utterance".

engine.say('The')
engine.say('quick')
engine.say('brown')
engine.say('fox.')

instead of

engine.say('The quick brown fox.')

But that pause is probably too long. Other than that, you probably have to wrap or subclass the SAPI driver, but I'm not 100% sure that's going to work either. People don't have pauses between words, so I'm not sure that the speech engines themselves support it.
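For longer text, the one-word-per-utterance approach could be wrapped in a small loop rather than written out by hand. This is only a sketch: say_word_by_word is a made-up helper name, and it assumes an engine object with the say() and runAndWait() methods that pyttsx engines provide.

```python
def say_word_by_word(engine, text):
    """Queue each word of text as its own utterance, then speak them all.

    Hypothetical helper; engine is assumed to offer say() and runAndWait().
    """
    words = text.split()
    for word in words:
        engine.say(word)   # one utterance per word
    engine.runAndWait()    # blocks until every queued utterance is spoken
    return words

# Usage, assuming pyttsx is installed:
#   import pyttsx
#   engine = pyttsx.init()
#   say_word_by_word(engine, 'The quick brown fox.')
```

The length of the gap between utterances is decided by the engine and driver, so this gives no direct control over it.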