Reputation: 21
I'm trying to implement in-browser speech synthesis for a home automation project for visually impaired people. In my test page I've noticed a lag of roughly 1 second between calling the speak() method and actually hearing the speech.
Just wondering whether this is normal behaviour or if I'm doing something wrong. If anyone can offer advice on how to speed it up (even by half a second or so) I'd really appreciate it :)
[EDIT 1]
Okay, so I've tried my test page in MS Edge (had only been using Chrome) and the lag disappears. I also tried the Web Speech Synthesis Demo in Chrome, with "voice" set to "native" and there was no lag either. Both these tests rendered the text with a UK English voice.
In Chrome, my test page renders the text with an Australian-English voice (I'm in AU), and has a lag before playing.
My gut tells me that Chrome is loading a voice from some remote location instead of using the local system voice, and only for this specific page (i.e. the demo at codepen.io works fine in the same browser). What I don't know is why.
This wouldn't be so much of a problem if it only loaded the voice once rather than every single time it is called (I'm just presuming that's what's happening).
[/EDIT 1]
Here's my code:
<body>
<div class='col col-xs-6'>
<div style='width:100%;'>
<button type='button' class='btn' onmouseover='speak("mouse over");' onmouseout="cancel();">
Test button.
</button>
</div>
</div>
<p id="msg"></p>
<script type="text/javascript">
var globalVolume = 0.8;
var globalRate = 1;
var globalPitch = 0.9;
var enterMsg = "Mouse over";
function speak(text) {
var msg = new SpeechSynthesisUtterance();
msg.text = text;
msg.volume = globalVolume;
msg.rate = globalRate;
msg.pitch = globalPitch;
//msg.voice = "native"; // note: voice expects a SpeechSynthesisVoice object, not a string
window.speechSynthesis.speak(msg);
}
function cancel() {
window.speechSynthesis.cancel();
}
//speak("Hello, world!");
</script>
</body>
Upvotes: 2
Views: 1086
Reputation: 12402
As you theorized in your update, if you choose any of the non-native voices (the ones whose names start with "Google"), the audio is generated on a Google server and then sent to the browser, which causes the delay. Chrome isn't loading a voice into your browser once; every time you use the TTS, it sends the text to the server to generate the sound. So, unfortunately, the network delay will always be there when using anything but the native voices available on your computer. Aside from the delay, there is also the privacy concern of all the speech you generate being sent to Google (and possibly the NSA or anyone else who is spying on their servers). Edge and Firefox use the native voices by default and don't let you choose Google's proprietary ones, which is why they never show the delay.
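As a rough sketch of how you might force a native voice in Chrome: each entry returned by speechSynthesis.getVoices() has a localService flag that is true when the voice is synthesized on the device. The helper names pickLocalVoice and speakLocal below are made up for illustration, and the "en" language prefix is an assumption; adjust to whatever voices your system actually has.

```javascript
// Hypothetical helper: choose a voice that is synthesized on the device.
// `voices` is expected to be the array from speechSynthesis.getVoices().
function pickLocalVoice(voices, lang) {
  // Keep only voices that do not rely on a remote service.
  var local = voices.filter(function (v) { return v.localService; });
  // Prefer a local voice whose language tag starts with the requested prefix.
  var match = local.filter(function (v) {
    return lang && v.lang && v.lang.toLowerCase().indexOf(lang.toLowerCase()) === 0;
  });
  // Fall back to any local voice, or null if there is none.
  return match[0] || local[0] || null;
}

// Browser-only wiring; skipped where speechSynthesis is not available.
if (typeof window !== "undefined" && window.speechSynthesis) {
  var localVoice = null;
  var refreshVoice = function () {
    localVoice = pickLocalVoice(window.speechSynthesis.getVoices(), "en");
  };
  refreshVoice();
  // Chrome may populate the voice list asynchronously, so refresh on change.
  window.speechSynthesis.onvoiceschanged = refreshVoice;

  // Hypothetical replacement for the speak() function in the question.
  window.speakLocal = function (text) {
    var msg = new SpeechSynthesisUtterance(text);
    if (localVoice) { msg.voice = localVoice; }
    window.speechSynthesis.speak(msg);
  };
}
```

If no local voice is found, speakLocal leaves msg.voice unset and the browser uses its default, so the delay may still occur in that case.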
Upvotes: 2