Reputation: 586
If I send this small piece of SSML to the speech processor I get two voices
<speak version='1.0' xml:lang='es-ES'>
<voice xml:lang='es-ES' xml:gender='Male' name='Microsoft Server Speech Text to Speech Voice (es-ES, Pablo, Apollo)'>
<p>
<s>Hola </s>
<s xml:lang='en'>Hello</s>
<s>¿Cómo estas?.</s>
</p>
</voice>
</speak>
A man in Spanish and a woman in English. Is this a limitation of the Project Oxford Text to Speech engine? in other words, I would expect the same voice to speak several languages but it looks like this is not the case.
Upvotes: 3
Views: 1795
Reputation: 2973
To quote the SSML spec,
Specifying xml:lang does not imply a change in voice, though this may indeed occur. When a given voice is unable to speak content in the indicated language, a new voice may be selected by the processor.
While the current fallback behavior leaves something to desire, the recommendation is to create multiple voice nodes and pick a voice more explicitly when switching languages.
Upvotes: 1