IgnacioHR

Reputation: 586

Mixing languages in the same SSML

If I send this small piece of SSML to the speech processor, I get two voices:

<speak version='1.0' xml:lang='es-ES'>
  <voice xml:lang='es-ES' xml:gender='Male' name='Microsoft Server Speech Text to Speech Voice (es-ES, Pablo, Apollo)'>
    <p>
        <s>Hola </s>
        <s xml:lang='en'>Hello</s>
        <s>¿Cómo estas?.</s>
    </p>
  </voice>
</speak>

The result is a man speaking Spanish and a woman speaking English. Is this a limitation of the Project Oxford Text to Speech engine? In other words, I would expect the same voice to speak several languages, but it looks like this is not the case.

Upvotes: 3

Views: 1795

Answers (1)

cthrash

Reputation: 2973

To quote the SSML spec:

Specifying xml:lang does not imply a change in voice, though this may indeed occur. When a given voice is unable to speak content in the indicated language, a new voice may be selected by the processor.

The current fallback behavior leaves something to be desired, so the recommendation is to create multiple voice nodes and pick a voice explicitly each time the language switches.
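
As a rough sketch of that recommendation, the SSML from the question could be split into one voice node per language run. The Spanish voice is the one from the question; the English voice name below is an assumption and should be replaced with whichever male en-US voice the service actually lists. This will not be the same speaker in both languages, only two explicitly chosen voices of the same gender.

```xml
<speak version='1.0' xml:lang='es-ES'>
  <!-- Spanish voice taken from the question -->
  <voice xml:lang='es-ES' xml:gender='Male' name='Microsoft Server Speech Text to Speech Voice (es-ES, Pablo, Apollo)'>
    <s>Hola </s>
  </voice>
  <!-- English voice: the name is an assumption, check the service's supported voice list -->
  <voice xml:lang='en-US' xml:gender='Male' name='Microsoft Server Speech Text to Speech Voice (en-US, BenjaminRUS)'>
    <s>Hello</s>
  </voice>
  <!-- Switch back to the Spanish voice explicitly -->
  <voice xml:lang='es-ES' xml:gender='Male' name='Microsoft Server Speech Text to Speech Voice (es-ES, Pablo, Apollo)'>
    <s>¿Cómo estas?.</s>
  </voice>
</speak>
```

Naming the voice for every language run means the processor never has to fall back to a voice of its own choosing.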

Upvotes: 1
