Reputation: 33
I have modified the Python example found at https://github.com/Microsoft/Cognitive-Speech-TTS/tree/master/Samples-Http/Python to synthesize speech in Spanish by changing
"<speak version='1.0' xml:lang='en-us'><voice xml:lang='en-us' xml:gender='Female' name='Microsoft Server Speech Text to Speech Voice (en-US, ZiraRUS)'>
to
"<speak version='1.0' xml:lang='es-ES'><voice xml:lang='es-ES' xml:gender='Male' name='Microsoft Server Speech Text to Speech Voice (es-ES, Pablo, Apollo)'>
but at some step of the synthesis process non-ASCII characters like 'ñ' get dropped, so they are missing from the final audio file.
I have checked that it's not a Python string problem by printing the request string: the characters appear correctly there.
Upvotes: 0
Views: 876
Reputation: 2973
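If you look at the HTTP request, you will see that the string is not encoded the way the service needs: when the body is a str, http.client encodes it as ISO-8859-1 (the HTTP default) rather than UTF-8, so 'ñ' goes out as a byte sequence that is not valid UTF-8. The easiest workaround is to encode the string yourself and pass the resulting bytes as the request body: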
ssml = "<speak version='1.0' xml:lang='es-ES'>...</speak>"
body = ssml.encode('utf-8')  # bytes are sent as-is; http.client does not re-encode them
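To see why the explicit encoding matters, here is a minimal sketch that compares the two encodings without making a network call. The Spanish sentence and the helper `survives_utf8_decoding` are made up for illustration and are not part of the Microsoft sample. The ISO-8859-1 line mimics what the Python documentation says http.client does with a str body.

```python
# Example SSML payload; the sentence inside <voice> is an assumption for illustration.
ssml = (
    "<speak version='1.0' xml:lang='es-ES'>"
    "<voice xml:lang='es-ES' xml:gender='Male' "
    "name='Microsoft Server Speech Text to Speech Voice (es-ES, Pablo, Apollo)'>"
    "El niño come piña."
    "</voice></speak>"
)

# What http.client does when it is handed a str body: ISO-8859-1.
latin1_body = ssml.encode("iso-8859-1")

# The workaround from the answer: encode explicitly as UTF-8.
utf8_body = ssml.encode("utf-8")


def survives_utf8_decoding(body: bytes) -> bool:
    """Hypothetical helper: True if a UTF-8 reader recovers the original text."""
    try:
        return body.decode("utf-8") == ssml
    except UnicodeDecodeError:
        return False


print("ISO-8859-1 body readable as UTF-8:", survives_utf8_decoding(latin1_body))
print("UTF-8 body readable as UTF-8:     ", survives_utf8_decoding(utf8_body))
print("'ñ' in ISO-8859-1:", "ñ".encode("iso-8859-1"))
print("'ñ' in UTF-8:     ", "ñ".encode("utf-8"))
```

In the sample script, the bytes object (`utf8_body` here) would then be passed as the body argument of `conn.request(...)` in place of the str.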
Upvotes: 1