Àlex Torregrosa

Reputation: 33

Special characters get truncated when using the Bing TTS API from Python

I have modified the Python example found at https://github.com/Microsoft/Cognitive-Speech-TTS/tree/master/Samples-Http/Python to synthesize speech in Spanish, changing

"<speak version='1.0' xml:lang='en-us'><voice xml:lang='en-us' xml:gender='Female' name='Microsoft Server Speech Text to Speech Voice (en-US, ZiraRUS)'>

to

"<speak version='1.0' xml:lang='es-ES'><voice xml:lang='es-ES' xml:gender='Male' name='Microsoft Server Speech Text to Speech Voice (es-ES, Pablo, Apollo)'>

but somewhere during synthesis, non-ASCII characters such as 'ñ' are dropped, so they are missing from the final audio file.

I have checked that it's not a Python problem by printing the request string, and the characters appear correctly there.

Upvotes: 0

Views: 876

Answers (1)

cthrash

Reputation: 2973

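If you inspect the HTTP request as it is actually sent, you will see that the string is not encoded as UTF-8: when the body is a str, the http.client library encodes it as ISO-8859-1. Printing the string in Python does not reveal this, because the encoding only happens when the request is sent. The easiest workaround is to encode it yourself and pass the resulting bytes as the request body: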

ssml = "<speak version='1.0' xml:lang='es-ES'>...</speak>"
# Encode explicitly so that http.client sends UTF-8 bytes instead of encoding the str itself
body = ssml.encode('utf-8')
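
To see why the explicit encode matters, here is a minimal sketch that compares the two encodings without making any network call. The SSML text is a shortened, hypothetical example (the voice element is omitted), and it assumes the service decodes the request body as UTF-8, which is what the workaround above implies.

```python
# Hypothetical, shortened SSML containing a non-ASCII character.
ssml = "<speak version='1.0' xml:lang='es-ES'>mañana</speak>"

# What http.client produces when it is handed a str body.
latin1_body = ssml.encode("iso-8859-1")

# What the explicit workaround produces.
utf8_body = ssml.encode("utf-8")

print(latin1_body)
print(utf8_body)

# The ISO-8859-1 bytes are not valid UTF-8, so a UTF-8 decoder
# on the receiving side cannot recover the 'ñ'.
try:
    latin1_body.decode("utf-8")
    latin1_is_valid_utf8 = True
except UnicodeDecodeError:
    latin1_is_valid_utf8 = False

print("ISO-8859-1 bytes valid as UTF-8:", latin1_is_valid_utf8)

# The UTF-8 bytes round-trip back to the original string.
assert utf8_body.decode("utf-8") == ssml
```

Passing utf8_body (bytes) as the body argument of the connection's request() call means http.client sends it unchanged, so no implicit ISO-8859-1 encoding takes place.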

Upvotes: 1
