Azure Text To Speech (TTS) fails with HTTP 400 Error on Japanese Characters

Question

I’m a student using Python to access the REST API for the Azure Cognitive Services TTS service. I’m testing a file created with Azure’s online Audio Content Creation tool that contains the following text (a Japanese translation of “I’m excited to try text to speech”):

テキスト読み上げを試すのが楽しみです

The file works when run in the tool, but fails with an HTTP 400 error when I use the REST API. However, when I substitute the following romanized text for the Japanese characters, the request succeeds through the REST API: Kisuto yomiage o tamesu no ga tanoshimidesu

Here is a snippet of the code I use:

import requests

def send_text2(token, endpoint, uuid, ssml):
    headers = {
        'Authorization': 'Bearer ' + token,
        'Content-Type': 'application/ssml+xml',
        'X-Microsoft-OutputFormat': 'audio-48khz-96kbitrate-mono-mp3',
        'User-Agent': 'Application for Final',
        'X-ClientTraceId': uuid
    }
    # Send the SSML document to the TTS endpoint
    vocalize_request = requests.post(endpoint, headers=headers, data=ssml)
    print(vocalize_request.status_code)
    # Retrieve the MP3 data and write the bytes to a file
    with open(f"vocalized_file-{uuid}.mp3", "wb") as binary_file:
        binary_file.write(vocalize_request.content)

I have tried this code successfully on the original English, as well as on German and Italian translations, so I'm confident that the endpoint and the token being sent are correct. I should note that Russian translations also fail with an HTTP 400 error.

I have tried setting every xml:lang to “ja-JP”, ensuring the format is UTF-8, and removing both the mstts and emo namespaces, none of which worked. I googled and found two other possible solutions, neither of which worked either: the first, explicitly NOT setting the Content-Length header, and the second, wrapping the translated text in a (the second really shouldn’t have had any effect).
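
English, German and Italian text can typically be written in Latin-1, while Japanese and Russian text cannot, so one thing worth ruling out is how the SSML string is turned into bytes before it is sent. The following is a minimal sketch, not a confirmed diagnosis: `fits_latin1` and `encode_ssml` are hypothetical helper names, not part of the code above or of the Azure API. It only shows that the character count and the UTF-8 byte count differ for Japanese text, and how to produce explicit UTF-8 bytes.

```python
def fits_latin1(text: str) -> bool:
    """Return True if every character in text can be encoded as Latin-1."""
    try:
        text.encode("latin-1")
        return True
    except UnicodeEncodeError:
        return False


def encode_ssml(ssml: str) -> bytes:
    """Encode the SSML as UTF-8 bytes so the HTTP client does not pick an encoding."""
    return ssml.encode("utf-8")


japanese = "テキスト読み上げを試すのが楽しみです"
body = encode_ssml(japanese)

# Characters that fit Latin-1 versus characters that do not
print(fits_latin1("Ich freue mich"), fits_latin1(japanese))

# The number of characters and the number of UTF-8 bytes are not the same
print(len(japanese), len(body))
```

Passing the returned bytes as `data=` to `requests.post` would remove any guessing about the body encoding and its length. Whether that resolves the 400 error here is an assumption to test, not something the post confirms.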

Any ideas what may be going on here? I don’t think it’s a bug in TTS because the only account of something similar occurring seems to have been resolved (although, as I mentioned, not setting the Content-Length does not seem to work).

Thanks.

