ekcaiki

Reputation: 1

Azure Text To Speech (TTS) fails with HTTP 400 Error on Japanese Characters

I’m a student using Python to access the REST API of the Azure Cognitive Services Text to Speech (TTS) service. I’m testing a file created with Azure’s online Audio Content Creation tool that contains the following (a Japanese translation of “I’m excited to try text to speech”):

<speak xmlns="http://www.w3.org/2001/10/synthesis" xmlns:mstts="http://www.w3.org/2001/mstts" xmlns:emo="http://www.w3.org/2009/10/emotionml" version="1.0" xml:lang="en-US"><voice xml:lang="ja-JP" xml:gender="Female" name="ja-JP-NanamiNeural">テキスト読み上げを試すのが楽しみです</voice></speak>

The file works when run in the tool, but fails with an HTTP 400 error when I use the REST API. However, when I substitute the following romanized text for the Japanese characters, the request also works through the REST API: Kisuto yomiage o tamesu no ga tanoshimidesu

Here is a snippet of the code I use:

import requests

def send_text2(token,endpoint,uuid,ssml):
    headers = {
        'Authorization': 'Bearer '  + token,
        'Content-Type': 'application/ssml+xml',
        'X-Microsoft-OutputFormat': 'audio-48khz-96kbitrate-mono-mp3',
        'User-Agent': 'Application for Final',
        'X-ClientTraceId': uuid
    }
    vocalize_request = requests.post(endpoint, headers=headers, data=ssml)
    print(vocalize_request.status_code)
    # Retrieve mp3 data and write to file
    with open(f"vocalized_file-{uuid}.mp3", "wb") as binary_file:
        # Write bytes to file
        binary_file.write(vocalize_request.content)

I have tried this code successfully on the original English, as well as on German and Italian translations, so I'm confident in the endpoints used and the token being sent. I should note that Russian translations also fail with an HTTP 400 error.

I have tried setting every xml:lang attribute to "ja-JP", ensuring the file is encoded as UTF-8, and removing both the mstts and emo namespaces; none of these worked. I also searched online and found two other suggested fixes, neither of which worked: the first, explicitly NOT setting the Content-Length header, and the second, wrapping the translated text in <![CDATA[ ]]> (the second really shouldn’t have had any effect).

Any ideas what may be going on here?  I don’t think it’s a bug in TTS because the only account of something similar occurring seems to have been resolved (although, as I mentioned, not setting the Content-Length does not seem to work).

Thanks.

Upvotes: 0

Views: 285

Answers (1)

kosmos.ebi

Reputation: 457

I suspect the issue is caused by the multi-byte characters in the XML. You can use xml.etree.ElementTree to load the SSML and re-serialize it as UTF-8 before sending it.

  1. Save the Japanese SSML to an XML file (UTF-8 with BOM).
  2. Load the SSML from the XML file as follows:
import xml.etree.ElementTree as ET

tree = ET.parse('test.xml')
ssml = ET.tostring(tree.getroot(), encoding='utf8')
  3. Execute the POST request with the resulting ssml:
send_text2(token,endpoint,uuid,ssml)
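
If the SSML is already held in a Python string, a possible alternative to the file round trip above is to encode it to UTF-8 bytes yourself, so that the HTTP client sends exactly those bytes as the request body. This is only a sketch under the assumption that the 400 error comes from how the request body is encoded; build_ssml_request is a hypothetical helper name, not part of the question's code or of any Azure SDK, and the headers are copied from the question.

```python
def build_ssml_request(ssml_text, token, trace_id):
    """Return (headers, body) with the SSML body encoded as UTF-8 bytes.

    Hypothetical helper for illustration; header values mirror the question.
    """
    headers = {
        "Authorization": "Bearer " + token,
        "Content-Type": "application/ssml+xml",
        "X-Microsoft-OutputFormat": "audio-48khz-96kbitrate-mono-mp3",
        "User-Agent": "Application for Final",
        "X-ClientTraceId": trace_id,
    }
    # Encode explicitly so non-Latin characters (Japanese, Russian, ...)
    # are sent as UTF-8 instead of relying on the client's default
    # handling of str bodies.
    body = ssml_text.encode("utf-8")
    return headers, body
```

The returned pair could then be passed along as requests.post(endpoint, headers=headers, data=body), in place of passing the str directly as in send_text2.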

Upvotes: 0
