I am using the Azure API to get audio for words. This is my code in C#, and it worked fine until I noticed that for some words it returns empty audio.
string subscriptionKey = Environment.GetEnvironmentVariable("STT_API_KEY");
Authentication auth = new Authentication("https://eastus.api.cognitive.microsoft.com/sts/v1.0/issueToken", subscriptionKey);
try
{
    accessToken = await auth.FetchTokenAsync().ConfigureAwait(false);
    //Console.WriteLine("Successfully obtained an access token. \n");
}
catch (Exception ex)
{
    //Console.WriteLine("Failed to obtain an access token.");
    //Console.WriteLine(ex.ToString());
    //Console.WriteLine(ex.Message);
    return null;
}
string host = "https://eastus.tts.speech.microsoft.com/cognitiveservices/v1";
// Create SSML document.
XDocument body = new XDocument(
    new XElement("speak",
        new XAttribute("version", "1.0"),
        new XAttribute(XNamespace.Xml + "lang", "en-US"),
        new XElement("voice",
            new XAttribute(XNamespace.Xml + "lang", "en-US"),
            new XAttribute(XNamespace.Xml + "gender", "Female"),
            new XAttribute("name", "en-US-AvaNeural"), // Short name of the neural voice to use
            text)));
using (HttpClient client = new HttpClient())
{
    using (HttpRequestMessage request = new HttpRequestMessage())
    {
        // Set the HTTP method
        request.Method = HttpMethod.Post;
        // Construct the URI
        request.RequestUri = new Uri(host);
        // Set the content type header
        request.Content = new StringContent(body.ToString(), Encoding.UTF8, "application/ssml+xml");
        // Set additional header, such as Authorization and User-Agent
        request.Headers.Add("Authorization", "Bearer " + accessToken);
        request.Headers.Add("Connection", "Keep-Alive");
        // Update your resource name
        request.Headers.Add("User-Agent", "YOUR_RESOURCE_NAME");
        // Audio output format. See API reference for full list.
        request.Headers.Add("X-Microsoft-OutputFormat", "riff-24khz-16bit-mono-pcm");
        // Create a request
        //Console.WriteLine("Calling the TTS service. Please wait... \n");
        using (HttpResponseMessage response = await client.SendAsync(request).ConfigureAwait(false))
To investigate the issue, I tried calling the same API in two other ways: directly with curl, and from Python with this code:
import requests

# subscription_key is assumed to hold the Speech resource key (defined elsewhere in the script)
endpoint = 'https://eastus.tts.speech.microsoft.com/cognitiveservices/v1'
ssml = """
<speak version='1.0' xml:lang='en-US'>
<voice xml:lang='en-US' xml:gender='Female' name='en-US-AvaNeural'>
black
</voice>
</speak>
"""
headers = {
    'Ocp-Apim-Subscription-Key': subscription_key,
    'Content-Type': 'application/ssml+xml',
    'X-Microsoft-OutputFormat': 'audio-16khz-32kbitrate-mono-mp3'
}
# Note: verify=False disables TLS certificate verification
response = requests.post(endpoint, headers=headers, data=ssml, verify=False)
To my surprise, I was able to get the audio for words that I couldn't get through C#.
I wanted to compare the requests to understand why the results differ between clients. To do this, I opened Wireshark and set up TLS decryption with SSLKEYLOGFILE so that I could see the decrypted frames. When I ran the request from Python, Wireshark showed the HTTP request with all the parameters I had set. However, when I ran the request from C# and filtered by the address, I didn't see any HTTP request at all (only TCP and TLS), and the same happened with curl.
I tried to find the reason why HTTP requests might not be visible, and the answers I saw suggested it was because the traffic is encrypted. However, I saw that the decryption worked when the code was run from Python. Why should there be any difference based on which code the request originates from?
Reputation: 1532
There is currently no support in the .NET Framework for writing the pre-master secret to a file that Wireshark needs to decrypt TLS.
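For contrast, Python's standard `ssl` module does expose such a hook, and recent versions of urllib3 (which `requests` uses) are understood to pick up the `SSLKEYLOGFILE` environment variable through it. That would explain why only the Python capture could be decrypted. A minimal sketch, assuming Python 3.8+ built against OpenSSL 1.1.1 or newer; the function name `make_keylog_context` and the log path are hypothetical:

```python
import os
import ssl
import tempfile


def make_keylog_context(keylog_path):
    """Return a default client TLS context that logs session secrets to keylog_path."""
    context = ssl.create_default_context()
    # keylog_filename exists on SSLContext since Python 3.8 (needs OpenSSL 1.1.1+).
    # Every TLS session created from this context appends its secrets to the file.
    context.keylog_filename = keylog_path
    return context


# Hypothetical location; Wireshark's TLS "(Pre)-Master-Secret log filename"
# preference would have to point at the same file.
keylog_path = os.path.join(tempfile.gettempdir(), "tls_keys.log")
context = make_keylog_context(keylog_path)
```

A client that negotiates TLS through a stack without an equivalent hook never writes those secrets, so Wireshark can only show the TCP and TLS layers for it.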
There are a few options:
Maybe the discussions in this question or this GitHub topic can help you.
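If the goal is only to compare what each client sends, one possible workaround that avoids TLS decryption altogether is to point each client temporarily at a small local HTTP server that records the request. The sketch below is an illustration, not part of any Azure SDK: `EchoHandler`, `start_capture_server`, `send_sample`, and the loopback URL are all hypothetical names, and it assumes the client's host URL can be swapped for `http://127.0.0.1:<port>` during the test.

```python
import threading
import urllib.request
from http.server import BaseHTTPRequestHandler, HTTPServer

captured = []  # one dict per request received by the local server


class EchoHandler(BaseHTTPRequestHandler):
    def do_POST(self):
        # Read exactly the number of body bytes the client announced.
        length = int(self.headers.get("Content-Length", 0))
        body = self.rfile.read(length)
        captured.append({
            "path": self.path,
            # Lower-case header names so different clients compare easily.
            "headers": {k.lower(): v for k, v in self.headers.items()},
            "body": body.decode("utf-8", errors="replace"),
        })
        self.send_response(200)
        self.send_header("Content-Length", "0")
        self.end_headers()

    def log_message(self, format, *args):
        pass  # silence the default per-request logging on stderr


def start_capture_server():
    """Start the recorder on a free loopback port and return the server object."""
    server = HTTPServer(("127.0.0.1", 0), EchoHandler)
    threading.Thread(target=server.serve_forever, daemon=True).start()
    return server


def send_sample(port):
    """Send an SSML request shaped like the one in the question to the recorder."""
    ssml = ("<speak version='1.0' xml:lang='en-US'>"
            "<voice name='en-US-AvaNeural'>black</voice></speak>")
    request = urllib.request.Request(
        f"http://127.0.0.1:{port}/cognitiveservices/v1",
        data=ssml.encode("utf-8"),
        headers={
            "Content-Type": "application/ssml+xml",
            "X-Microsoft-OutputFormat": "riff-24khz-16bit-mono-pcm",
        },
        method="POST",
    )
    # Bypass any configured proxy so the request really goes to loopback.
    opener = urllib.request.build_opener(urllib.request.ProxyHandler({}))
    with opener.open(request) as response:
        return response.status
```

Running the C#, curl, and Python clients against the same local port and then comparing the entries in `captured` shows differences in headers and body side by side. Because the traffic is plain HTTP on loopback, it is safer to use a placeholder key or token while testing.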