Reputation: 1
I'm having trouble transcribing email using Google Speech REST API. The best I can get is most of the email address, however Google Speech ignores "dot" and "dot com". For example [email protected] returns "First Last at gmail". If I say "period" instead of "dot" I at least get "First. Last at gmail." I'm using the following:
{
"config": {
"encoding": "MULAW",
"sampleRateHertz": 8000,
"languageCode": "en-US",
"maxAlternatives": 0,
"profanityFilter": true,
"enableWordTimeOffsets": false,
"model": "phone_call",
"useEnhanced": true
},
"audio": {
"content":"&&NameBase64&&"
}
}
I've tried add "dot" as a speech context with no changes. ".", ".com", "com", and "kom" also didn't change the results.
{
"config": {
"encoding": "MULAW",
"sampleRateHertz": 8000,
"languageCode": "en-US",
"maxAlternatives": 1,
"profanityFilter": true,
"enableWordTimeOffsets": false,
"model": "phone_call",
"useEnhanced": true,
"speechContexts": [{
"phrases": ["dot"],
}],
},
"audio": {
"content":"Base64Recording"
}
}
I've tried adding alphanumberic speech contexts and spelling it out but the results were pretty bad.
Any thoughts on how I can get "." or "dot" and "com" to show up in the transcription would be greatly appreciated.
Upvotes: 0
Views: 276
Reputation: 11
Have you tried providing a boost value for the phrase? I'm facing the same issue and I noticed that increasing the boost value helped in identifying the word "dot". Boost values are usually between 0 and 20, but applying anything above 10 helped in recognizing the "dot".
Here's an example:-
"config": {
"encoding": "MULAW",
"sampleRateHertz": 8000,
"languageCode": "en-US",
"maxAlternatives": 1,
"profanityFilter": true,
"enableWordTimeOffsets": false,
"model": "phone_call",
"useEnhanced": true,
"speechContexts": [{
"phrases": ["dot"],
"boots": 15.0
}],
},
"audio": {
"content":"Base64Recording"
}
}
You can also have multiple key value pairs in the context, each with different boost values. For example, this is what I use to detect email addresses:-
[{
phrases: ["$OOV_CLASS_ALPHANUMERIC_SEQUENCE"],
boost: 14.0
},
{
phrases: ["gmail.com","yahoo.com","aol.com","outlook.com"],
boost:5.0
},
{
phrases: ["com",".","c o m",".com","dotcom","dot com","dot","at","at the rate","@"],
boost: 10.0
},
{
phrases: ["org","io","dot org","dot io","gov","dot gov","net","dot net","co","dot co"],
boost:8.0
},
{
phrases: ["$OOV_CLASS_DIGIT_SEQUENCE","8","naught","z","zed","zee","zz","d","aa","ae","ee","oo","ii","ay","eh","ahh","ah","ze","dee",
"1","2","3","4","5","6","7","8","9","0","zero",],
boost: -20.0
}
]
Notice, the phrases with negative boost values will help weed out words that are often misunderstood.
Upvotes: 1