Gunnar S.

Reputation: 91

Is it possible to filter "interjections"/"humming" in MS Speech-to-Text?

We are experimenting with transcribing video materials using (among others) Microsoft Speech-to-Text, specifically through the C# API. The results we get from Microsoft often contain many "interjections"/"humming" (I am not sure of the correct term), such as "hmm", "uhm", etc., whereas other providers seem to filter these out automatically. In some cases it may be meaningful to include them in the results, but in other settings it would be nice to be able to configure the SpeechRecognizer to exclude them. Is there a way to accomplish this?

Upvotes: 2

Views: 331

Answers (1)

Brian Mouncer

Reputation: 71

Our backend engine has this ability. However, it is not currently publicly documented, and I am not sure how you would send this selection from the client to the service. Right now it is the default setting for some endpoints but not others ("internet search" as opposed to "dictation").

I will have to talk to one of our service engineers to see if it is possible to change this dynamically from the client, and get back to you with a better response.

Thanks,

Brian.

--- Update ---

I talked with one of our service engineers, and the feature is called TrueText formatting. I did some digging through our tests and documentation and it is actually documented publicly here.

https://learn.microsoft.com/en-us/dotnet/api/microsoft.cognitiveservices.speech.propertyid?view=azure-dotnet

https://learn.microsoft.com/en-us/dotnet/api/microsoft.cognitiveservices.speech.speechconfig?view=azure-dotnet

An example of how to set this on the SpeechConfig object would look like this:

        var trueText = "TrueText";
        myDefaultConfig.SetProperty(PropertyId.SpeechServiceResponse_PostProcessingOption, trueText);

The docs currently do not show the other state, which I believe is "Normal" instead of "TrueText". I will try to find time this week to try this out myself, and improve the documentation on this property id.
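
For context, here is a minimal sketch of where that SetProperty call might sit in a complete recognition call. The subscription key, region, and "sample.wav" file name are placeholders I made up for illustration, and I have not verified that this removes every "hmm"/"uhm" from the output, so treat it as a starting point rather than a confirmed recipe.

```csharp
using System;
using System.Threading.Tasks;
using Microsoft.CognitiveServices.Speech;
using Microsoft.CognitiveServices.Speech.Audio;

class Program
{
    static async Task Main()
    {
        // Placeholders (assumptions): supply your own key, region, and audio file.
        var config = SpeechConfig.FromSubscription("YOUR_SUBSCRIPTION_KEY", "YOUR_REGION");

        // Request TrueText post-processing, as in the snippet above.
        config.SetProperty(PropertyId.SpeechServiceResponse_PostProcessingOption, "TrueText");

        using var audio = AudioConfig.FromWavFileInput("sample.wav");
        using var recognizer = new SpeechRecognizer(config, audio);

        // Recognize a single utterance and print the transcribed text.
        var result = await recognizer.RecognizeOnceAsync();
        if (result.Reason == ResultReason.RecognizedSpeech)
        {
            Console.WriteLine(result.Text);
        }
        else
        {
            Console.WriteLine($"Recognition ended with reason: {result.Reason}");
        }
    }
}
```

The property is set on the config before the SpeechRecognizer is constructed, so the recognizer is created with the option already in place.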

Upvotes: 1
