Harry Stuart

Reputation: 1945

Filter out a certain voice with the Google Speech API

I am creating a voicebot using the Google Speech streaming API and Google Text-to-Speech. I would like only the user's speech to be transcribed, even if the user "interrupts" the voicebot's response. How can I prevent the voicebot from transcribing its own voice?

Filtering out certain voices appears to be possible, judging by my testing of existing voicebots such as Siri on speaker.

Thanks

Upvotes: 3

Views: 120

Answers (1)

Alexander Solovets

Reputation: 2517

While the Google Speech API has no such capability out of the box, you can try some well-known algorithms. Audio waves are additive, so subtracting an audio stream from itself yields zero (silence). With that in mind, and given a separate stream for your voicebot's audio output, one approach is to subtract the voicebot's speech from the user's input speech. If you do not have access to either of the audio streams or cannot separate them, another approach is to apply speaker diarisation to extract the two voice sources from the single stream.

Note that a naive subtraction of the two streams may not achieve the desired effect, because subtraction will also attenuate the audio. Instead, invert the stream to be subtracted and mix it with the stream it is being subtracted from.
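To make the invert-and-mix idea concrete, here is a minimal sketch in plain Python. It is an illustration under strong assumptions, not a production echo canceller: the function names (`invert`, `mix`, `cancel_playback`) are hypothetical, both streams are assumed to share one sample rate, and the delay and gain with which the voicebot's audio reaches the microphone are assumed to be known in advance. In a real room those two values have to be estimated, and reverberation usually calls for an adaptive filter (acoustic echo cancellation) rather than a fixed delay and gain.

```python
import math


def invert(samples):
    """Flip the polarity of every sample (phase inversion)."""
    return [-s for s in samples]


def mix(a, b):
    """Sum two equal-length streams sample by sample."""
    return [x + y for x, y in zip(a, b)]


def cancel_playback(mic, bot, delay=0, gain=1.0):
    """Remove the voicebot's playback from the microphone signal.

    delay: number of samples by which the bot audio lags in the mic
           signal (assumed known here).
    gain:  level at which the bot audio appears in the mic signal
           (assumed known here).
    """
    # Shift and scale the reference so it lines up with what the mic heard.
    aligned = [0.0] * delay + [gain * s for s in bot]
    # Pad or truncate the reference to the length of the mic signal.
    aligned = (aligned + [0.0] * len(mic))[:len(mic)]
    # Invert the reference and mix it into the mic signal.
    return mix(mic, invert(aligned))


# Synthetic demo: the "user" and the "bot" are two sine tones.
rate = 8000
bot = [0.5 * math.sin(2 * math.pi * 440 * n / rate) for n in range(800)]
user = [0.3 * math.sin(2 * math.pi * 200 * n / rate) for n in range(800)]

# The mic hears the user plus a delayed, quieter copy of the bot.
delay, gain = 40, 0.6
echo = ([0.0] * delay + [gain * s for s in bot])[:800]
mic = mix(user, echo)

cleaned = cancel_playback(mic, bot, delay=delay, gain=gain)
residual = max(abs(c - u) for c, u in zip(cleaned, user))
```

With the correct delay and gain, `residual` is limited to floating-point rounding error, so `cleaned` is effectively the user's speech alone and can be sent to the recogniser. If either value is off, part of the voicebot's audio remains in the signal.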

Upvotes: 2
