How to identify different speakers with the Google Cloud Speech API?

Question

I'm building something like a "brainstorming" tool: a group of people can shout terms into a microphone. The input is transcribed to text (Google Speech-to-Text) and displayed in a word cloud, which groups identical words (or terms). But I can't separate the individual terms correctly. Google only splits the input if there is a long silence between utterances. If two people shout shortly after one another, their different ideas are handled as one single idea. That's not what I want. For example, one person says "dark blue" and another says "dark red", and Google gives me a single output: "dark blue dark red". Any ideas?
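To make the "dark blue" / "dark red" case concrete, here is a minimal sketch of one possible way to split such a transcript. It assumes speaker diarization is enabled in the recognition request, so that each recognized word comes back with a speaker tag; whether those tags are reliable for very short, shouted utterances is untested here. The function name `split_terms_by_speaker` and the `(word, speaker_tag)` tuple format are hypothetical, chosen only for illustration, and the pairs would have to be extracted from the API response first.

```python
from typing import Iterable, List, Optional, Tuple


def split_terms_by_speaker(words: Iterable[Tuple[str, int]]) -> List[str]:
    """Group consecutive words that share a speaker tag into one term.

    `words` is a hypothetical list of (word, speaker_tag) pairs, assumed
    to be taken from the word-level output of a diarized transcript.
    """
    terms: List[str] = []
    current: List[str] = []
    last_tag: Optional[int] = None
    for word, tag in words:
        # A change of speaker closes the term collected so far.
        if current and tag != last_tag:
            terms.append(" ".join(current))
            current = []
        current.append(word)
        last_tag = tag
    # Flush the final term, if any.
    if current:
        terms.append(" ".join(current))
    return terms


# Assumed input for the example from the question.
example = [("dark", 1), ("blue", 1), ("dark", 2), ("red", 2)]
print(split_terms_by_speaker(example))  # ['dark blue', 'dark red']
```

This only splits on a change of speaker. Two terms shouted back to back by the same person would still be merged, so it would probably need to be combined with a check on the pause between words.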

Answers (1)
