Franken

Reputation: 439

How to identify different speakers with the Google Cloud Speech API?

I'm building something like a "brainstorming" tool: a group of people can shout terms into a microphone. The input is transcribed to text (Google Speech-to-Text) and displayed in a word cloud, which groups identical words (or terms). But I can't separate the individual terms correctly. Google only splits the input if there is a long silence between utterances. If two people shout shortly after one another, their different ideas are handled as one single idea. That's not what I want. Any ideas? E.g. one person says "dark blue" and another says "dark red". Google gives me a single output: "dark blue dark red".

Upvotes: 1

Views: 304

Answers (1)

Nikolay Shmyrev

Reputation: 25220

They have an experimental speaker diarization feature, but it does not work very reliably. Speaker separation is also supported by other toolkits and APIs.
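If diarization works well enough for your audio, each recognized word comes back with a speaker tag, and you can split the transcript wherever the tag changes. Below is a minimal sketch of that splitting step only, not of the API call itself. The function name `split_by_speaker` and the `(word, speaker_tag)` pair format are assumptions made for illustration; you would have to build those pairs yourself from whatever word-level output your diarization service returns.

```python
from itertools import groupby


def split_by_speaker(tagged_words):
    """Join consecutive words that share a speaker tag into one term.

    tagged_words: iterable of (word, speaker_tag) pairs in spoken order.
    This input format is an assumption for illustration, not an API type.
    """
    return [
        (speaker, " ".join(word for word, _ in group))
        for speaker, group in groupby(tagged_words, key=lambda pair: pair[1])
    ]


# Hypothetical diarized output for the example from the question.
words = [("dark", 1), ("blue", 1), ("dark", 2), ("red", 2)]
terms = split_by_speaker(words)
print(terms)  # [(1, 'dark blue'), (2, 'dark red')]
```

Note that this only separates terms when the speaker changes. Two ideas shouted back to back by the same person would still be merged, so you may want to combine it with a pause-based split on the word timestamps.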

Upvotes: 1
