叶辰哲

Reputation: 33

Detect the voices of multiple people speaking

I want to create an Android application that plays speech through my own TTS (text-to-speech) function while listening to the surrounding sound, and triggers an event when it detects someone else speaking. My idea is to use an FFT to compare the original TTS audio with the voices of external speakers. Is there a method to achieve this?

I have tried the logic below to compare the FFT output generated by JTransforms:

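private fun calculateCosineSimilarity(vector1: DoubleArray, vector2: DoubleArray): Double {
        // Inputs hold interleaved (real, imaginary) pairs, so sizes must match and be even;
        // an odd length would make vector[i + 1] read past the end of the array
        if (vector1.size != vector2.size || vector1.size % 2 != 0) {
            return 0.0
        }

        var dotProduct = 0.0
        var norm1 = 0.0
        var norm2 = 0.0
        var i = 0
        while (i < vector1.size) {
            // Accumulate the real and imaginary parts of one frequency bin per iteration
            dotProduct += vector1[i] * vector2[i] + vector1[i + 1] * vector2[i + 1]
            norm1 += Math.pow(vector1[i], 2.0) + Math.pow(vector1[i + 1], 2.0)
            norm2 += Math.pow(vector2[i], 2.0) + Math.pow(vector2[i + 1], 2.0)
            i += 2
        }
        // Avoid dividing by zero (NaN result) when either input is all zeros, e.g. silence
        if (norm1 == 0.0 || norm2 == 0.0) {
            return 0.0
        }
        return dotProduct / (Math.sqrt(norm1) * Math.sqrt(norm2))
    }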
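
One thing worth noting about the function above: because it multiplies the real and imaginary parts directly, the result depends on phase, and the microphone capture will generally not be phase-aligned with the TTS reference. A possible variant, sketched here only as an illustration, compares per-bin magnitudes instead. It assumes the arrays are plain interleaved pairs [re0, im0, re1, im1, ...]; the packed layout that some real-input FFT routines return would need unpacking first. The names magnitudes and magnitudeCosineSimilarity are hypothetical helpers, not part of JTransforms.

```kotlin
import kotlin.math.hypot
import kotlin.math.sqrt

// Hypothetical helper: turns interleaved (re, im) pairs into per-bin magnitudes.
// Assumes plain interleaving; a trailing unpaired value is ignored.
fun magnitudes(interleaved: DoubleArray): DoubleArray =
    DoubleArray(interleaved.size / 2) { k ->
        hypot(interleaved[2 * k], interleaved[2 * k + 1])
    }

// Cosine similarity of two magnitude spectra.
// Returns 0.0 when the sizes differ or when either spectrum is all zeros.
fun magnitudeCosineSimilarity(spectrum1: DoubleArray, spectrum2: DoubleArray): Double {
    if (spectrum1.size != spectrum2.size) return 0.0
    val m1 = magnitudes(spectrum1)
    val m2 = magnitudes(spectrum2)
    var dot = 0.0
    var norm1 = 0.0
    var norm2 = 0.0
    for (k in m1.indices) {
        dot += m1[k] * m2[k]
        norm1 += m1[k] * m1[k]
        norm2 += m2[k] * m2[k]
    }
    if (norm1 == 0.0 || norm2 == 0.0) return 0.0
    return dot / (sqrt(norm1) * sqrt(norm2))
}

fun main() {
    // Two spectra with identical magnitudes per bin but different phase
    val a = doubleArrayOf(1.0, 0.0, 0.0, 2.0) // bins: 1, 2i
    val b = doubleArrayOf(0.0, 1.0, 2.0, 0.0) // bins: i, 2
    // The magnitude-based score is close to 1.0 here, whereas the
    // phase-sensitive dot product of these two arrays is 0
    println(magnitudeCosineSimilarity(a, b))
}
```

A high score here only says the two spectra have a similar shape. Whether a single threshold on it can separate the TTS output from another speaker is an open question that would need testing on real recordings.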
