mani bharataraju

Reputation: 162

processing human voice

I am trying to make an Android app that checks whether a person's recorded voice is high-frequency or not. I have completed the recording part but don't know how to proceed. After searching, I found that an FFT algorithm should be used, but my problem is how to get the array of values to pass as input to the algorithm. Can anyone help, please?

Upvotes: 0

Views: 393

Answers (2)

Fredrik

Reputation: 2317

Assuming you have defined what is meant by "contains high frequency", and you merely need a measure of this (no need to visualize the frequency content in a graph), there is really no need to calculate the FFT.

I would calculate the RMS values of the signal (a measure of the total energy), then apply a low-pass filter on the data (in the time domain) and calculate the RMS values again on the filtered signal. Comparing the loss of energy is your measure of how much high frequency content was responsible for your initial energy value.

REPLY TO COMMENT:

You need data in order to process it! Perhaps I don't understand your question: what do you wish to "get exact values of"? You have stated that you "completed the recording part", so I assume you have the signal stored in memory. Now you need to calculate the total energy of the signal in order to either A) calculate the change in energy after filtering or B) compare the energy to some predefined hardcoded value (a bad idea, by the way).

Either way, this should be done in the time domain if all you want is a measure/value. By Parseval's theorem, the energy of a signal is the same whether it is computed in the time domain or in the frequency domain, so there is no need to perform CPU-intensive processing and go over to the frequency domain to calculate it. http://en.wikipedia.org/wiki/Parseval's_theorem
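For reference, the discrete form of the theorem can be written as below. This is a sketch that assumes an N-sample signal x[n] with DFT X[k] under the common unnormalized DFT convention; other conventions move the 1/N factor around.

```latex
\sum_{n=0}^{N-1} \lvert x[n] \rvert^{2} \;=\; \frac{1}{N} \sum_{k=0}^{N-1} \lvert X[k] \rvert^{2}
```

The left-hand side is the sum of squared samples, which is exactly what a time-domain energy or RMS calculation is built on, so no transform is needed to obtain it.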

ELABORATION:

When you record the user's voice (collect data for your signal), you need to ensure the data is not lost, that it is properly stored in memory (in some array-type object), and that you have a reference to this array. Once the data is collected, you don't need to convert your signal into values; it is already stored as a sequence of values. Therefore, you are now ready to perform some calculation in order to get a measure of "how much high frequency content there is"...

The RMS (root mean square) value is a standardized way of measuring the total energy of a signal - you take the "square-root of the average of all values squared". See http://mathworld.wolfram.com/Root-Mean-Square.html
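As a minimal sketch of that definition in Java, assuming the recording is already available as an array of 16-bit PCM samples (the class and method names here are hypothetical, not part of any Android API):

```java
public class RmsExample {
    /** Root mean square of the samples: sqrt(mean(x^2)). */
    public static double rms(short[] samples) {
        if (samples.length == 0) {
            return 0.0; // assumption: treat an empty recording as silence
        }
        double sumOfSquares = 0.0;
        for (short s : samples) {
            // Promote to double before squaring to avoid integer overflow.
            sumOfSquares += (double) s * s;
        }
        return Math.sqrt(sumOfSquares / samples.length);
    }

    public static void main(String[] args) {
        short[] samples = {1200, -800, 950, -1100}; // made-up sample values
        System.out.println("RMS = " + rms(samples));
    }
}
```

The result is in the same units as the samples, so for 16-bit PCM it lies between 0 and 32768.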

The RMS is quick and easy to calculate, but it gives you the energy of the total signal, low-frequency and high-frequency components together, and there is no way of knowing whether a high RMS value is due to a lot of high-frequency components or a lot of low-frequency components. Therefore, I suggest you remove the high-frequency components and calculate the RMS value again to see how much the total energy changed in doing so, i.e. how much the high frequencies were responsible for the initial "raw" RMS value. Dividing the two values gives you your high-frequency ratio measure... I'm not sure this is what you want to do, but it's what I would do.

In order to perform low-pass filtering, you need to pick a frequency value Fcut and treat anything above it as "high". Then apply a low-pass filter with the cutoff point set to Fcut. Applying a filter is done in the time domain by means of convolution.
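The filtering and comparison steps could be sketched roughly as follows. The answer does not name a filter design, so several things here are assumptions: the windowed-sinc FIR kernel with a Hamming window, the 101-tap length, and all class and method names. Samples are assumed to be doubles at a known sample rate.

```java
public class HighFrequencyRatio {
    /** Windowed-sinc low-pass kernel (Hamming window), scaled to unit gain at 0 Hz. */
    public static double[] lowPassKernel(double cutoffHz, double sampleRateHz, int taps) {
        double fc = cutoffHz / sampleRateHz; // cutoff in cycles per sample
        double[] h = new double[taps];
        int mid = (taps - 1) / 2;            // assumes an odd number of taps, at least 3
        double sum = 0.0;
        for (int i = 0; i < taps; i++) {
            int n = i - mid;
            double sinc = (n == 0) ? 2.0 * fc
                                   : Math.sin(2.0 * Math.PI * fc * n) / (Math.PI * n);
            double window = 0.54 - 0.46 * Math.cos(2.0 * Math.PI * i / (taps - 1));
            h[i] = sinc * window;
            sum += h[i];
        }
        for (int i = 0; i < taps; i++) {
            h[i] /= sum;
        }
        return h;
    }

    /** Time-domain convolution, centered, same length as x, zeros assumed beyond the edges. */
    public static double[] convolve(double[] x, double[] h) {
        double[] y = new double[x.length];
        int mid = (h.length - 1) / 2;
        for (int n = 0; n < x.length; n++) {
            double acc = 0.0;
            for (int k = 0; k < h.length; k++) {
                int idx = n + mid - k;
                if (idx >= 0 && idx < x.length) {
                    acc += h[k] * x[idx];
                }
            }
            y[n] = acc;
        }
        return y;
    }

    public static double rms(double[] x) {
        if (x.length == 0) {
            return 0.0;
        }
        double sumOfSquares = 0.0;
        for (double v : x) {
            sumOfSquares += v * v;
        }
        return Math.sqrt(sumOfSquares / x.length);
    }

    /** RMS after low-pass filtering divided by raw RMS; close to 1 means little energy above Fcut. */
    public static double lowPassedRmsRatio(double[] x, double cutoffHz, double sampleRateHz) {
        double raw = rms(x);
        if (raw == 0.0) {
            return 1.0; // assumption: silence has nothing to remove
        }
        double[] filtered = convolve(x, lowPassKernel(cutoffHz, sampleRateHz, 101));
        return rms(filtered) / raw;
    }

    public static void main(String[] args) {
        double sampleRate = 8000.0; // hypothetical recording rate
        double[] tone = new double[8000];
        for (int i = 0; i < tone.length; i++) {
            tone[i] = Math.sin(2.0 * Math.PI * 3000.0 * i / sampleRate); // 3 kHz test tone
        }
        System.out.println("ratio = " + lowPassedRmsRatio(tone, 1000.0, sampleRate));
    }
}
```

A ratio close to 1 suggests most of the energy sits below Fcut, while a ratio close to 0 suggests it sits above. What threshold counts as "high frequency" is still up to you.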

Upvotes: 2

Alexander

Reputation: 48232

Usually the AudioRecord class is used for this. It delivers raw PCM data, on which you can then do your calculations.
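If the raw PCM ends up in a byte array, it has to be turned into sample values before any calculation. The sketch below assumes 16-bit, mono, little-endian PCM (typical on Android devices, but an assumption here), and the class name is hypothetical. It uses only plain Java, so it makes no claims about how the recording itself is set up.

```java
import java.nio.ByteBuffer;
import java.nio.ByteOrder;

public class PcmDecoder {
    /** Converts 16-bit little-endian PCM bytes into signed sample values. */
    public static short[] toSamples(byte[] pcm) {
        // Two bytes per sample; a trailing odd byte, if any, is ignored.
        short[] samples = new short[pcm.length / 2];
        ByteBuffer.wrap(pcm)
                  .order(ByteOrder.LITTLE_ENDIAN)
                  .asShortBuffer()
                  .get(samples);
        return samples;
    }

    public static void main(String[] args) {
        byte[] pcm = {0x10, 0x00, (byte) 0xF0, (byte) 0xFF}; // made-up bytes
        for (short s : toSamples(pcm)) {
            System.out.println(s);
        }
    }
}
```

AudioRecord also has a read(short[], int, int) overload that fills a short array directly, which avoids this conversion when you read from the recorder yourself.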

Upvotes: 0
