Reputation: 703
I have a program that records a 1.5sec audio buffer from the microphone, 24/7. Instead of sending a buffer that has no sound in it to the server, which only adds lag and wastes bandwidth, I want to check whether the buffer has sound in it or not.
The buffer looks like this :
short int *waveIn = new short int[NUMPTS];
where :
const double seconds = 1;
const int sampleRate = 8000;
const int NUMPTS = sampleRate * seconds;
So, I have an array of 8000 short ints, which stores the audio buffer.
Now, from checking with the Visual Studio Debugger, after I've been capturing the microphone audio into the buffer, the buffer looks like this :
waveIn[0] = -125;
waveIn[1] = -780;
waveIn[2] = -1320;
and so on...
Now, using this buffer, I need to detect whether it has captured audio or whether it is just a buffer containing no sound.
After running it a couple of times, I've noticed that when the buffer does have sound in it, the cells contain smaller (more negative) numbers. For example, an array with sound in it will often look like this :
waveIn[0] = -1300;
waveIn[1] = -3200;
waveIn[2] = -2400;
Now, my problem is that sometimes a buffer that contains audio has bigger numbers (closer to 0), even though there's sound in it.
So for example, sometimes the cells can have numbers in the range of -600 ~ -1200 and contain no sound, and sometimes they can have numbers in the range of -600 ~ 1200 and actually contain sound.
So, how can I detect whether an audio buffer has sound in it or not?
I hope I was clear enough...
Thanks!
Edit: I forgot to mention, I'm using the Wave API to handle the audio.
Upvotes: 1
Views: 1008
Reputation: 10415
Assuming you are using WAVE_FORMAT_PCM with 16-bit samples (which your short int buffer suggests), the individual samples can range from -32768 to 32767, with silence being small numbers near 0. To compute the magnitude of the sound you should take the absolute value of a number of samples (positive and negative samples are equally significant), then average them. Looking at only 3 samples is very inadequate (that's only 3/8000 of a second), so pick an interval comparable to a real sound, such as a few tenths of a second. There is no magic magnitude threshold that means sound is present, so a better strategy is to compare the magnitudes of successive intervals, or even a running average, looking for a change from low (near-quiet) to substantially higher (louder). That way you will have a moving threshold based on the background noise level.
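Here is a rough sketch of that idea in C++, not a definitive implementation. The names meanMagnitude, SoundDetector and bufferHasSound are made up for illustration, and the tuning values (a ratio of 3.0, a smoothing factor of 0.05, a minimum floor of 1.0) are assumptions you would need to adjust for your microphone. It also assumes the first interval it sees is background noise.

```cpp
#include <algorithm>
#include <cstddef>
#include <cstdlib>

// Average of the absolute sample values over one interval of 16-bit PCM.
double meanMagnitude(const short* samples, std::size_t count)
{
    if (count == 0) return 0.0;
    double sum = 0.0;
    for (std::size_t i = 0; i < count; ++i)
        sum += std::abs(static_cast<int>(samples[i])); // widen to int so -32768 is safe
    return sum / static_cast<double>(count);
}

// Hypothetical helper: keeps a running estimate of the background level
// and reports intervals that are substantially louder than it.
class SoundDetector
{
public:
    // ratio and smoothing are assumed tuning values, not magic constants.
    explicit SoundDetector(double ratio = 3.0, double smoothing = 0.05)
        : ratio_(ratio), smoothing_(smoothing), noiseFloor_(-1.0) {}

    // Feed the magnitude of one interval; returns true if it looks like sound.
    bool update(double magnitude)
    {
        if (noiseFloor_ < 0.0) {       // first interval: assume it is background
            noiseFloor_ = magnitude;
            return false;
        }
        // Clamp the floor so a perfectly silent start does not make
        // every later non-zero interval count as loud.
        const double floorLevel = std::max(noiseFloor_, 1.0);
        const bool loud = magnitude > floorLevel * ratio_;
        if (!loud)                     // only quiet intervals move the threshold
            noiseFloor_ += smoothing_ * (magnitude - noiseFloor_);
        return loud;
    }

private:
    double ratio_;
    double smoothing_;
    double noiseFloor_;
};

// Splits a buffer into intervals of intervalLen samples and reports
// whether any interval was flagged as sound. Trailing samples that do
// not fill a whole interval are ignored.
bool bufferHasSound(const short* buffer, std::size_t total,
                    std::size_t intervalLen, SoundDetector& detector)
{
    if (intervalLen == 0) return false;
    bool found = false;
    for (std::size_t start = 0; start + intervalLen <= total; start += intervalLen)
        if (detector.update(meanMagnitude(buffer + start, intervalLen)))
            found = true;
    return found;
}
```

With your 8000 Hz buffer you could keep one SoundDetector alive across captures and call bufferHasSound(waveIn, NUMPTS, 1600, detector), where 1600 samples is 0.2 seconds, and only send the buffer to the server when it returns true.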
Upvotes: 2