Reputation: 60001
I'm tasked with building a .NET client app to detect silence in a WAV files.
Is this possible with the built-in Windows APIs? Or alternately, any good libraries out there to help with this?
Upvotes: 33
Views: 35310
Reputation: 785
Here a nice variant to detect threshold alternatings:
static class AudioFileReaderExt
{
private static bool IsSilence(float amplitude, sbyte threshold)
{
double dB = 20 * Math.Log10(Math.Abs(amplitude));
return dB < threshold;
}
private static bool IsBeep(float amplitude, sbyte threshold)
{
double dB = 20 * Math.Log10(Math.Abs(amplitude));
return dB > threshold;
}
public static double GetBeepDuration(this AudioFileReader reader,
double StartPosition, sbyte silenceThreshold = -40)
{
int counter = 0;
bool eof = false;
int initial = (int)(StartPosition * reader.WaveFormat.Channels * reader.WaveFormat.SampleRate / 1000);
if (initial > reader.Length) return -1;
reader.Position = initial;
var buffer = new float[reader.WaveFormat.SampleRate * 4];
while (!eof)
{
int samplesRead = reader.Read(buffer, 0, buffer.Length);
if (samplesRead == 0)
eof = true;
for (int n = initial; n < samplesRead; n++)
{
if (IsBeep(buffer[n], silenceThreshold))
{
counter++;
}
else
{
eof=true; break;
}
}
}
double silenceSamples = (double)counter / reader.WaveFormat.Channels;
double silenceDuration = (silenceSamples / reader.WaveFormat.SampleRate) * 1000;
return TimeSpan.FromMilliseconds(silenceDuration).TotalMilliseconds;
}
public static double GetSilenceDuration(this AudioFileReader reader,
double StartPosition, sbyte silenceThreshold = -40)
{
int counter = 0;
bool eof = false;
int initial = (int)(StartPosition * reader.WaveFormat.Channels * reader.WaveFormat.SampleRate / 1000);
if (initial > reader.Length) return -1;
reader.Position = initial;
var buffer = new float[reader.WaveFormat.SampleRate * 4];
while (!eof)
{
int samplesRead = reader.Read(buffer, 0, buffer.Length);
if (samplesRead == 0)
eof=true;
for (int n = initial; n < samplesRead; n++)
{
if (IsSilence(buffer[n], silenceThreshold))
{
counter++;
}
else
{
eof=true; break;
}
}
}
double silenceSamples = (double)counter / reader.WaveFormat.Channels;
double silenceDuration = (silenceSamples / reader.WaveFormat.SampleRate) * 1000;
return TimeSpan.FromMilliseconds(silenceDuration).TotalMilliseconds;
}
}
Main usage:
using (AudioFileReader reader = new AudioFileReader("test.wav"))
{
double duratioff = 1;
double duration = 1;
double position = 1;
while (duratioff >-1 && duration >-1)
{
duration = reader.GetBeepDuration(position);
Console.WriteLine(duration);
position = position + duration;
duratioff = reader.GetSilenceDuration(position);
Console.WriteLine(-duratioff);
position = position + duratioff;
}
}
Upvotes: 6
Reputation: 19641
I'm using NAudio, and I wanted to detect the silence in audio files so I can either report or truncate.
After a lot of research, I came up with this basic implementation. So, I wrote an extension method for the AudioFileReader
class which returns the silence duration at the start/end of the file, or starting from a specific position.
Here:
static class AudioFileReaderExt
{
public enum SilenceLocation { Start, End }
private static bool IsSilence(float amplitude, sbyte threshold)
{
double dB = 20 * Math.Log10(Math.Abs(amplitude));
return dB < threshold;
}
public static TimeSpan GetSilenceDuration(this AudioFileReader reader,
SilenceLocation location,
sbyte silenceThreshold = -40)
{
int counter = 0;
bool volumeFound = false;
bool eof = false;
long oldPosition = reader.Position;
var buffer = new float[reader.WaveFormat.SampleRate * 4];
while (!volumeFound && !eof)
{
int samplesRead = reader.Read(buffer, 0, buffer.Length);
if (samplesRead == 0)
eof = true;
for (int n = 0; n < samplesRead; n++)
{
if (IsSilence(buffer[n], silenceThreshold))
{
counter++;
}
else
{
if (location == SilenceLocation.Start)
{
volumeFound = true;
break;
}
else if (location == SilenceLocation.End)
{
counter = 0;
}
}
}
}
// reset position
reader.Position = oldPosition;
double silenceSamples = (double)counter / reader.WaveFormat.Channels;
double silenceDuration = (silenceSamples / reader.WaveFormat.SampleRate) * 1000;
return TimeSpan.FromMilliseconds(silenceDuration);
}
}
This will accept almost any audio file format not just WAV.
Usage:
using (AudioFileReader reader = new AudioFileReader(filePath))
{
TimeSpan duration = reader.GetSilenceDuration(AudioFileReaderExt.SilenceLocation.Start);
Console.WriteLine(duration.TotalMilliseconds);
}
References:
Upvotes: 11
Reputation: 3982
Audio analysis is a difficult thing requiring a lot of complex math (think Fourier Transforms). The question you have to ask is "what is silence". If the audio that you are trying to edit is captured from an analog source, the chances are that there isn't any silence... they will only be areas of soft noise (line hum, ambient background noise, etc).
All that said, an algorithm that should work would be to determine a minimum volume (amplitude) threshold and duration (say, <10dbA for more than 2 seconds) and then simply do a volume analysis of the waveform looking for areas that meet this criteria (with perhaps some filters for millisecond spikes). I've never written this in C#, but this CodeProject article looks interesting; it describes C# code to draw a waveform... that is the same kind of code which could be used to do other amplitude analysis.
Upvotes: 14
Reputation: 492
See code below from Detecting audio silence in WAV files using C#
private static void SkipSilent(string fileName, short silentLevel)
{
WaveReader wr = new WaveReader(File.OpenRead(fileName));
IntPtr format = wr.ReadFormat();
WaveWriter ww = new WaveWriter(File.Create(fileName + ".wav"),
AudioCompressionManager.FormatBytes(format));
int i = 0;
while (true)
{
byte[] data = wr.ReadData(i, 1);
if (data.Length == 0)
{
break;
}
if (!AudioCompressionManager.CheckSilent(format, data, silentLevel))
{
ww.WriteData(data);
}
}
ww.Close();
wr.Close();
}
Upvotes: -2
Reputation: 175583
http://www.codeproject.com/Articles/19590/WAVE-File-Processor-in-C
This has all the code necessary to strip silence, and mix wave files.
Enjoy.
Upvotes: 10
Reputation: 8476
If you want to efficiently calculate the average power over a sliding window: square each sample, then add it to a running total. Subtract the squared value from N samples previous. Then move to the next step. This is the simplest form of a CIC Filter. Parseval's Theorem tells us that this power calculation is applicable to both time and frequency domains.
Also you may want to add Hysteresis to the system to avoid switching on&off rapidly when power level is dancing about the threshold level.
Upvotes: 10
Reputation: 29143
Use Sox. It can remove leading and trailing silences, but you'll have to call it as an exe from your app.
Upvotes: 1
Reputation: 1582
I don't think you'll find any built-in APIs for detection of silence. But you can always use good ol' math/discreete signal processing to find out loudness. Here's a small example: http://msdn.microsoft.com/en-us/magazine/cc163341.aspx
Upvotes: 1