Reputation: 153
I have some audio files recorded from wind turbines, and I'm trying to do anomaly detection. The general idea is that if a blade has a fault (e.g. cracking), its sound will differ from that of the other two blades. So we could extract each blade's sound signal and compare the similarity / distance between them; if one of these signals differs significantly from the others, we can say the turbine is going to fail. I only have a few faulty samples, and labels are lacking.
However, I could not find anyone doing this kind of work, and I ran into a lot of trouble while attempting it. I've tried using the STFT to convert the signal to a power spectrum, and some spikes show up. How can I identify each blade from the raw data? (Some related work uses autoencoders to detect anomalies in audio, but for this task we want to use a similarity-based method.)
Does anyone have a good idea, or any related work / papers to recommend?
Upvotes: 0
Views: 109
Reputation: 6299
Assuming that the audio you got is from a location where one can hear individual blades as they pass by, there are two subproblems:
1) Estimate each blade position, and extract the audio for each blade.
2) Compare the signals from the blades to each other, and determine whether one of them is different enough to be considered an anomaly.
Estimating the blade position can be done with a sensor that detects the rotation directly, for example based on the magnetic field of the generator. Ideally you would have this kind of known-good sensor data, at least while developing your system. It may also be possible to estimate the position from the audio alone, using some sort of periodicity detection. Autocorrelation is a commonly used technique for that.
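As a rough illustration of the audio-only route, here is a minimal sketch of periodicity detection by autocorrelation. It assumes the blade passes show up as a periodic rise and fall in loudness, so it autocorrelates a smoothed amplitude envelope rather than the raw waveform. The function name `estimate_period`, its parameters and the synthetic test signal are made up for this example, and only NumPy is used.

```python
import numpy as np

def estimate_period(audio, sr, min_period_s=0.2, max_period_s=5.0, smooth_s=0.05):
    """Estimate the dominant repetition period (in seconds) of the loudness envelope."""
    # Rough amplitude envelope: rectify, then smooth with a moving average.
    win = max(1, int(smooth_s * sr))
    envelope = np.convolve(np.abs(audio), np.ones(win) / win, mode="same")
    envelope = envelope - envelope.mean()

    # Autocorrelation via FFT, zero-padded to avoid circular wrap-around.
    n = len(envelope)
    spectrum = np.fft.rfft(envelope, 2 * n)
    acf = np.fft.irfft(spectrum * np.conj(spectrum))[:n]
    acf = acf / acf[0]  # normalise so that lag 0 has value 1

    # Pick the strongest peak inside the plausible range of periods.
    lo = int(min_period_s * sr)
    hi = min(int(max_period_s * sr), n - 1)
    lag = lo + int(np.argmax(acf[lo:hi]))
    return lag / sr

# Synthetic stand-in for a recording: noise whose loudness swells once per second.
rng = np.random.default_rng(0)
sr = 2000
t = np.arange(20 * sr) / sr
audio = (1.0 + 0.8 * np.cos(2 * np.pi * t / 1.0)) * rng.standard_normal(t.size)

period = estimate_period(audio, sr)
print(f"estimated repetition period: {period:.2f} s")
```

On a real recording the strongest peak may correspond to the blade-pass period or to the full rotation (three blade passes), so the search range needs to be chosen with the expected rotor speed in mind, and the result should be checked against sensor data where available.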
To detect differences between blades, you can try a standard distance function on a standard feature description, such as Euclidean distance on MFCCs. You will still need some samples of both known faulty and known good/acceptable blades to evaluate your solution. There is, however, a risk that this will not be good enough. In that case, try to compute better features as the basis for the distance computation, perhaps using an autoencoder. You can also try some sort of similarity learning. If you have a good amount of both good and faulty data, you may be able to use a triplet loss setup to learn the similarity metric: feed in data for two good blades as objects that should be similar, and the known-bad blade as something that should be dissimilar.
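A minimal sketch of the distance-based comparison, assuming the audio has already been cut into one segment per blade. To keep it NumPy-only, it uses time-averaged log band energies as a crude stand-in for MFCCs (a library such as librosa could supply real MFCCs instead). The function names, the band count and the synthetic "blades" are assumptions for illustration only.

```python
import numpy as np

def log_band_energies(segment, n_fft=1024, n_bands=20):
    """Time-averaged log energy in equal-width frequency bands (crude MFCC stand-in)."""
    if len(segment) < n_fft:
        raise ValueError("segment is shorter than one analysis frame")
    hop = n_fft // 2
    window = np.hanning(n_fft)
    frames = np.array([segment[i:i + n_fft] * window
                       for i in range(0, len(segment) - n_fft + 1, hop)])
    power = np.abs(np.fft.rfft(frames, axis=1)) ** 2   # shape: (n_frames, n_fft // 2 + 1)
    mean_power = power.mean(axis=0)
    bands = np.array_split(mean_power, n_bands)
    return np.log(np.array([b.sum() for b in bands]) + 1e-12)

def most_different_blade(segments):
    """Return (index of the blade farthest from the others, pairwise distance matrix)."""
    feats = np.array([log_band_energies(s) for s in segments])
    # Pairwise Euclidean distances between the blades' feature vectors.
    dists = np.linalg.norm(feats[:, None, :] - feats[None, :, :], axis=-1)
    # Score each blade by its mean distance to the other blades.
    score = dists.sum(axis=1) / (len(segments) - 1)
    return int(np.argmax(score)), dists

# Synthetic example: three noise-like "blades", the third with an extra 3 kHz tone.
rng = np.random.default_rng(1)
sr = 16000
t = np.arange(sr) / sr
blades = [rng.standard_normal(sr) for _ in range(3)]
blades[2] = blades[2] + np.sin(2 * np.pi * 3000 * t)

odd_one, dists = most_different_blade(blades)
print("most different blade:", odd_one)
print(np.round(dists, 2))
```

This only ranks the blades against each other. Deciding how large a distance counts as an anomaly still requires the known good and known faulty samples mentioned above.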
Upvotes: 1
Reputation: 59368
Well...
If your shaft is rotating at, say, 1200 RPM (20 Hz), then all the significant sound produced by that rotation should be at harmonics of 20 Hz.
If the turbine has 3 perfect blades, however, then it will be in exactly the same configuration 3 times for every rotation, so all of the sound produced by the rotation should be confined to multiples of 60 Hz.
Energy at the other harmonics of 20 Hz -- 20, 40, 80, 100, etc. -- that is above the noise floor would generally result from differences between the blades.
This of course ignores noise from other sources that are also synchronized to the shaft, which can mess up the analysis.
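The harmonic argument above can be turned into a simple indicator: compare the energy at shaft harmonics that are not multiples of the blade count with the energy at those that are. The sketch below assumes the shaft frequency is known and steady over the analysed window; the function name `blade_asymmetry_ratio`, its parameters and the synthetic signals are hypothetical and only meant to illustrate the idea.

```python
import numpy as np

def blade_asymmetry_ratio(signal, sr, shaft_hz, n_blades=3, n_harmonics=12):
    """Energy at 'asymmetric' shaft harmonics divided by energy at blade-pass harmonics."""
    n = len(signal)
    power = np.abs(np.fft.rfft(signal * np.hanning(n))) ** 2

    def energy_at(freq_hz, half_width_bins=2):
        # Sum a few bins around the harmonic to tolerate spectral leakage.
        centre = int(round(freq_hz * n / sr))
        lo = max(centre - half_width_bins, 0)
        return power[lo:centre + half_width_bins + 1].sum()

    symmetric = 0.0   # harmonics at multiples of n_blades * shaft_hz
    asymmetric = 0.0  # remaining shaft harmonics, expected to vanish for identical blades
    for k in range(1, n_harmonics + 1):
        if k * shaft_hz >= sr / 2:
            break
        if k % n_blades == 0:
            symmetric += energy_at(k * shaft_hz)
        else:
            asymmetric += energy_at(k * shaft_hz)
    return asymmetric / (symmetric + 1e-12)

def synthetic_rotor(gains, sr=2000, seconds=10, shaft_hz=20.0, seed=0):
    """Each blade emits the same harmonic pulse train, offset by 1/3 rotation and scaled."""
    rng = np.random.default_rng(seed)
    t = np.arange(int(seconds * sr)) / sr
    out = 0.01 * rng.standard_normal(t.size)  # small background noise
    for b, gain in enumerate(gains):
        delay = b / (len(gains) * shaft_hz)
        for k in range(1, 13):
            out = out + gain * np.cos(2 * np.pi * k * shaft_hz * (t - delay))
    return out

healthy = blade_asymmetry_ratio(synthetic_rotor([1.0, 1.0, 1.0]), 2000, 20.0)
faulty = blade_asymmetry_ratio(synthetic_rotor([1.0, 1.0, 1.5]), 2000, 20.0)
print(f"healthy ratio: {healthy:.2e}, faulty ratio: {faulty:.2e}")
```

With identical blades the non-multiple harmonics cancel and the ratio stays near the noise floor; a blade that sounds different leaves energy at 20, 40, 80, 100 Hz and so on. As noted, other shaft-synchronous noise sources would also raise this ratio.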
Upvotes: 1