Reputation: 6713
I want to identify an audio sample (provided by the user) within an audio file I have (mp3).
The mp3 file is a radio stream that I've kept for testing purposes, and I have the pre-roll of the show. I want to find the pre-roll in the file and get the timestamp at which it plays.
Note: The solution can be in any of the following programming languages: Java, Python or C++. I don't know how to analyze the audio file, and any reference on this subject would help.
Upvotes: 2
Views: 691
Reputation: 7471
This problem falls under the category of audio fingerprinting. Once you have matched a sample against a longer recording, the match also tells you the timestamp at which the sample occurs within that recording. There is a great paper by the guys behind Shazam that describes their technique: http://www.ee.columbia.edu/~dpwe/papers/Wang03-shazam.pdf. They basically pick out the local maxima in the spectrogram and create hashes based on the relative positions of those maxima.
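To make the peak-and-hash idea concrete, here is a minimal sketch in Python with NumPy. It is not the algorithm from the paper: as a simplifying assumption it keeps only the single strongest frequency bin per frame instead of a full set of local maxima, and the names `fingerprint`, `find_offset`, `FRAME_SIZE`, `HOP` and `FAN_OUT` are made up for illustration. It also assumes both signals are already decoded to mono sample arrays at the same sample rate.

```python
from collections import Counter

import numpy as np

FRAME_SIZE = 1024  # samples per FFT frame (assumed value)
HOP = 512          # samples between consecutive frames (assumed value)
FAN_OUT = 5        # how many later peaks each peak is paired with (assumed value)


def spectrogram(samples):
    """Magnitude spectrogram, shape (frames, bins), using a Hann window."""
    window = np.hanning(FRAME_SIZE)
    n_frames = 1 + (len(samples) - FRAME_SIZE) // HOP
    frames = np.stack([samples[i * HOP:i * HOP + FRAME_SIZE] * window
                       for i in range(n_frames)])
    return np.abs(np.fft.rfft(frames, axis=1))


def fingerprint(samples):
    """Return (hash, frame_index) pairs.

    Simplification: one peak per frame (the strongest bin). Each hash is
    (peak bin, later peak bin, frame distance between them).
    """
    peaks = spectrogram(samples).argmax(axis=1)
    hashes = []
    for i, f1 in enumerate(peaks):
        for dt in range(1, FAN_OUT + 1):
            if i + dt < len(peaks):
                hashes.append(((int(f1), int(peaks[i + dt]), dt), i))
    return hashes


def find_offset(haystack, needle):
    """Return the most-voted sample offset of needle in haystack, or None."""
    index = {}
    for h, frame in fingerprint(haystack):
        index.setdefault(h, []).append(frame)
    # Every matching hash votes for one candidate frame offset.
    votes = Counter()
    for h, needle_frame in fingerprint(needle):
        for hay_frame in index.get(h, ()):
            votes[hay_frame - needle_frame] += 1
    if not votes:
        return None
    best_frame_offset, _ = votes.most_common(1)[0]
    return best_frame_offset * HOP


# Synthetic stand-in for real audio: 40 short tones with distinct frequencies.
sample_rate = 8000
rng = np.random.default_rng(0)
freqs = rng.choice(np.arange(300, 3500, 50), size=40, replace=False)
t = np.arange(2048) / sample_rate
haystack = np.concatenate([np.sin(2 * np.pi * f * t) for f in freqs])
needle = haystack[10240:10240 + 16384]  # excerpt starting 1.28 s in

offset_samples = find_offset(haystack, needle)
offset_seconds = offset_samples / sample_rate
```

In this toy setup the excerpt starts exactly on a frame boundary, which makes the vote unambiguous. With real recordings the sample will not be frame-aligned and will contain noise, which is why the paper relies on many robust peaks per frame rather than one.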
Here is a good review of audio fingerprinting algorithms: http://mtg.upf.edu/files/publications/MMSP-2002-pcano.pdf
In any case, you'll likely be working a lot with FFT and spectrograms. This post talks about how to do that in Python.
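For reference, a spectrogram can be computed with nothing more than NumPy. This is only a sketch: it assumes the mp3 has already been decoded to a mono array of samples by some other tool, and the function name `spectrogram` and the frame and hop sizes are arbitrary choices for illustration.

```python
import numpy as np


def spectrogram(samples, frame_size=1024, hop=512):
    """Return a (frames, bins) magnitude spectrogram of a mono signal."""
    window = np.hanning(frame_size)  # taper each frame to reduce leakage
    n_frames = 1 + (len(samples) - frame_size) // hop
    frames = np.stack([samples[i * hop:i * hop + frame_size] * window
                       for i in range(n_frames)])
    # rfft keeps only the non-negative frequencies of a real signal.
    return np.abs(np.fft.rfft(frames, axis=1))


# Synthetic stand-in for decoded audio: one second of a 1 kHz tone.
sample_rate = 8000
t = np.arange(sample_rate) / sample_rate
tone = np.sin(2 * np.pi * 1000 * t)

spec = spectrogram(tone)
peak_bin = int(spec.mean(axis=0).argmax())
peak_hz = peak_bin * sample_rate / 1024  # bin index -> frequency in Hz
```

Row `i` of the result describes the audio starting at `i * hop / sample_rate` seconds, which is how a match found in the spectrogram maps back to a timestamp in the file.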
Upvotes: 3
Reputation: 168843
I'd start by computing the FFT spectrogram of both the haystack and needle files (so to speak). Then you could try to (fuzzily) match the spectrograms. If you format them as images, you could even use off-the-shelf image-matching algorithms for that.
Not sure if that's the canonical or optimal way, but I feel like it should work.
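A rough sketch of that idea in Python with NumPy: slide the needle's log-spectrogram along the haystack's and score each position with a normalized correlation. The function names, frame sizes and the choice of correlation as the fuzzy match are my assumptions, not an established recipe, and both signals are assumed to be mono arrays at the same sample rate.

```python
import numpy as np


def log_spectrogram(samples, frame_size=1024, hop=512):
    """Log-magnitude spectrogram, shape (frames, bins)."""
    window = np.hanning(frame_size)
    n_frames = 1 + (len(samples) - frame_size) // hop
    frames = np.stack([samples[i * hop:i * hop + frame_size] * window
                       for i in range(n_frames)])
    return np.log1p(np.abs(np.fft.rfft(frames, axis=1)))


def best_match(haystack, needle, sample_rate, hop=512):
    """Return (timestamp in seconds, score) of the best-matching position."""
    hay = log_spectrogram(haystack, hop=hop)
    ndl = log_spectrogram(needle, hop=hop)
    ndl = ndl - ndl.mean()
    ndl = ndl / (np.linalg.norm(ndl) + 1e-12)
    scores = []
    for start in range(len(hay) - len(ndl) + 1):
        patch = hay[start:start + len(ndl)]
        patch = patch - patch.mean()
        # Correlation in [-1, 1]; 1 means the patch has the same shape.
        scores.append(float(np.sum(patch * ndl)
                            / (np.linalg.norm(patch) + 1e-12)))
    best = int(np.argmax(scores))
    return best * hop / sample_rate, scores[best]


# Synthetic stand-in: 5 s of noise, with a slightly degraded 1 s excerpt.
sample_rate = 8000
rng = np.random.default_rng(1)
haystack = rng.standard_normal(5 * sample_rate)
start = 30 * 512  # excerpt begins 1.92 s into the haystack
needle = haystack[start:start + sample_rate]
needle = needle + 0.05 * rng.standard_normal(sample_rate)

timestamp, score = best_match(haystack, needle, sample_rate)
```

The time resolution here is one hop (64 ms at these settings), and the loop is linear in the haystack length, so for long recordings a fingerprint index would scale better than this brute-force scan.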
Upvotes: 2