today

Reputation: 33410

How to load and resample (MP3) audio files faster in Python/Linux?

Currently, I am trying to load 280,000 MP3 audio files in Python where the average duration of files is ~5 seconds. I am using Librosa for this purpose as well as for the further processing (e.g. computing spectrogram) in later stages.

However, loading the files is very slow: on average it takes 370 milliseconds per file to load, decompress and resample. If I turn off resampling (i.e. librosa.load(..., sr=None)), it takes around 200 milliseconds, which is still not good given the large number of files I have. Unsurprisingly, loading WAV files without resampling is very fast (< 1 ms), but with resampling it takes around 160 milliseconds.

Now I was wondering if there is any faster approach for doing this, whether directly in Python or using external tools in Linux with the condition that I can later load the results back to Python.

By the way, I have tried using multiprocessing with a pool of size 4 and achieved 2-3x speed-up, but I am looking for more (preferably > 10x).

Note: the original files are human voice recordings with a sample rate of 48 kHz and a bit rate of 64 kbps; I want to downsample them to 16 kHz.

Upvotes: 1

Views: 4139

Answers (1)

Hendrik

Reputation: 5310

You could use pysox.

It's a thin Python wrapper around SoX, "the Swiss Army knife of sound processing programs."

Note: For faster processing (avoiding exec calls), you may also install and use soxbindings. All you need to do is to replace

import sox

with

import soxbindings as sox
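For the use case in the question, a minimal sketch could look like the one below. It assumes pysox is installed and that the underlying SoX build can read MP3 (on some Linux distributions this needs an extra format package). The helper names wav_path_for and convert_to_16k_wav and the output directory layout are hypothetical, chosen only for illustration.

```python
from pathlib import Path


def wav_path_for(mp3_path, out_dir):
    """Map e.g. clips/a.mp3 to <out_dir>/a.wav (hypothetical layout)."""
    return Path(out_dir) / (Path(mp3_path).stem + ".wav")


def convert_to_16k_wav(mp3_path, out_dir):
    """Decode one MP3 and write it as a 16 kHz WAV file using SoX."""
    # Imported here so the path helper above stays usable without pysox.
    import sox

    out_path = wav_path_for(mp3_path, out_dir)
    tfm = sox.Transformer()
    tfm.rate(samplerate=16000)  # resample from 48 kHz down to 16 kHz
    tfm.build(str(mp3_path), str(out_path))
    return out_path
```

The resulting WAV files are already at 16 kHz, so they can later be read back with librosa.load(path, sr=None), which the question reports as taking under 1 ms per file. The conversion function is independent per file, so it can also be mapped over a multiprocessing pool as in the question.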

Upvotes: 7
