Reputation: 9348
To get the basic parameters of an audio file with the wave module:
import wave
data = wave.open('c:\\sample.wav', mode = 'rb')
params = data.getparams()
print params
It returns:
(1, 2, 16000, 47104, 'NONE', 'not compressed')
That is: nchannels=1, sampwidth=2, framerate=16000, nframes=47104, comptype='NONE', compname='not compressed'.
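For readers on Python 3, here is a minimal sketch of the same lookup. It assumes Python 3.4 or later, where getparams() returns a named tuple, and it writes a throwaway file (the temporary path and the silent audio are made up for the demo) so that it runs without c:\sample.wav:

```python
import os
import tempfile
import wave


def read_wav_params(path):
    """Return the header parameters of a WAV file as a named tuple."""
    # The context manager closes the file even if reading fails.
    with wave.open(path, "rb") as wav_file:
        return wav_file.getparams()


# Hypothetical demo file: mono, 16-bit, 16 kHz, 47104 frames of silence.
demo_path = os.path.join(tempfile.mkdtemp(), "sample.wav")
with wave.open(demo_path, "wb") as wav_file:
    wav_file.setnchannels(1)
    wav_file.setsampwidth(2)
    wav_file.setframerate(16000)
    wav_file.writeframes(b"\x00\x00" * 47104)

params = read_wav_params(demo_path)
print(params)
# Fields can be read by name instead of by position.
print(params.nchannels, params.sampwidth, params.framerate, params.nframes)
```

Reading the fields by name avoids having to remember the order of the tuple.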
I assume librosa has similar functions, but I could not find any after searching.
Does librosa have functions that produce similar results?
Thank you.
Upvotes: 6
Views: 6538
Reputation: 1653
librosa does not have a specific function to load the metadata of an audio file as of version 0.10.1. There are a couple of functions for getting individual pieces of information, namely get_samplerate(path) and get_duration(*, path, ...) (which returns the duration in seconds, not in samples), but that's it.
To get the number of channels you must load the file and then inspect the shape of the audio. However, librosa does a lot of things under the hood when you load a file: by default, it mixes the signal down to a single channel and resamples it to 22,050 Hz. If you want to read the file as it is stored on disk, you must read it this way:
import librosa as lr
audio_fp = "path/to/audio/file"
duration = lr.get_duration(path=audio_fp)
audio, sr = lr.load(audio_fp, sr=None, mono=False, duration=0.01)
# n_samples is approximate, but should be correct most of the time
n_samples = int(round(duration * sr))
# A 1-D array is mono; otherwise the first axis is the channel axis
if audio.ndim == 1:
    channels = 1
else:
    channels = audio.shape[0]
The duration keyword asks librosa to read only the first 0.01 seconds of the audio, which is enough to find out the number of channels. This can speed up the process enormously, especially when you are dealing with long files.
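To make the n_samples line above concrete, here is a small worked example of the arithmetic, using the frame rate and frame count from the question. The two helper names are made up for illustration and are not part of librosa:

```python
def duration_seconds(n_frames, sample_rate):
    """Duration in seconds of n_frames sampled at sample_rate Hz."""
    return n_frames / sample_rate


def frames_from_duration(duration, sample_rate):
    """Recover a frame count from a duration, rounding to the nearest frame."""
    return int(round(duration * sample_rate))


# Values reported by wave in the question.
sample_rate = 16000
n_frames = 47104

duration = duration_seconds(n_frames, sample_rate)
recovered = frames_from_duration(duration, sample_rate)
print(duration, recovered)
```

The round trip is exact here because the duration was computed from the true frame count. When the duration comes from elsewhere and is itself only an estimate, the rounded result can be off, which is presumably why the code comment calls it approximate.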
Some %timeit magic shows that the wave method takes ~26 µs, while the librosa stack takes ~99 µs. Not using the duration keyword on a 60-minute track requires ~221 ms (that's milliseconds, not microseconds: more than a thousand times slower). An advantage of librosa is that it can transparently deal with encoded files such as mp3 and flac with only a minor hit in performance (~125 µs).
Upvotes: 1
Reputation: 180
librosa's core module has some of the functionality you're looking for. librosa.core.load will load a file much as wave does, but it does not return as much detail about the file.
import librosa
# Load an example file in .ogg format
fileName = librosa.util.example_audio_file()
audioData, sampleRate = librosa.load(fileName)
print(audioData)
>>> [ -4.756e-06, -6.020e-06, ..., -1.040e-06, 0.000e+00]
print(audioData.shape)
>>> (1355168,)
print(sampleRate)
>>> 22050
The shape of audioData tells you the number of channels, provided you load with mono=False (by default, librosa.load mixes the signal down to mono, so the shape is always (n,)). A shape like (n,) is mono, and (2, n) is stereo. The n in the shape is the length of the audio in samples. If you want the length in seconds, check out librosa.core.get_duration.
As @hendrick mentions in his comment, the librosa advanced I/O page says librosa uses soundfile and audioread for audio I/O, and the load source code shows that it simply wraps those libraries.
However, there shouldn't be any issue with using wave for loading the audio file and librosa for analysis, as long as you follow the librosa API. Is there a particular problem you're having, or a goal you need to achieve?
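As a rough sketch of what following the librosa API could look like when loading with wave: librosa works on floating-point NumPy arrays shaped (n,) for mono or (channels, n) for multi-channel audio. The helper name wav_to_librosa_array is made up for this example, it only handles 16-bit PCM, and the tiny stereo file is written just so the snippet runs on its own:

```python
import os
import tempfile
import wave

import numpy as np


def wav_to_librosa_array(path):
    """Read a 16-bit PCM WAV file into a float32 array plus its sample rate.

    Hypothetical helper: mono comes back as (n,), multi-channel as (channels, n).
    """
    with wave.open(path, "rb") as wav_file:
        n_channels = wav_file.getnchannels()
        sample_rate = wav_file.getframerate()
        if wav_file.getsampwidth() != 2:
            raise ValueError("this sketch only handles 16-bit PCM")
        raw = wav_file.readframes(wav_file.getnframes())

    # WAV stores little-endian signed 16-bit samples, interleaved by channel.
    samples = np.frombuffer(raw, dtype="<i2").astype(np.float32) / 32768.0
    if n_channels == 1:
        return samples, sample_rate
    # De-interleave: (n_frames, n_channels) -> (n_channels, n_frames).
    return samples.reshape(-1, n_channels).T, sample_rate


# Demo file: 2 frames of stereo audio at 8 kHz (values chosen arbitrarily).
demo_path = os.path.join(tempfile.mkdtemp(), "stereo.wav")
frames = np.array([[0, 16384], [-16384, 32767]], dtype="<i2")
with wave.open(demo_path, "wb") as wav_file:
    wav_file.setnchannels(2)
    wav_file.setsampwidth(2)
    wav_file.setframerate(8000)
    wav_file.writeframes(frames.tobytes())

audio, sr = wav_to_librosa_array(demo_path)
print(audio.shape, audio.dtype, sr)
```

The resulting array and sample rate can then be passed to librosa's analysis functions in place of the output of librosa.load. Note that this skips the resampling and mono mix-down that librosa.load performs by default.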
Upvotes: 7