How to interpret various colors in matplotlib plot of a mp3/wav file

Question

I'm a Python newbie and an audio analysis newbie. If this is not the right place for this question, please point me to the right place.

I have an mp3 audio file that contains only silence. I converted it to .wav with sox: sox input.mp3 output.wav

from scipy.io.wavfile import read
import matplotlib.pyplot as plt
(fs,x)=read('/home/vivek/Documents/VivekProjects/Silence/silence.wav')
##plt.rcParams['agg.path.chunksize'] = 5000 # for preventing overflow error. 
fs
x.size/float(fs)
plt.plot(x)

Which generates this image:

I also tried the solution from this question: How to plot a wav file

from scipy.io.wavfile import read
import matplotlib.pyplot as plt

# read audio samples
input_data = read("/home/vivek/Documents/VivekProjects/Silence/silence.wav")
audio = input_data[1]
# plot all samples
plt.plot(audio)
# label the axes
plt.ylabel("Amplitude")
plt.xlabel("Time")
# set the title  
plt.title("Sample Wav")
# display the plot
plt.show()

Which generated this image:

Question: I want to know how to interpret the different color bars (blue, green, yellow) in the chart. If you listen to the file, it is only silence, so I expected to see just a flat line, if anything.

My mp3 file can be downloaded from here.
The sox converted wav file can be found here.

Even though the file is silent, even Dropbox generates a waveform for it. I can't figure out why.

shouldsee · Accepted Answer

First, always check the shape of your data before plotting.

x.shape
## (3479040, 2)

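The 2 here means your .wav file has two channels, and each column of x holds one channel. By default, matplotlib plots each column of a 2-D array as a separate line in a different color. To see the actual signal, slice the array along its rows (the time axis) and plot only a short window.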

import matplotlib.pyplot as plt

ind = int(fs * 0.5)  # number of samples in the first 500 ms

# plot as a time series (one line per channel)
plt.plot(x[:ind, :])

# visualise the distribution of channel 0 in a new figure
plt.figure()
plt.hist(x[:ind, 0], bins=10)
plt.gca().set_yscale('log')

# range of sample values
print(x.min(), x.max())
# -3 3
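
To make the one-line-per-column behaviour concrete, here is a minimal sketch that uses made-up data rather than the file from the question: a synthetic two-channel int16 array with values between -3 and 3. The sample rate, the random data and the subplot layout are assumptions for illustration only.

```python
import numpy as np
import matplotlib
matplotlib.use("Agg")  # non-interactive backend so the sketch runs without a display
import matplotlib.pyplot as plt

# Made-up stand-in for the real file: two channels of near-silent int16 noise.
rng = np.random.default_rng(0)
fs = 44100  # assumed sample rate in Hz
x = rng.integers(-3, 4, size=(fs, 2)).astype(np.int16)

ind = int(fs * 0.5)          # number of samples in the first 500 ms
t = np.arange(ind) / fs      # time axis in seconds

# Passing a 2-D array gives one line per column, i.e. one line per channel.
lines = plt.plot(t, x[:ind, :])
print(len(lines))

# Plotting each column on its own axes makes the channels easy to tell apart.
fig, axes = plt.subplots(2, 1, sharex=True)
for ch, ax in enumerate(axes):
    ax.plot(t, x[:ind, ch])
    ax.set_ylabel("channel %d" % ch)
axes[-1].set_xlabel("Time (s)")
```

With an interactive backend you would drop the matplotlib.use("Agg") line and finish with plt.show().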

The printed minimum and maximum show that the signal has very low absolute values (between -3 and 3). How this translates to amplitude depends on the sample encoding of the .wav file (integer or float), but it is probably a very low amplitude, which is why the file sounds silent.

I am not familiar with the precise encoding myself, but this page might help: http://www-mmsp.ece.mcgill.ca/Documents/AudioFormats/WAVE/WAVE.html
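
To put a number on "very low", the sketch below compares the peak sample value with full scale and expresses the ratio in decibels (dBFS). It assumes signed integer PCM such as int16, where full scale is the largest value of the dtype, or float data, where full scale is 1. The helper name peak_dbfs and the tiny example array are hypothetical, and unsigned 8-bit data would need its offset removed first.

```python
import numpy as np

def peak_dbfs(samples):
    """Peak level relative to full scale, in dB (0 dBFS is the loudest possible)."""
    samples = np.asarray(samples)
    if np.issubdtype(samples.dtype, np.integer):
        # Assumption: signed integer PCM, e.g. 32767 for int16.
        full_scale = float(np.iinfo(samples.dtype).max)
    else:
        # Float data: full scale is 1.
        full_scale = 1.0
    # Cast before abs() so the most negative integer cannot overflow.
    peak = float(np.max(np.abs(samples.astype(np.float64))))
    if peak == 0:
        return float("-inf")  # digital silence
    return 20.0 * np.log10(peak / full_scale)

# Hypothetical stereo snippet with the same -3..3 range as in the answer.
x = np.array([[-3, 2], [3, -1], [0, 1]], dtype=np.int16)
print(round(peak_dbfs(x), 1))  # -80.8
```

If the file is 16-bit PCM, a peak of 3 is roughly 81 dB below full scale, which is consistent with the file sounding silent even though the samples are not exactly zero.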

For all formats other than PCM, the Format chunk must have an extended portion. The extension can be of zero length, but the size field (with value 0) must be present.

For float data, full scale is 1. The bits/sample would normally be 32 or 64.

For the log-PCM formats (µ-law and A-law), the Rev. 3 documentation indicates that the bits/sample field (wBitsPerSample) should be set to 8 bits.

The non-PCM formats must have a fact chunk.

PS: if you want to start on some more advanced audio analysis, do check out this workshop, which I found super practical, especially the Energy and FFT parts.
