Reputation: 101
I have a large .wav file array (200k samples) loaded in with scipy.io.wavfile. I tried to make a histogram of the data using matplotlib.pyplot hist with auto binning. It returned the error:
ValueError: Number of samples, -72, must be non-negative.
So I decided to set the bins myself using binwidth=1000:
min_bin = np.min(data[peaks])
max_bin = np.max(data[peaks])
plt.hist(data[peaks], bins=np.arange(min_bin,max_bin, binwidth))
When I do this, it gives the warning:
RuntimeWarning: overflow encountered in short_scalars
from scipy.io import wavfile
Here are the type print outs of min_bin, max_bin, data:
Type min_bin: <class 'numpy.int16'> max_bin: <class 'numpy.int16'>
min_bin: -21231 max_bin: 32444
Type data <class 'numpy.ndarray'>
The problem seems to be with np.arange, which fails when I give it a bin range taken from np.max and np.min of the .wav array. When I type the same max and min integer values into np.arange by hand, it has no problem. My hypothesis is that it is some sort of addressing error when referencing the .wav array, but I am not sure how to fix it or why it is occurring.
Upvotes: 2
Views: 3274
Reputation: 619
As part of the computation of the length of the array, numpy.arange calculates stop - start, in Python object arithmetic. When stop and start are numpy.int16(32444) and numpy.int16(-21231), this subtraction overflows and produces numpy.int16(-11861). This is where the warning comes from. The nonsense value leads numpy.arange to believe that the result should be a length-0 array.
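To see the wraparound in isolation, here is a minimal sketch using the two values from the question. The variable names are only for illustration, and the errstate context is there just to keep the demo quiet.

```python
import numpy as np

start = np.int16(-21231)
stop = np.int16(32444)

# int16 can only hold -32768..32767, so the true difference (53675)
# does not fit and wraps around modulo 2**16.
with np.errstate(over="ignore"):  # silence the overflow RuntimeWarning for this demo
    wrapped = stop - start

# Plain Python ints have arbitrary precision, so nothing wraps here.
exact = int(stop) - int(start)

print(wrapped)        # -11861
print(exact)          # 53675
print(exact - 2**16)  # -11861, the same wrapped value
```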
The workaround is simple: just convert the arguments to ints first. The dtype of the array itself can still be set to np.int16 to save space, since that's all you need to store the necessary data.
min_bin = int(np.min(data[peaks]))
max_bin = int(np.max(data[peaks]))
plt.hist(data[peaks], bins=np.arange(min_bin, max_bin, binwidth, dtype=np.int16))
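A quick way to check the fix without the original file is to run it on made-up data. In this rough sketch, samples is a hypothetical stand-in for data[peaks], and np.histogram is used in place of plt.hist so no plot is needed. It also shows one thing to watch for: np.arange excludes the stop value, so the last edge can land below the maximum, and values above it are left out of the counts. Extending stop by one binwidth is one possible way around that, not something the original code does.

```python
import numpy as np

# Hypothetical stand-in for data[peaks]: int16 samples covering the
# range reported in the question.
rng = np.random.default_rng(0)
samples = rng.integers(-21231, 32445, size=1000).astype(np.int16)
samples[0], samples[1] = -21231, 32444  # pin the extremes from the question

binwidth = 1000
min_bin = int(samples.min())  # Python int, so stop - start cannot overflow
max_bin = int(samples.max())

# Edges as in the answer: the stop value is excluded by np.arange.
edges = np.arange(min_bin, max_bin, binwidth)
# Optional variant: push stop out by one bin so the maximum is covered.
edges_full = np.arange(min_bin, max_bin + binwidth, binwidth)

counts, _ = np.histogram(samples, bins=edges)
counts_full, _ = np.histogram(samples, bins=edges_full)

print(len(edges), edges[-1])            # 54 31769
print(len(edges_full), edges_full[-1])  # 55 32769
print(counts_full.sum() == len(samples))  # True: every sample falls in some bin
```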
Upvotes: 3