Reputation: 11
I'm trying to normalize an array of numbers to the range (0, 1] so that I can use them as weights for a weighted random.choice(), so I entered this line:
# weights is a nonzero numpy.array
weights /= weights.max()
However, Pycharm said there's an unfilled parameter to the max()
function (Parameter 'initial' unfilled
). I tried this in the REPL with the /= operator and with "regular" division (a = a / b
) and got different results for both and a different error than Pycharm thought:
>>> a = numpy.array([1,2,3])
>>> a.max()
3
>>> a /= a.max()
Traceback (most recent call last):
File "<pyshell#4>", line 1, in <module>
a /= a.max()
TypeError: No loop matching the specified signature and casting was found for ufunc true_divide
>>> a = a/a.max()
>>> a
array([0.33333333, 0.66666667, 1. ])
I also realized that for a weighted random, the weights needed to sum to one rather than be normalized to it. But dividing it by the sum yielded the exact same TypeError
using the /= operation (but Pycharm thought this was okay):
>>> a = numpy.array([1,2,3])
>>> sum(a)
6
>>> a
array([1, 2, 3])
>>> a /= sum(a)
Traceback (most recent call last):
File "<pyshell#13>", line 1, in <module>
a /= sum(a)
TypeError: No loop matching the specified signature and casting was found for ufunc true_divide
>>> a = a / sum(a)
>>> a
array([0.16666667, 0.33333333, 0.5 ])
What have I come across here? Is this some bizarre bug in Numpy or does the /= operator have a different use or something? I know they use __truediv__
and __itruediv__
but I can't see why one has a problem and the other doesn't. I have confirmed this behavior with the latest version of Numpy from pip (1.19.2 on Windows x64).
Upvotes: 1
Views: 228
Reputation: 348
It's because numpy cares the type. When you apply the division, you're changing int
to float
. But numpy won't let you do that! That's why your values should be already in float
. Try this:
>>> a = np.array([1.0,2.0,3.0])
>>> a /= sum(a)
>>> a
array([0.16666667, 0.33333333, 0.5 ])
But why did the other one work? It's because that's not an "in-place" operation. Hence a new memory location is being created. New variable, new type, hence numpy doesn't care here.
Upvotes: 1