ntoskrnl4
ntoskrnl4

Reputation: 11

numpy.array's have bizarre behavior with /= operator?

I'm trying to normalize an array of numbers to the range (0, 1] so that I can use them as weights for a weighted random.choice(), so I entered this line:

# weights is a nonzero numpy.array
weights /= weights.max()

However, Pycharm said there's an unfilled parameter to the max() function (Parameter 'initial' unfilled). I tried this in the REPL with the /= operator and with "regular" division (a = a / b) and got different results for both and a different error than Pycharm thought:

>>> a = numpy.array([1,2,3])
>>> a.max()
3
>>> a /= a.max()
Traceback (most recent call last):
  File "<pyshell#4>", line 1, in <module>
    a /= a.max()
TypeError: No loop matching the specified signature and casting was found for ufunc true_divide
>>> a = a/a.max()
>>> a
array([0.33333333, 0.66666667, 1.        ])

I also realized that for a weighted random, the weights needed to sum to one rather than be normalized to it. But dividing it by the sum yielded the exact same TypeError using the /= operation (but Pycharm thought this was okay):

>>> a = numpy.array([1,2,3])
>>> sum(a)
6
>>> a
array([1, 2, 3])
>>> a /= sum(a)
Traceback (most recent call last):
  File "<pyshell#13>", line 1, in <module>
    a /= sum(a)
TypeError: No loop matching the specified signature and casting was found for ufunc true_divide
>>> a = a / sum(a)
>>> a
array([0.16666667, 0.33333333, 0.5       ])

What have I come across here? Is this some bizarre bug in Numpy or does the /= operator have a different use or something? I know they use __truediv__ and __itruediv__ but I can't see why one has a problem and the other doesn't. I have confirmed this behavior with the latest version of Numpy from pip (1.19.2 on Windows x64).

Upvotes: 1

Views: 228

Answers (1)

Fahim
Fahim

Reputation: 348

It's because numpy cares the type. When you apply the division, you're changing int to float. But numpy won't let you do that! That's why your values should be already in float. Try this:

>>> a = np.array([1.0,2.0,3.0])
>>> a /= sum(a)
>>> a
array([0.16666667, 0.33333333, 0.5       ])

But why did the other one work? It's because that's not an "in-place" operation. Hence a new memory location is being created. New variable, new type, hence numpy doesn't care here.

Upvotes: 1

Related Questions