Basj

Reputation: 46493

Automatically truncate numpy arrays

When doing:

import numpy as np
a = np.array([1,2,3])
b = np.array([4,5,6,7])
print(a + b)

of course, there is an error:

ValueError: operands could not be broadcast together with shapes (3,) (4,)

Is it possible to make numpy arrays automatically truncate to the smaller size when two arrays of different sizes are added or multiplied?

Example: here a has length 3 and b has length 4, so we automatically truncate b to length 3 before doing the addition. Desired result for a+b:

[5 7 9]

Can this be done by subclassing np.array?

Remark: I would like to avoid manually truncating all arrays myself with a[:3] + b[:3]. I want to be able to write just a+b.

Upvotes: 3

Views: 2510

Answers (2)

cge

Reputation: 9890

So, to begin with: what you want to do is bad form. Redefining simple operations often causes all manner of headaches. Subclassing np.ndarray for something like this seems like a horrible idea.

With that said, it is possible to do. Here's a naive way to do it:

import numpy as np

class truncarray(np.ndarray):
    def __new__( cls, array ):
        obj = np.asarray(array).view(cls)
        return obj
    def __add__( a, b ):
        s = slice(0, min(len(a),len(b)))
        return np.add(a[s],b[s])
    __radd__ = __add__

a = truncarray([1,2,3])
b = truncarray([4,5,6,7])
a_array = np.array([1,2,3])
b_array = np.array([4,5,6,7])

Now, let's see how much this has messed up everything:

Adding truncates, as you'd prefer:

In [17]: a+b
Out[17]: truncarray([5, 7, 9])

Adding a number no longer works:

In [18]: a_array+1
Out[18]: array([2, 3, 4])

In [19]: a+1
---------------------------------------------------------------------------
TypeError                                 Traceback (most recent call last)
<ipython-input-19-fdcaab9110f2> in <module>()
----> 1 a+1

<ipython-input-2-3651dc87cb0e> in __add__(a, b)
      4                 return obj
      5         def __add__( a, b ):
----> 6                 s = slice(0, min(len(a),len(b)))
      7                 return np.add(a[s],b[s])
      8         __radd__ = __add__

TypeError: object of type 'int' has no len()

With a mixture of truncarrays and plain arrays, the result of an addition now depends on the order of the operands:

In [20]: a+b_array+a_array
Out[20]: truncarray([ 6,  9, 12])

In [21]: b_array+a+a_array
Out[21]: truncarray([ 6,  9, 12])

In [22]: b_array+a_array+a
---------------------------------------------------------------------------
ValueError                                Traceback (most recent call last)
<ipython-input-22-bcd145daa775> in <module>()
----> 1 b_array+a_array+a

ValueError: operands could not be broadcast together with shapes (4,) (3,)

In fact, it isn't even associative(!):

In [23]: a+(b_array+a_array)
---------------------------------------------------------------------------
ValueError                                Traceback (most recent call last)
<ipython-input-23-413ce83f55c2> in <module>()
----> 1 a+(b_array+a_array)

ValueError: operands could not be broadcast together with shapes (4,) (3,)

At the very least, if you do this, you'll want to add handling for differing types. But please consider Anton's answer: it's the far safer way of doing this.
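The type handling mentioned above could look something like the following. This is only a minimal sketch, not a hardened implementation: it assumes 1-D inputs, covers addition only, and the class name TruncArray is made up for illustration. Scalars are broadcast as usual, and both operands are converted to plain ndarrays before adding.

```python
import numpy as np

class TruncArray(np.ndarray):
    """Hypothetical 1-D array subclass whose + truncates to the shorter operand."""

    def __new__(cls, array):
        return np.asarray(array).view(cls)

    def __add__(self, other):
        # Work on plain ndarrays so the addition itself is ordinary numpy.
        a = np.asarray(self)
        b = np.asarray(other)
        if a.ndim == 0 or b.ndim == 0:
            # Scalar operand: keep normal broadcasting.
            result = np.add(a, b)
        else:
            # Both are arrays: truncate to the shorter length first.
            n = min(len(a), len(b))
            result = np.add(a[:n], b[:n])
        return np.asarray(result).view(type(self))

    __radd__ = __add__

a = TruncArray([1, 2, 3])
b = TruncArray([4, 5, 6, 7])
b_array = np.array([4, 5, 6, 7])

sum_ab = a + b          # truncating addition
sum_scalar = a + 1      # scalar on the right
sum_rscalar = 1 + a     # scalar on the left
sum_mixed = b_array + a # plain array on the left
```

Even with this, an expression such as b_array + a_array + a still fails, because the first addition involves only plain arrays and never reaches the subclass.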

Upvotes: 3

Anton Protopopov

Reputation: 31672

You could slice both arrays to the smaller one and then add them:

min_size = min(a.size, b.size)
c = a[:min_size] + b[:min_size]
print(c)
[5 7 9]

EDIT

If you don't want to do it manually you could write a function:

def add_func(*args):
    to_trunc = min(map(len, args))
    return np.sum([arg[:to_trunc] for arg in args], axis=0)

print(add_func(a,b))
[5 7 9]
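Since the question also mentions multiplication, one possible variation is a helper that only does the truncation and leaves the choice of operator to the caller. The name truncate_to_shortest is hypothetical, and the sketch assumes 1-D inputs:

```python
import numpy as np

def truncate_to_shortest(*arrays):
    """Return the inputs cut down to the length of the shortest one."""
    arrays = [np.asarray(arr) for arr in arrays]
    n = min(len(arr) for arr in arrays)
    return [arr[:n] for arr in arrays]

a = np.array([1, 2, 3])
b = np.array([4, 5, 6, 7])

a_t, b_t = truncate_to_shortest(a, b)
total = a_t + b_t      # element-wise sum of the truncated arrays
product = a_t * b_t    # element-wise product of the truncated arrays
```

The operands stay ordinary ndarrays, so every other numpy operation keeps its usual behaviour.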

Upvotes: 4
