Reputation: 3648

How to apply a function on jagged Numpy arrays (unequal row lengths) without using np.apply_along_axis()?

I'm trying to speed up a process, and I think this might be possible using NumPy's apply_along_axis. The problem is that not all of my rows have the same length.

When I do:

a = np.array([[1, 2, 3], 
              [2, 3, 4], 
              [4, 5, 6]])
b = np.apply_along_axis(sum, 1, a)
print(b)

This works fine. But I would like to do something similar to (please note that the first row has 4 elements and the rest have 3):

a = np.array([[1, 2, 3, 4], 
              [2, 3, 4], 
              [4, 5, 6]])
b = np.apply_along_axis(sum, 1, a)
print(b)

But this fails because:

numpy.AxisError: axis 1 is out of bounds for array of dimension 1

I've looked around and the only 'solution' I've found is to add zeros to make all the arrays the same length, which would probably defeat the purpose of performance improvement.

Is there any way to use np.apply_along_axis on a non-regularly shaped NumPy array?

Upvotes: 0

Answers (2)

Joe

Reputation: 7131

It depends. Do you know the sizes of the vectors beforehand, or are you appending to a list?

See, e.g., http://stackoverflow.com/a/58085045/7919597

You could, for example, pad the arrays:

import numpy as np

a1 = [1, 2, 3, 4]
a2 = [2, 3, 4, np.nan] # pad with nan
a3 = [4, 5, 6, np.nan] # pad with nan

b = np.stack([a1, a2, a3], axis=0)

print(b)

# you can apply the normal numpy operations on 
# arrays with nan, they usually just result in a nan
# in a resulting array
c = np.diff(b, axis=-1)

print(c)
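For the row-wise sum from the question, one option is to combine this NaN padding with NumPy's NaN-aware reductions, which skip the padding instead of propagating it. This is a minimal sketch, assuming NaN never occurs as real data; the names b, row_sums and row_means are only illustrative.

```python
import numpy as np

# Same NaN-padded layout as above (float dtype is needed to hold NaN)
b = np.array([[1, 2, 3, 4],
              [2, 3, 4, np.nan],
              [4, 5, 6, np.nan]], dtype=float)

# NaN-aware reductions ignore the padded entries
row_sums = np.nansum(b, axis=1)
row_means = np.nanmean(b, axis=1)

print(row_sums)   # [10.  9. 15.]
print(row_means)  # [2.5 3.  5. ]
```

Note that the result is a float array even though the inputs were integers, because NaN forces a float dtype.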

Afterwards you can apply a moving window on each row over the columns.

Have a look at https://stackoverflow.com/a/22621523/7919597, which covers only the 1D case but can give you an idea of how it could work.

It is possible to use a 2d array with only one row as kernel (shape e.g. (1, 3)) with scipy.signal.convolve2d and use the idea above. This is a workaround to get a "row-wise 1D convolution":

from scipy import signal

krnl = np.array([[0, 1, 0]])

d = signal.convolve2d(c, krnl, mode='same')
print(d)
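If SciPy is not available, a row-wise moving window can probably also be built with NumPy alone. The sketch below assumes NumPy 1.20 or newer (for sliding_window_view) and a window length of 2 chosen only for illustration; it takes a NaN-aware mean over each window so the padding does not leak into neighbouring values.

```python
import numpy as np
from numpy.lib.stride_tricks import sliding_window_view

b = np.array([[1, 2, 3, 4],
              [2, 3, 4, np.nan],
              [4, 5, 6, np.nan]], dtype=float)
c = np.diff(b, axis=-1)  # shape (3, 3); padded rows end in NaN

# Windows of length 2 along the last axis: shape (3, 2, 2)
windows = sliding_window_view(c, window_shape=2, axis=-1)

# Mean of each window, ignoring NaN padding inside the window
moving_mean = np.nanmean(windows, axis=-1)

print(windows.shape)  # (3, 2, 2)
print(moving_mean)
```

A window that consists only of NaN would make np.nanmean emit a RuntimeWarning and return NaN, so the window length should stay small relative to the shortest row.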

Upvotes: 1

Eduard Ilyasov

Reputation: 3308

You can transform your initial array of iterables into a regular ndarray by padding the shorter rows with zeros via np.vectorize:

import numpy as np

# dtype=object is required for ragged input on NumPy >= 1.24
a = np.array([[1, 2, 3, 4],
              [2, 3, 4],
              [4, 5, 6]], dtype=object)
max_len = max(len(row) for row in a)  # length of the longest row
# Pad each row on the right with zeros, then trim it to max_len
cust_func = np.vectorize(pyfunc=lambda x: np.pad(array=x,
                                                 pad_width=(0, max_len),
                                                 mode='constant',
                                                 constant_values=(0, 0))[:max_len], otypes=[list])
a_pad = np.stack(cust_func(a))

output:

array([[1, 2, 3, 4],
       [2, 3, 4, 0],
       [4, 5, 6, 0]])
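An alternative way to get the same padded array is to fill it through a boolean mask, which avoids calling a Python function per row. This is a rough sketch, not a benchmarked solution; the names rows, lens, mask and starts are only illustrative, and every row is assumed to be non-empty. For a plain sum, padding can be skipped entirely with np.add.reduceat on the flattened data.

```python
import numpy as np

rows = [[1, 2, 3, 4],
        [2, 3, 4],
        [4, 5, 6]]

lens = np.array([len(r) for r in rows])
flat = np.concatenate(rows)  # all values in one 1D array

# mask[i, j] is True where row i actually has a j-th element
mask = np.arange(lens.max()) < lens[:, None]
padded = np.zeros(mask.shape, dtype=flat.dtype)
padded[mask] = flat  # fills the True positions in row-major order

print(padded.sum(axis=1))  # [10  9 15]

# Without padding: reduce each row's slice of the flat array
starts = np.concatenate(([0], np.cumsum(lens)[:-1]))  # start index of each row
print(np.add.reduceat(flat, starts))  # [10  9 15]
```

Zero padding is harmless for sums, but it would distort functions such as the mean or the minimum, so the mask is worth keeping around for those cases.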

Upvotes: 1
