Reputation: 40918
I'm probably missing something obvious here, but I don't believe np.apply_along_axis
or np.apply_over_axes
is what I'm looking for. Say I have the following two arrays:
arr1 = np.random.randn(10, 5)
arr2 = np.random.randn(10)
And the following function:
def coefs(x, y):
    # the vector of coefficients in a multiple linear regression
    return np.dot(np.linalg.inv(np.dot(x.T, x)), np.dot(x.T, y))
Calling this on arr1
and arr2
works smoothly as it should:
coefs(arr1, arr2)
Out[111]: array([-0.19474836, -0.50797551, 0.82903805, 0.06332607, -0.26985597])
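As an aside, the same coefficients can be obtained with np.linalg.lstsq, which solves the least-squares problem directly instead of forming the explicit inverse (and is the usual recommendation when x.T @ x is ill-conditioned). A minimal sketch:

```python
import numpy as np

arr1 = np.random.randn(10, 5)
arr2 = np.random.randn(10)

def coefs(x, y):
    # normal-equations solution: (X^T X)^{-1} X^T y
    return np.dot(np.linalg.inv(np.dot(x.T, x)), np.dot(x.T, y))

# lstsq solves the same least-squares problem without an explicit inverse
beta, *_ = np.linalg.lstsq(arr1, arr2, rcond=None)
```

Both routes agree (up to floating-point noise) whenever x.T @ x is invertible.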
However, suppose instead of the 1- or 2d arrays I have two 3d arrays:
arr3 = np.array([arr1[:-1], arr1[1:]])
arr4 = np.array([arr2[:-1], arr2[1:]])
As expected, if I apply the function here I get
coefs(arr3, arr4)
Traceback (most recent call last):
File "<ipython-input-127-4a3e7df02cda>", line 1, in <module>
coefs(arr3, arr4)
File "<ipython-input-124-7532b8516784>", line 2, in coefs
return np.dot(np.linalg.inv(np.dot(x.T, x)), np.dot(x.T, y))
ValueError: shapes (5,9,2) and (2,9,5) not aligned: 2 (dim 2) != 9 (dim 1)
...because NumPy is treating each 3d array as a single object, as it should. What I want instead is to apply the coefs()
function to each of the 2 elements along axis 0 of the arrays, pairwise. Here's a crude way of doing this:
tgt = []
for i, j in zip(arr3, arr4):
    tgt.append(coefs(i, j))
np.array(tgt)
Out[136]:
array([[-0.34328006, -0.99116672, 1.42757897, -0.06687851, -0.44669182],
[ 0.44494495, -0.58017705, 0.75825944, 0.18795889, 0.4560851 ]])
My question is: is there a more efficient and Pythonic way of doing this than zipping and iterating as above? Basically, given two input arrays of shape (2, n, k) and (2, n), I want the returned array to have shape (2, k). Thanks.
Upvotes: 1
Views: 1775
Reputation: 221674
For generic-shaped 3D and 2D arrays arr3
and arr4
, we can use some np.einsum
magic to get a vectorized solution, like so -
dot1 = np.einsum('ijk,ijl->ikl', arr3, arr3)   # stacked x.T @ x
dot2 = np.einsum('ijk,ij->ik', arr3, arr4)     # stacked x.T @ y
inv1 = np.linalg.inv(dot1)                     # stacked inverses
tgt_out = np.einsum('ijk,ij->ik', inv1, dot2)  # stacked inv @ (x.T @ y)
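If we'd rather not compute explicit inverses at all, np.linalg.solve also broadcasts over stacked matrices, so we can solve the stacked normal equations directly. A sketch (shapes chosen to match the question; the trailing None turns dot2 into a stack of column matrices so the broadcasting is unambiguous):

```python
import numpy as np

arr3 = np.random.rand(2, 9, 5)
arr4 = np.random.rand(2, 9)

dot1 = np.einsum('ijk,ijl->ikl', arr3, arr3)  # stacked x.T @ x, shape (2, 5, 5)
dot2 = np.einsum('ijk,ij->ik', arr3, arr4)    # stacked x.T @ y, shape (2, 5)

# solve each k-by-k system instead of inverting it;
# dot2[..., None] is a stack of (5, 1) column matrices
tgt_out = np.linalg.solve(dot1, dot2[..., None])[..., 0]
```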
Runtime test
Approaches -
def org_app(arr3, arr4):
    tgt = []
    for i, j in zip(arr3, arr4):
        tgt.append(coefs(i, j))
    return np.array(tgt)

def einsum_app(arr3, arr4):
    dot1 = np.einsum('ijk,ijl->ikl', arr3, arr3)
    dot2 = np.einsum('ijk,ij->ik', arr3, arr4)
    inv1 = np.linalg.inv(dot1)
    return np.einsum('ijk,ij->ik', inv1, dot2)
Timings and verification -
In [215]: arr3 = np.random.rand(50,50,50)
...: arr4 = np.random.rand(50,50)
...:
In [216]: np.allclose(org_app(arr3, arr4), einsum_app(arr3, arr4))
Out[216]: True
In [217]: %timeit org_app(arr3, arr4)
100 loops, best of 3: 4.82 ms per loop
In [218]: %timeit einsum_app(arr3, arr4)
100 loops, best of 3: 19.7 ms per loop
Doesn't look like einsum
is giving us any benefit here. This is expected, because einsum
is essentially up against np.dot
, which is much better at sum-reduction
, even when np.dot is used inside a loop. The only case in which einsum
can give np.dot
a fight is when we loop enough times, i.e. when the first axis of the input arrays is long. Let's increase its length and test again -
In [219]: arr3 = np.random.rand(1000,10,10)
...: arr4 = np.random.rand(1000,10)
...:
In [220]: %timeit org_app(arr3, arr4)
10 loops, best of 3: 23 ms per loop
In [221]: %timeit einsum_app(arr3, arr4)
100 loops, best of 3: 9.1 ms per loop
einsum
is definitely winning this one!
This related post
on the fight between np.einsum
and np.dot
is worth a look.
Also, note that if we do need the loop-based approach, we should preallocate the output array and assign the values from coefs
into it rather than appending, as appending is slow.
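A minimal sketch of that preallocation pattern (the array sizes are just for illustration) -

```python
import numpy as np

def coefs(x, y):
    return np.dot(np.linalg.inv(np.dot(x.T, x)), np.dot(x.T, y))

arr3 = np.random.rand(1000, 10, 10)
arr4 = np.random.rand(1000, 10)

# allocate the (n, k) output once, then fill each row in place
tgt = np.empty((arr3.shape[0], arr3.shape[2]))
for i, (x, y) in enumerate(zip(arr3, arr4)):
    tgt[i] = coefs(x, y)
```

This avoids building a Python list and converting it at the end, which matters once the first axis gets long.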
Upvotes: 1