Reputation: 155
I am trying to take the dot product between three numpy arrays. However, I am struggling with wrapping my head around this.
I have two (4,) shaped numpy arrays a
and b
respectively, as well as a numpy array with shape (4, 4, 3), c
.
import numpy as np
a = np.array([0, 1, 2, 3])
b = np.array([[[1, 1, 1], [1, 1, 1], [1, 1, 1], [1, 1, 1]],
[[2, 2, 2], [2, 2, 2], [2, 2, 2], [2, 2, 2]],
[[3, 3, 3], [3, 3, 3], [3, 3, 3], [3, 3, 3]],
[[4, 4, 4], [4, 4, 4], [4, 4, 4], [4, 4, 4]]])
c = np.array([4, 5, 6, 7])
I want to compute the dot product in such a way that my result is a 3-tuple. That is, first dot a
with b
and then dotting with c
, taking transposes if needed. In other words, I want to compute the dot product between a
, b
and c
as if c
was of shape (4, 4)
, but I want a 3-tuple as result.
Reshaping a
and c
, and then computing the dot product:
a = np.reshape(a, (4, 1))
c = np.reshape(c, (4, 1))
tmp = np.dot(a.T, b) # now has shape (1, 4, 3)
result = np.dot(tmp, c)
Ideally, I should now have:
print(result.shape)
>> (1, 1, 3)
but I get the error
ValueError: shapes (1,4,3) and (4,1) not aligned: 3 (dim 2) != 4 (dim 0)
I have also tried using the tensordot function from numpy, but without luck.
Upvotes: 0
Views: 1319
Reputation: 231325
The basic dot(A,B)
rule is: last axis of A with the 2nd to the last of B
In [965]: a.shape
Out[965]: (4,)
In [966]: b.shape
Out[966]: (4, 4, 3)
a
(and c
) is 1d. It's (4,) can dot with the 2nd (4) of b
with:
In [967]: np.dot(a,b).shape
Out[967]: (4, 3)
Using c
in the same on the output produces a (3,) array
In [968]: np.dot(c, np.dot(a,b))
Out[968]: array([360, 360, 360])
This combination may be clearer with the equivalent einsum
:
In [971]: np.einsum('i,jik,j->k',a,b,c)
Out[971]: array([360, 360, 360])
But what if we want a
to act on the 1st axis of b
? With einsum
that's easy to do:
In [972]: np.einsum('i,ijk,j->k',a,b,c)
Out[972]: array([440, 440, 440])
To do the same with the dot
, we could just switch a
and c
:
In [973]: np.dot(a, np.dot(c,b))
Out[973]: array([440, 440, 440])
Or transpose axes of b
:
In [974]: np.dot(c, np.dot(a,b.transpose(1,0,2)))
Out[974]: array([440, 440, 440])
This transposition question would be clearer if a
and c
had different lengths. e.g. A (2,) and (4,) with a (2,4,3) or (4,2,3).
In
tmp = np.dot(a.T, b) # now has shape (1, 4, 3)
you have a (1,4a) dotted with (4,4a,3). The result is (1,4,3). I added the a
to identify when axes were combined.
To apply the (4,1) c
, we have to do the same transpose:
In [977]: np.dot(c[:,None].T, np.dot(a[:,None].T, b))
Out[977]: array([[[360, 360, 360]]])
In [978]: _.shape
Out[978]: (1, 1, 3)
np.dot(c[None,:], np.dot(a[None,:], b))
would do the same without the transposes.
I was hoping numpy would automagically distribute over the last axis. That is, that the dot product would run over the two first axes, if that makes sense.
Given the dot
rule that I cited at the start this does not make sense. But if we transpose b
so the (3) axis is first, it can 'carry that along', using the last and 2nd to the last.
In [986]: b.transpose(2,0,1).shape
Out[986]: (3, 4, 4)
In [987]: np.dot(a, b.transpose(2,0,1)).shape
Out[987]: (3, 4)
In [988]: np.dot(np.dot(a, b.transpose(2,0,1)),c)
Out[988]: array([440, 440, 440])
(4a).(3, 4a, 4c) -> (3, 4c)
(3, 4c). (4c) -> 3
Upvotes: 1
Reputation: 734
You are multiplying (1,4,3) matrix by (4,1) matrix so it is impossible because you have 3 pages of (1,4) matrices in b. If you want to do multiplication of each page of matrix b by c just multiply each page separately.
a = np.array([0, 1, 2, 3])
b = np.array([[[1, 1, 1], [1, 1, 1], [1, 1, 1], [1, 1, 1]],
[[2, 2, 2], [2, 2, 2], [2, 2, 2], [2, 2, 2]],
[[3, 3, 3], [3, 3, 3], [3, 3, 3], [3, 3, 3]],
[[4, 4, 4], [4, 4, 4], [4, 4, 4], [4, 4, 4]]])
c = np.array([4, 5, 6, 7])
a = np.reshape(a, (4, 1))
c = np.reshape(c, (4, 1))
tmp = np.dot(a.T, b) # now has shape (1, 4, 3)
result = np.dot(tmp[:,:,0], c)
for i in range(1,3):
result = np.dstack((result, np.dot(tmp[:,:,i], c)))
print np.shape(result)
So you have result of size (1,1,3)
Upvotes: 0
Reputation: 53029
Not automagical but does the job:
np.einsum('i,ijk,j->k',a,b,c)
# array([440, 440, 440])
This computes d
of shape (3,) such that d_k = sum_{ij} a_i b_{ijk} c_j
.
Upvotes: 0