abdul rehman

Reputation: 37

understanding numpy np.tensordot

import numpy as np

arr1 = np.arange(8).reshape(4, 2)
arr2 = np.arange(4, 12).reshape(2, 4)
ans = np.tensordot(arr1, arr2, axes=([1], [0]))
ans2 = np.tensordot(arr1, arr2, axes=([0], [1]))
ans3 = np.tensordot(arr1, arr2, axes=([1, 0], [0, 1]))

I am trying to understand how this tensordot function works. I know that it returns the tensordot product.

But the axes part is a little bit difficult for me to comprehend. What I have observed is that:

for ans, it is like the number of columns in arr1 and the number of rows in arr2 make the final matrix.

for ans2, it is the other way around: the number of columns in arr2 and the number of rows in arr1.

I don't understand axes=([1,0],[0,1]). Let me know if my understanding of ans and ans2 is correct.
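
For reference, a quick check of the shapes (as far as I can tell) shows what I mean:

# shape check of the three results (shapes noted in comments)
print(arr1.shape, arr2.shape)   # (4, 2) (2, 4)
print(ans.shape)                # (4, 4)  from axes=([1],[0])
print(ans2.shape)               # (2, 2)  from axes=([0],[1])
print(ans3.shape)               # ()      a 0-d array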

Upvotes: 0

Views: 531

Answers (2)

hpaulj

Reputation: 231738

You forgot to show the arrays:

In [87]: arr1
Out[87]: 
array([[0, 1],
       [2, 3],
       [4, 5],
       [6, 7]])
In [88]: arr2
Out[88]: 
array([[ 4,  5,  6,  7],
       [ 8,  9, 10, 11]])
In [89]: ans
Out[89]: 
array([[  8,   9,  10,  11],
       [ 32,  37,  42,  47],
       [ 56,  65,  74,  83],
       [ 80,  93, 106, 119]])
In [90]: ans2
Out[90]: 
array([[ 76, 124],
       [ 98, 162]])
In [91]: ans3
Out[91]: array(238)

ans is just the regular dot, matrix product:

In [92]: np.dot(arr1,arr2)
Out[92]: 
array([[  8,   9,  10,  11],
       [ 32,  37,  42,  47],
       [ 56,  65,  74,  83],
       [ 80,  93, 106, 119]])

The dot sum-of-products is performed on ([1],[0]): axis 1 of arr1 and axis 0 of arr2 (the conventional "across the columns, down the rows"). With 2d arrays the "sum across ..." phrasing can be confusing; it's clearer when dealing with 1d or 3d arrays. Here the matching size-2 dimensions are summed away, leaving the (4,4) result.
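
Spelled out as an explicit sum of products (a rough loop equivalent, just for illustration):

# rough loop equivalent of np.tensordot(arr1, arr2, axes=([1],[0])):
# out[i, k] = sum over j of arr1[i, j] * arr2[j, k]
out = np.zeros((4, 4), dtype=int)
for i in range(4):
    for k in range(4):
        out[i, k] = sum(arr1[i, j] * arr2[j, k] for j in range(2))
print(np.array_equal(out, ans))   # True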

ans2 reverses them, summing over the size-4 axes and producing a (2,2):

In [94]: np.dot(arr2,arr1)
Out[94]: 
array([[ 76,  98],
       [124, 162]])

tensordot has just transposed the 2 arrays and performed a regular dot:

In [95]: np.dot(arr1.T,arr2.T)
Out[95]: 
array([[ 76, 124],
       [ 98, 162]])
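
A quick consistency check (illustrative only):

# ans2 equals dot(arr1.T, arr2.T), which is also the transpose of dot(arr2, arr1)
print(np.array_equal(ans2, np.dot(arr1.T, arr2.T)))   # True
print(np.array_equal(ans2, np.dot(arr2, arr1).T))     # True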

ans3 uses a transpose and reshape (ravel) to sum over both axes:

In [98]: np.dot(arr1.ravel(),arr2.T.ravel())
Out[98]: 238
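
Written as an explicit double sum (again just a sketch):

# axes=([1,0],[0,1]) pairs arr1 axis 1 with arr2 axis 0, and arr1 axis 0 with arr2 axis 1,
# so everything is summed: ans3 = sum over i, j of arr1[i, j] * arr2[j, i]
total = sum(arr1[i, j] * arr2[j, i] for i in range(4) for j in range(2))
print(total)   # 238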

In general, tensordot uses a mix of transpose and reshape to reduce the problem to a 2d np.dot problem. It may then reshape and transpose the result.
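
A rough sketch of that reduction for the ans2 case (a simplified illustration, not numpy's actual implementation):

# bring the free axis of arr1 to the front and its summed axis to the end,
# and the summed axis of arr2 to the front and its free axis to the end,
# then it's an ordinary 2d matrix product
a = arr1.transpose(1, 0).reshape(2, 4)
b = arr2.transpose(1, 0).reshape(4, 2)
print(np.array_equal(np.dot(a, b), ans2))   # True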

I find the control over dimensions in einsum to be clearer:

In [99]: np.einsum('ij,jk->ik',arr1,arr2)
Out[99]: 
array([[  8,   9,  10,  11],
       [ 32,  37,  42,  47],
       [ 56,  65,  74,  83],
       [ 80,  93, 106, 119]])
In [100]: np.einsum('ji,kj->ik',arr1,arr2)
Out[100]: 
array([[ 76, 124],
       [ 98, 162]])
In [101]: np.einsum('ij,ji',arr1,arr2)
Out[101]: 238

With the development of einsum and matmul/@, tensordot has become less necessary. It's harder to understand, and doesn't have any speed or flexibility advantages. Don't worry about understanding it.

ans3 is the trace (sum of the diagonal) of the other two results, since the trace of a matrix product sums arr1[i,j]*arr2[j,i] over both indices, which is exactly what the two-axis contraction computes:

In [103]: np.trace(ans)
Out[103]: 238
In [104]: np.trace(ans2)
Out[104]: 238

Upvotes: 1

Jacob Nørgaard

Reputation: 21

As far as I understand it from the tensordot documentation, you are supplying lists of axes in ans, ans2 and ans3 (ans and ans2 just have a single element in each list). These lists specify which axes get summed over. You are right in your assumption about ans and ans2: in ans, the first element is axis 1 of arr1 (the columns of arr1) and the second is axis 0 of arr2 (the rows of arr2). I am not totally sure what to expect from ans3, but I might try running some examples myself and take a look (a small sketch is included after the link below). I hope this gives you a little better understanding.

link: https://numpy.org/doc/stable/reference/generated/numpy.tensordot.html
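
For example, a small sketch one could run to see what ans3 does (my own illustrative check, not taken from the docs):

import numpy as np

arr1 = np.arange(8).reshape(4, 2)
arr2 = np.arange(4, 12).reshape(2, 4)

# axes=([1, 0], [0, 1]) pairs arr1 axis 1 with arr2 axis 0 and arr1 axis 0 with arr2 axis 1;
# both axes of each array are summed over, so the result is a 0-d array (a scalar)
ans3 = np.tensordot(arr1, arr2, axes=([1, 0], [0, 1]))
print(ans3, ans3.shape)   # 238 ()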

Upvotes: 1
