abdul rehman

Reputation: 37

understanding numpy np.tensordot

import numpy as np

arr1 = np.arange(8).reshape(4, 2)
arr2 = np.arange(4, 12).reshape(2, 4)
ans = np.tensordot(arr1, arr2, axes=([1], [0]))
ans2 = np.tensordot(arr1, arr2, axes=([0], [1]))
ans3 = np.tensordot(arr1, arr2, axes=([1, 0], [0, 1]))

I am trying to understand how this tensordot function works. I know that it returns the tensordot product.

But the axes part is a little bit difficult for me to comprehend. What I have observed is that:

for ans, it is like the number of columns in arr1 and the number of rows in arr2 make the final matrix.

for ans2, it is the other way around: the number of columns in arr2 and the number of rows in arr1.

I don't understand axes=([1,0],[0,1]). Let me know if my understanding of ans and ans2 is correct.
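
For reference, a quick check of the shapes (as far as I can tell) shows what I mean:

# shape check of the three results (shapes noted in comments)
print(arr1.shape, arr2.shape)   # (4, 2) (2, 4)
print(ans.shape)                # (4, 4)  from axes=([1],[0])
print(ans2.shape)               # (2, 2)  from axes=([0],[1])
print(ans3.shape)               # ()      a 0-d array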

Upvotes: 0

Views: 531

Answers (2)

hpaulj

Reputation: 231738

You forgot to show the arrays:

In [87]: arr1
Out[87]: 
array([[0, 1],
       [2, 3],
       [4, 5],
       [6, 7]])
In [88]: arr2
Out[88]: 
array([[ 4,  5,  6,  7],
       [ 8,  9, 10, 11]])
In [89]: ans
Out[89]: 
array([[  8,   9,  10,  11],
       [ 32,  37,  42,  47],
       [ 56,  65,  74,  83],
       [ 80,  93, 106, 119]])
In [90]: ans2
Out[90]: 
array([[ 76, 124],
       [ 98, 162]])
In [91]: ans3
Out[91]: array(238)

ans is just the regular dot, matrix product:

In [92]: np.dot(arr1,arr2)
Out[92]: 
array([[  8,   9,  10,  11],
       [ 32,  37,  42,  47],
       [ 56,  65,  74,  83],
       [ 80,  93, 106, 119]])

The dot sum-of-products is performed on ([1],[0]): axis 1 of arr1 and axis 0 of arr2 (the conventional "across the columns, down the rows"). With 2d arrays the "sum across ..." phrasing can be confusing; it's clearer when dealing with 1d or 3d arrays. Here the matching size-2 dimensions are summed away, leaving the (4,4) result.
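
Spelled out as an explicit sum of products (a rough loop equivalent, just for illustration):

# rough loop equivalent of np.tensordot(arr1, arr2, axes=([1],[0])):
# out[i, k] = sum over j of arr1[i, j] * arr2[j, k]
out = np.zeros((4, 4), dtype=int)
for i in range(4):
    for k in range(4):
        out[i, k] = sum(arr1[i, j] * arr2[j, k] for j in range(2))
print(np.array_equal(out, ans))   # True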

ans2 reverses them, summing over the size-4 axes and producing a (2,2):

In [94]: np.dot(arr2,arr1)
Out[94]: 
array([[ 76,  98],
       [124, 162]])

tensordot has just transposed the 2 arrays and performed a regular dot:

In [95]: np.dot(arr1.T,arr2.T)
Out[95]: 
array([[ 76, 124],
       [ 98, 162]])
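
A quick consistency check (illustrative only):

# ans2 equals dot(arr1.T, arr2.T), which is also the transpose of dot(arr2, arr1)
print(np.array_equal(ans2, np.dot(arr1.T, arr2.T)))   # True
print(np.array_equal(ans2, np.dot(arr2, arr1).T))     # True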

ans3 uses a transpose and reshape (ravel) to sum over both axes:

In [98]: np.dot(arr1.ravel(),arr2.T.ravel())
Out[98]: 238
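
Written as an explicit double sum (again just a sketch):

# axes=([1,0],[0,1]) pairs arr1 axis 1 with arr2 axis 0, and arr1 axis 0 with arr2 axis 1,
# so everything is summed: ans3 = sum over i, j of arr1[i, j] * arr2[j, i]
total = sum(arr1[i, j] * arr2[j, i] for i in range(4) for j in range(2))
print(total)   # 238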

In general, tensordot uses a mix of transpose and reshape to reduce the problem to a 2d np.dot problem. It may then reshape and transpose the result.
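
A rough sketch of that reduction for the ans2 case (a simplified illustration, not numpy's actual implementation):

# bring the free axis of arr1 to the front and its summed axis to the end,
# and the summed axis of arr2 to the front and its free axis to the end,
# then it's an ordinary 2d matrix product
a = arr1.transpose(1, 0).reshape(2, 4)
b = arr2.transpose(1, 0).reshape(4, 2)
print(np.array_equal(np.dot(a, b), ans2))   # True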

I find the control over dimensions in einsum to be clearer:

In [99]: np.einsum('ij,jk->ik',arr1,arr2)
Out[99]: 
array([[  8,   9,  10,  11],
       [ 32,  37,  42,  47],
       [ 56,  65,  74,  83],
       [ 80,  93, 106, 119]])
In [100]: np.einsum('ji,kj->ik',arr1,arr2)
Out[100]: 
array([[ 76, 124],
       [ 98, 162]])
In [101]: np.einsum('ij,ji',arr1,arr2)
Out[101]: 238

With the development of einsum and matmul/@, tensordot has become less necessary. It's harder to understand, and doesn't have any speed or flexibility advantages. Don't worry about understanding it.

ans3 is the trace (sum of the diagonal) of the other two results, since the trace of a matrix product sums arr1[i,j]*arr2[j,i] over both indices, which is exactly what the two-axis contraction computes:

In [103]: np.trace(ans)
Out[103]: 238
In [104]: np.trace(ans2)
Out[104]: 238

Upvotes: 1

Jacob Nørgaard

Reputation: 21

As far as I understand it from the tensordot documentation, you are supplying lists of axes in ans, ans2 and ans3 (ans and ans2 just have a single element in each list). These lists specify which axes get summed over. You are right in your assumption about ans and ans2: in ans, the first element is axis 1 of arr1 (the columns of arr1) and the second is axis 0 of arr2 (the rows of arr2). I am not totally sure what to expect from ans3, but I might try running some examples myself and take a look (a small sketch is included after the link below). I hope this gives you a little better understanding.

link: https://numpy.org/doc/stable/reference/generated/numpy.tensordot.html
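
For example, a small sketch one could run to see what ans3 does (my own illustrative check, not taken from the docs):

import numpy as np

arr1 = np.arange(8).reshape(4, 2)
arr2 = np.arange(4, 12).reshape(2, 4)

# axes=([1, 0], [0, 1]) pairs arr1 axis 1 with arr2 axis 0 and arr1 axis 0 with arr2 axis 1;
# both axes of each array are summed over, so the result is a 0-d array (a scalar)
ans3 = np.tensordot(arr1, arr2, axes=([1, 0], [0, 1]))
print(ans3, ans3.shape)   # 238 ()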

Upvotes: 1
