orange

Reputation: 8090

Leaving dimension unaffected with numpy tensordot

I've used np.tensordot in the past without any problems, but in my current example I struggle to understand the result.

For np.tensordot(d * y, r, axes=((1, 2, 3), (2, 3, 4))).shape, I would expect a shape of (6, 5), but instead I get (6, 6, 5). If I run tensordot 6 times along axis 0, I do get the expected result, but I'd rather have tensordot do this for me in one call. What am I doing wrong?

>>> import numpy as np
>>> d = np.random.rand(6, 7, 1, 2)
>>> y = np.random.rand(6, 7, 1, 2)
>>> r = np.random.rand(6, 5, 7, 1, 2) > 0.5
>>> 
>>> np.tensordot(d * y, r, axes=((1, 2, 3), (2, 3, 4))).shape
(6, 6, 5)
>>> np.tensordot((d * y)[0], r[0], axes=((0, 1, 2), (1, 2, 3))).shape
(5,)
>>> np.tensordot((d * y)[1], r[1], axes=((0, 1, 2), (1, 2, 3))).shape
(5,)
>>> np.tensordot((d * y)[2], r[2], axes=((0, 1, 2), (1, 2, 3))).shape
(5,)
...
>>> np.tensordot((d * y)[5], r[5], axes=((0, 1, 2), (1, 2, 3))).shape
(5,)

Upvotes: 0

Views: 458

Answers (2)

hpaulj

Reputation: 231385

Consider a simpler case:

In [709]: d=np.ones((6,2));
In [710]: np.tensordot(d,d,axes=(1,1)).shape
Out[710]: (6, 6)

This is equivalent to:

In [712]: np.einsum('ij,kj->ik',d,d).shape
Out[712]: (6, 6)

This isn't ij,ij->i. It's an outer product on the unlisted axes, not an element-by-element one.

You have shapes (6, 7, 1, 2) and (6, 5, 7, 1, 2) and want to sum over (7, 1, 2). tensordot is doing an outer product between the remaining (6,) and (6, 5) axes.
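
To make that concrete, here is a small sketch (using the shapes from the question) of the einsum expression this tensordot call is equivalent to:

import numpy as np

d = np.random.rand(6, 7, 1, 2)
y = np.random.rand(6, 7, 1, 2)
r = np.random.rand(6, 5, 7, 1, 2) > 0.5

# tensordot contracts (7, 1, 2) and forms an outer product of the
# leftover axes: (6,) from d * y and (6, 5) from r
td = np.tensordot(d * y, r, axes=((1, 2, 3), (2, 3, 4)))
es = np.einsum('ilmn,jklmn->ijk', d * y, r)
assert td.shape == es.shape == (6, 6, 5)
assert np.allclose(td, es)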

np.einsum('i...,ij...->ij',d,r) would do, I think, what you want.

Under the covers, tensordot reshapes and swaps axes so that the problem becomes a 2d np.dot call. Then it reshapes and swaps back as needed.
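
A rough sketch of that reduction on the question's shapes (the exact axis bookkeeping inside tensordot may differ, but the result is the same):

import numpy as np

dy = np.random.rand(6, 7, 1, 2)     # stands in for d * y
r = np.random.rand(6, 5, 7, 1, 2)   # float stand-in for the boolean r

# push the contracted axes together and flatten them into one
# dimension of size 7 * 1 * 2 == 14; the problem is then one np.dot
lhs = dy.reshape(6, 14)                            # (free 6, contracted 14)
rhs = r.transpose(2, 3, 4, 0, 1).reshape(14, 30)   # (contracted 14, free 6 * 5)
manual = np.dot(lhs, rhs).reshape(6, 6, 5)
assert np.allclose(manual, np.tensordot(dy, r, axes=((1, 2, 3), (2, 3, 4))))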


Correction: I can't use an ellipsis for the 'dotted' dimensions; spelling the axes out works:

In [726]: np.einsum('aijk,abijk->ab',d,r).shape
Out[726]: (6, 5)

And the equivalent broadcast-and-sum method:

In [729]: (d[:,None,...]*r).sum(axis=(2,3,4)).shape
Out[729]: (6, 5)

Timings:

In [734]: timeit [np.tensordot(d[i],r[i], axes=((0, 1, 2), (1, 2, 3))) for i in 
     ...: range(6)]
145 µs ± 514 ns per loop (mean ± std. dev. of 7 runs, 10000 loops each)
In [735]: timeit np.einsum('aijk,abijk->ab',d,r)
7.22 µs ± 34.8 ns per loop (mean ± std. dev. of 7 runs, 100000 loops each)
In [736]: timeit (d[:,None,...]*r).sum(axis=(2,3,4))
16.6 µs ± 84.1 ns per loop (mean ± std. dev. of 7 runs, 100000 loops each)

Another solution uses the @ (matmul) operator, flattening the three contracted axes into one of size 7 * 1 * 2 == 14 and batching over the shared first axis:

In [747]: timeit np.squeeze(d.reshape(6,1,14)@r.reshape(6,5,14).transpose(0,2,1))
11.4 µs ± 28.1 ns per loop (mean ± std. dev. of 7 runs, 100000 loops each)
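
A quick sanity check (a sketch with random stand-ins for the question's arrays) that the matmul route agrees with the einsum result:

import numpy as np

d = np.random.rand(6, 7, 1, 2)
r = np.random.rand(6, 5, 7, 1, 2) > 0.5

via_matmul = np.squeeze(d.reshape(6, 1, 14) @ r.reshape(6, 5, 14).transpose(0, 2, 1))
via_einsum = np.einsum('aijk,abijk->ab', d, r)
assert via_matmul.shape == (6, 5)
assert np.allclose(via_matmul, via_einsum)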

Upvotes: 1

Rufflewind

Reputation: 8956

In tensordot, every axis is either:

  • summed over (contracted) and thus eliminated from the final result, or
  • unaffected (and unconstrained) and appears exactly once in the summation.

So when you write tensordot(d * y, r, axes=((1, 2, 3), (2, 3, 4))), you are calculating:

T[i j k] = ∑[l m n] dy[i l m n] r[j k l m n]

where dy ≡ d * y. What you want to calculate is

T[i k] = ∑[l m n] dy[i l m n] r[i k l m n]

Notice that i appears twice, yet isn't being summed over. This means i is actually constrained here (think of it as an implicit Kronecker delta). Hence, this isn't something that tensordot can do all by itself.
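
To see the implicit delta concretely: the (6, 5) array you want is the i == j "diagonal" of the (6, 6, 5) array tensordot returns. A minimal sketch using the question's shapes:

import numpy as np

dy = np.random.rand(6, 7, 1, 2)          # stands in for d * y
r = np.random.rand(6, 5, 7, 1, 2) > 0.5

full = np.tensordot(dy, r, axes=((1, 2, 3), (2, 3, 4)))   # T[i j k], shape (6, 6, 5)
want = np.einsum('ilmn,iklmn->ik', dy, r)                 # T[i k], shape (6, 5)
# the implicit Kronecker delta picks out the i == j entries
assert np.allclose(full[np.arange(6), np.arange(6)], want)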

The simplest way is to use einsum and just explicitly declare what you want:

np.einsum("i l m n, i k l m n -> i k", d * y, r)

Since einsum can see the entire expression you're trying to calculate, it should be able to find a relatively optimal way to perform this calculation.
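
As a side note, einsum also accepts an optimize keyword that lets it search for a cheaper contraction order; for operands this small it makes little difference, but it can help at scale:

np.einsum("i l m n, i k l m n -> i k", d * y, r, optimize=True)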

Upvotes: 1
