Reputation: 849
The following example code computes array B from A:
import numpy as np
idx1 = np.array([
[3, 0, 0],
[2, 1, 0],
[2, 0, 1],
[1, 2, 0],
[1, 1, 1],
[1, 0, 2],
[0, 3, 0],
[0, 2, 1],
[0, 1, 2],
[0, 0, 3]])
idx2 = np.arange(3)
A = np.arange(10*4*3).reshape(10, 4, 3)
B = np.prod(A[:, idx1, idx2], axis=2)
Notice the line
B = np.prod(A[:, idx1, idx2], axis=2)
Is this line memory efficient? Or will numpy generate an internal array for A[:, idx1, idx2]?
One can imagine that if len(A) is very large and numpy generates an internal array for A[:, idx1, idx2], this is not memory efficient. Is there a better way to do this?
Upvotes: 1
Views: 75
Reputation: 231738
This expression is parsed and evaluated by the Python interpreter:
B = np.prod(A[:, idx1, idx2], axis=2)
First it does:
temp = A[:, idx1, idx2] # expands to:
temp = A.__getitem__((slice(None), idx1, idx2))
Since idx1 and idx2 are arrays, this is advanced indexing, and temp is a copy, not a view.
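The copy-versus-view distinction can be checked directly. The following is a minimal sketch that reuses the arrays from the question; the `view` variable is introduced here only for contrast and is not part of the original code.

```python
import numpy as np

idx1 = np.array([[3, 0, 0], [2, 1, 0], [2, 0, 1], [1, 2, 0], [1, 1, 1],
                 [1, 0, 2], [0, 3, 0], [0, 2, 1], [0, 1, 2], [0, 0, 3]])
idx2 = np.arange(3)
A = np.arange(10 * 4 * 3).reshape(10, 4, 3)

temp = A[:, idx1, idx2]   # advanced indexing: a new array is allocated
view = A[:, 1:3, :]       # basic slicing, shown for contrast: a view of A

print(temp.shape)                  # (10, 10, 3)
print(np.shares_memory(temp, A))   # False
print(np.shares_memory(view, A))   # True
print(temp.nbytes, A.nbytes)       # the temporary is larger than A itself here
```

With these index arrays the temporary holds 10*10*3 elements against 10*4*3 in A, so it grows with len(A) just as A does.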
Next the interpreter executes:
np.prod(temp, axis=2)
That is, it passes the temporary array to the prod function, which returns an array that is then assigned to the variable B.
I don't know how much buffering prod does. I can imagine it setting up an nditer (the C-API version) that takes two operand arrays: temp and an output of the right shape (temp.shape[:-1], assuming the reduction is over the last dimension of temp). See the reduction section of the docs that I cited in The `out` arguments in `numpy.einsum` can not work as expected.
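To make the two-operand reduction idea concrete, here is a rough Python-level sketch using np.nditer with an allocated output operand. It only illustrates the iterator pattern; it is not a claim about what np.prod does internally, and it is far slower than np.prod.

```python
import numpy as np

idx1 = np.array([[3, 0, 0], [2, 1, 0], [2, 0, 1], [1, 2, 0], [1, 1, 1],
                 [1, 0, 2], [0, 3, 0], [0, 2, 1], [0, 1, 2], [0, 0, 3]])
idx2 = np.arange(3)
A = np.arange(10 * 4 * 3).reshape(10, 4, 3)
temp = A[:, idx1, idx2]   # the temporary still exists; only the reduction is shown

# Two operands: the input and an output that the iterator allocates.
# op_axes maps the output onto the first two axes and marks the last
# axis (-1) as the reduction axis.
it = np.nditer(
    [temp, None],
    flags=["reduce_ok"],
    op_flags=[["readonly"], ["readwrite", "allocate"]],
    op_axes=[None, [0, 1, -1]],
)
with it:
    it.operands[1][...] = 1        # multiplicative identity
    for x, y in it:
        y[...] *= x                # accumulate the product in place
    result = it.operands[1]

print(result.shape)
print(np.array_equal(result, np.prod(temp, axis=2)))
```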
In sum, when Python evaluates a function call, it first evaluates all the arguments and then passes them to the function. Evaluation of lists can be delayed by using generators, but there isn't an equivalent for numpy arrays.
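Since the temporary cannot be made lazy, one way to bound its size is to process A in blocks along its first axis. The sketch below assumes idx1 is 2-D and idx2 broadcasts against its last axis, as in the question; `prod_in_chunks` and its `chunk` parameter are hypothetical names, not numpy API.

```python
import numpy as np

def prod_in_chunks(A, idx1, idx2, chunk=1024):
    """Hypothetical helper: same result as np.prod(A[:, idx1, idx2], axis=2),
    but only `chunk` rows of A are expanded into a temporary at a time."""
    # Probe with at most one row to learn the dtype that np.prod returns.
    out_dtype = np.prod(A[:1, idx1, idx2], axis=2).dtype
    B = np.empty((A.shape[0], idx1.shape[0]), dtype=out_dtype)
    for start in range(0, A.shape[0], chunk):
        stop = start + chunk                  # slicing clips past the end
        temp = A[start:stop, idx1, idx2]      # bounded temporary
        B[start:stop] = np.prod(temp, axis=2)
    return B

idx1 = np.array([[3, 0, 0], [2, 1, 0], [2, 0, 1], [1, 2, 0], [1, 1, 1],
                 [1, 0, 2], [0, 3, 0], [0, 2, 1], [0, 1, 2], [0, 0, 3]])
idx2 = np.arange(3)
A = np.arange(10 * 4 * 3).reshape(10, 4, 3)

B_direct = np.prod(A[:, idx1, idx2], axis=2)
B_chunked = prod_in_chunks(A, idx1, idx2, chunk=3)
print(np.array_equal(B_direct, B_chunked))
```

The peak extra memory is then roughly chunk * idx1.size elements rather than len(A) * idx1.size, at the cost of a Python-level loop over blocks.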
Upvotes: 2