Reputation: 108
Let's say I have a matrix X (1000x10) and a matrix Y (20x10). I want to efficiently add Y to every (20x10) block of X (so 50 blocks in total). Is there an efficient way to do this with NumPy? I don't want to use np.repeat, as the original matrices are huge and I want to avoid unnecessarily duplicating Y. Any ideas?
Upvotes: 0
Views: 329
Reputation: 36765
You can leverage argument list unpacking, NumPy broadcasting, and the fact that ndarray.reshape() returns a view (whenever the memory layout allows it, as it does for a C-contiguous X) to perform the operation:
tmp = X.reshape(-1, *Y.shape)
tmp += Y
No additional data is allocated, and afterwards X itself holds the result. This requires the number of rows in X to be a multiple of the number of rows in Y.
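To make the view behaviour concrete, here is a minimal sketch under the question's assumed shapes (1000x10 and 20x10). The `expected` array is built with np.tile purely as a reference for checking and is not part of the technique itself.

```python
import numpy as np

X = np.arange(1000 * 10, dtype=float).reshape(1000, 10)
Y = np.ones((20, 10))

# Reference result, built the memory-hungry way only so we can verify.
expected = X + np.tile(Y, (50, 1))

# Reinterpret X as a stack of 50 blocks, each shaped like Y.
blocks = X.reshape(-1, *Y.shape)
assert blocks.shape == (50, 20, 10)
assert np.shares_memory(blocks, X)  # a view, not a copy

# Broadcasting adds Y to every block; the write goes straight into X.
blocks += Y
assert np.array_equal(X, expected)
```

If X were not contiguous, reshape could silently return a copy and X would stay unchanged, so the np.shares_memory check is a cheap safeguard.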
Upvotes: 2
Reputation: 11942
You can use np.tile to "expand" (more accurately, tile) the smaller array to match the size of the larger array, for example:
import numpy as np
x = np.zeros([1000, 10])
y = np.ones([20, 10])
new_x = x + np.tile(y, (50, 1))  # repeat y 50 times along the rows
This creates a large temporary array in memory to add to x, which is discarded immediately afterwards, so whether it is viable depends on your memory capacity and the size of the array. I believe it is the most efficient option in terms of CPU usage and readability.
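If a new array is wanted but the tiled temporary is a concern, one possible middle ground is to broadcast against a reshaped view instead of calling np.tile. This is a sketch under the same assumed shapes, not a benchmarked recommendation; `blocks` is just an illustrative name.

```python
import numpy as np

x = np.zeros((1000, 10))
y = np.ones((20, 10))

# View of x as 50 blocks shaped like y; no data is copied here.
blocks = x.reshape(-1, *y.shape)

# Broadcasting y over the leading axis allocates only the result array,
# not an extra tiled copy of y. x itself is left untouched.
new_x = (blocks + y).reshape(x.shape)
```

The only large allocation here is the result itself, which any out-of-place addition needs anyway.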
The other option is, of course, to loop over the larger array and add the smaller array to each block of it (50 times in this case). This is slower in terms of CPU, but lighter on memory.
An example of the 2nd option:
for i in range(0, len(x), 20):
    x[i:i+20, :] += y  # add y to the block in place (plain = would overwrite it)
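For reuse, the loop can be wrapped in a small function that takes the block height from y rather than hard-coding 20. `add_blockwise` is a hypothetical helper name for illustration, not a NumPy function, and the shape check is an added assumption about what input should be accepted.

```python
import numpy as np

def add_blockwise(x, y):
    """Add y in place to each consecutive block of x that has y.shape[0] rows."""
    step = y.shape[0]
    if x.shape[0] % step != 0 or x.shape[1:] != y.shape[1:]:
        raise ValueError("x must consist of a whole number of y-shaped blocks")
    for start in range(0, x.shape[0], step):
        # Basic slicing returns a view, so += writes directly into x.
        x[start:start + step] += y
    return x

x = np.zeros((1000, 10))
y = np.arange(200, dtype=float).reshape(20, 10)
add_blockwise(x, y)
```

Each iteration touches only one block, so no temporary the size of x is created.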
Upvotes: 0