Reputation: 815
I am trying to remove rows or columns from an image represented by a Numpy array. My image is of type uint16 and 2560 x 2176. As an example, say I want to remove the first 16 columns to make it 2560 x 2160.
I'm a MATLAB-to-Numpy convert, and in MATLAB would use something like:
A = rand(2560, 2176);
A(:, 1:16) = [];
As I understand, this deletes the columns in place and saves a lot of time by not copying to a new array.
For Numpy, previous posts have used commands like numpy.delete. However, the documentation is clear that this returns a copy, and so I must reassign the copy to A. This seems like it would waste a lot of time copying.
import numpy as np
A = np.random.rand(2560, 2176)
A = np.delete(A, np.r_[:16], 1)
Is this truly as fast as an in-place deletion? I feel I must be missing a better method, or not understanding how Python handles array storage during deletion.
Relevant previous posts:
Removing rows in NumPy efficiently
Documentation for numpy.delete
Upvotes: 2
Views: 7605
Reputation: 23492
Why not just do a slice? Here I'm removing the first 3000 columns instead of 16 to make the difference in memory usage clearer:
import numpy as np
a = np.empty((5000, 5000))
a = a[:, 3000:]
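The slice is a view: no data is copied, and the original 200 MB buffer stays allocated behind it unless you make a copy with a[:, 3000:].copy(). The array that a refers to does report a smaller size, as can be seen: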
In [31]: a = np.zeros((5000, 5000), dtype='d')
In [32]: whos
Variable Type Data/Info
-------------------------------
a ndarray 5000x5000: 25000000 elems, type `float64`, 200000000 bytes (190 Mb)
In [33]: a = a[:, 3000:]
In [34]: whos
Variable Type Data/Info
-------------------------------
a ndarray 5000x2000: 10000000 elems, type `float64`, 80000000 bytes (76 Mb)
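To check what the slice actually holds on to, something like the following sketch may help. It uses a smaller 500 x 500 array than the session above (my choice, just to keep it light), and the names view and compact are only for illustration.

```python
import numpy as np

# Smaller than the 5000 x 5000 array above, purely to keep the example light.
a = np.zeros((500, 500), dtype='d')

# Basic slicing returns a view: no data is copied.
view = a[:, 300:]

print(view.shape)               # shape of the sliced array
print(view.nbytes)              # bytes covered by the view's own elements
print(view.base is a)           # the view keeps a reference to the full array
print(np.shares_memory(a, view))

# An explicit copy owns its own, contiguous buffer and no longer
# depends on the original array.
compact = view.copy()
print(np.shares_memory(a, compact))
print(compact.flags['C_CONTIGUOUS'])
```

If the goal is to actually give the memory back, the copy (followed by dropping every reference to the original) is what does it; if the goal is just speed, the view is enough.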
For this size of array a slice seems to be about 10,000x faster than your delete option:
%timeit a=np.empty((5000,5000), dtype='d'); a=np.delete(a, np.r_[:3000], 1)
1 loops, best of 3: 404 ms per loop
%timeit a=np.empty((5000,5000), dtype='d'); a=a[:, 3000:]
10000 loops, best of 3: 39.3 us per loop
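Outside IPython, a rough version of the same comparison can be run with the standard timeit module. This is only a sketch: the 1000 x 1000 shape, the cut-off n and the repeat count are arbitrary choices of mine, and the absolute numbers will vary by machine. It also times slice-plus-copy, since that is the fairer comparison with np.delete when an independent array is needed.

```python
import timeit

import numpy as np

a = np.random.rand(1000, 1000)
n = 600  # number of leading columns to drop (arbitrary, for illustration)

# Both approaches should produce the same values.
deleted = np.delete(a, np.r_[:n], 1)
sliced = a[:, n:]
print(np.array_equal(deleted, sliced))

# Time each approach; the slice only builds a new view object,
# while np.delete and .copy() both move data.
t_delete = timeit.timeit(lambda: np.delete(a, np.r_[:n], 1), number=20)
t_slice = timeit.timeit(lambda: a[:, n:], number=20)
t_copy = timeit.timeit(lambda: a[:, n:].copy(), number=20)

print(f"np.delete:    {t_delete:.6f} s")
print(f"slice (view): {t_slice:.6f} s")
print(f"slice + copy: {t_copy:.6f} s")
```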
Upvotes: 6