nicktruesdale

Reputation: 815

Efficiently remove rows/columns of numpy image array

I am trying to remove rows or columns from an image represented by a Numpy array. My image is of type uint16 and 2560 x 2176. As an example, say I want to remove the first 16 columns to make it 2560 x 2160.

I'm a MATLAB-to-Numpy convert, and in MATLAB would use something like:

A = rand(2560, 2176);
A(:, 1:16) = [];

As I understand it, this deletes the columns in place and saves a lot of time by not copying to a new array.

For Numpy, previous posts have used commands like numpy.delete. However, the documentation is clear that this returns a copy, and so I must reassign the copy to A. This seems like it would waste a lot of time copying.

import numpy as np

A = np.random.rand(2560, 2176)
A = np.delete(A, np.r_[:16], 1)

Is this truly as fast as an in-place deletion? I feel I must be missing a better method, or not understanding how Python handles array storage during deletion.
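To make the copy behaviour concrete, here is a small sketch using the image dimensions from the question. The array contents and the names image and trimmed are made up for illustration. It checks that np.delete leaves its input untouched and returns a new array that does not share memory with it.

```python
import numpy as np

# Hypothetical stand-in for the uint16 image described above (2560 x 2176).
image = np.zeros((2560, 2176), dtype=np.uint16)

# np.delete builds and returns a new array; `image` itself is not modified.
trimmed = np.delete(image, np.r_[:16], axis=1)

print(image.shape)    # shape of the original, unchanged
print(trimmed.shape)  # shape with the first 16 columns removed
print(np.shares_memory(image, trimmed))  # whether the two share a buffer
```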

Relevant previous posts:
Removing rows in NumPy efficiently
Documentation for numpy.delete

Upvotes: 2

Views: 7605

Answers (1)

tiago

Reputation: 23492

Why not just take a slice? Here I'm removing the first 3000 columns instead of 16 to make the memory usage clearer:

import numpy as np
a = np.empty((5000, 5000))
a = a[:, 3000:]

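Basic slicing returns a view, so no data is copied; the original buffer stays allocated for as long as the view references it. The array that a refers to is nonetheless smaller, as can be seen: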

In [31]: a = np.zeros((5000, 5000), dtype='d')
In [32]: whos
Variable   Type       Data/Info
-------------------------------
a          ndarray    5000x5000: 25000000 elems, type `float64`, 200000000 bytes (190 Mb)
In [33]: a = a[:, 3000:]
In [34]: whos
Variable   Type       Data/Info
-------------------------------
a          ndarray    5000x2000: 10000000 elems, type `float64`, 80000000 bytes (76 Mb)
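The sizes that whos reports describe only the elements visible through a. As a rough sketch of what happens underneath (the names view and compact are mine, not part of the session above), the slice can be inspected to confirm it is a view on the original buffer, and an explicit .copy() can be taken if the goal is for the larger buffer to be released once nothing else refers to it:

```python
import numpy as np

a = np.zeros((5000, 5000), dtype='d')
view = a[:, 3000:]          # basic slicing returns a view, no data is copied

print(view.base is a)       # the view keeps a reference to the original array
print(view.nbytes)          # bytes covered by the visible elements only
print(view.flags['C_CONTIGUOUS'])  # rows are strided over the old buffer

# An explicit copy owns its own compact buffer, so the original 5000x5000
# buffer can be freed once no other reference to it remains.
compact = view.copy()
print(compact.base is None)
print(np.shares_memory(a, compact))
```

Whether the extra copy is worth it depends on the case: for trimming 16 columns off an image the view costs almost nothing, while keeping a small slice of a very large array alive will pin the whole original in memory.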

For this size of array a slice seems to be about 10,000x faster than your delete option:

%timeit a=np.empty((5000,5000), dtype='d');  a=np.delete(a, np.r_[:3000], 1)
1 loops, best of 3: 404 ms per loop
%timeit a=np.empty((5000,5000), dtype='d');  a=a[:, 3000:]
10000 loops, best of 3: 39.3 us per loop
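The %timeit lines above need IPython and include the cost of np.empty in both measurements. A minimal sketch of the same comparison with the standard timeit module follows; the smaller array size, the helper names with_delete and with_slice, and the repeat count are arbitrary choices of mine, and the numbers will vary by machine.

```python
import timeit
import numpy as np

# Smaller array than in the timings above, to keep the run short.
a = np.zeros((2000, 2000), dtype='d')

def with_delete():
    # Copies the remaining columns into a new array.
    return np.delete(a, np.r_[:1200], 1)

def with_slice():
    # Returns a view on the existing buffer.
    return a[:, 1200:]

# Both approaches select the same columns.
assert np.array_equal(with_delete(), with_slice())

n = 20
t_delete = timeit.timeit(with_delete, number=n) / n
t_slice = timeit.timeit(with_slice, number=n) / n
print(f"np.delete: {t_delete * 1e3:.3f} ms per call")
print(f"slice:     {t_slice * 1e6:.3f} us per call")
```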

Upvotes: 6
