Reputation: 4270
I have a number of images that I want to crop and then resize. To help with this I have written two helper functions:
import numpy as np
from PIL import Image

def crop_images(images_data):
    # Crop box is (left, upper, right, lower) in pixels.
    cropped_images = []
    for image_data in images_data:
        image = Image.fromarray(image_data)
        cropped_image = np.asarray(image.crop((25, 40, 275, 120)))
        cropped_images.append(cropped_image)
    return np.array(cropped_images)

def resize_images(images_data):
    # images_data has shape (N, height, width[, channels]).
    resized_images = []
    width, height = images_data.shape[2], images_data.shape[1]
    resized_width, resized_height = width // 2, height // 2
    for image_data in images_data:
        image = Image.fromarray(image_data)
        # Image.LANCZOS is the current name for the removed Image.ANTIALIAS.
        image = image.resize((resized_width, resized_height), Image.LANCZOS)
        resized_images.append(np.asarray(image))
    return np.array(resized_images)
Then I would just chain the two functions together to process my images like:
resize_images(crop_images(images_data))
But I was wondering whether there is a way to vectorize these operations, since I understand that NumPy code should ideally use vectorized operations because they are faster.
Upvotes: 2
Views: 922
Reputation: 231335
This is a higher level of iteration - over image arrays - where the usual talk about 'vectorizing' is not as applicable.
Image arrays tend to have a shape like (400,400,3) or bigger. You don't want to iterate over one of those 400-long axes if you don't have to. So 'vectorizing' operations on image arrays makes a lot of sense.
But if you are processing 100 of these images, a loop over images isn't so bad. The only way to 'vectorize' is to assemble them into a larger array (N, 400, 400, 3) and find expressions that work on the 4d array, or on slices of it. It's tempting to go that route if N is 1000 or more, but for an array that big, memory management issues start chewing into any speed gains.
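As a rough sketch of what "expressions that work on 4d" could look like for the question's two steps, the crop maps directly onto a slice of an (N, H, W, C) array. The resize does not have an equally direct NumPy equivalent, so the sketch below swaps in 2x2 block averaging (a plain box filter), which is not the same as Pillow's Lanczos resampling. The function names `crop_batch` and `halve_batch` are made up for illustration.

```python
import numpy as np

def crop_batch(batch, left=25, top=40, right=275, bottom=120):
    """Crop every image in an (N, H, W) or (N, H, W, C) batch with one slice."""
    # Rows are the vertical axis, so the PIL box (left, upper, right, lower)
    # becomes rows top:bottom and columns left:right.
    return batch[:, top:bottom, left:right]

def halve_batch(batch):
    """Halve height and width by averaging 2x2 pixel blocks.

    Assumption: a box filter is an acceptable stand-in for Lanczos here.
    """
    n, h, w = batch.shape[:3]
    h2, w2 = h // 2, w // 2
    trimmed = batch[:, :h2 * 2, :w2 * 2]  # drop an odd trailing row/column
    # Split each spatial axis into (blocks, 2) and average over the 2s.
    blocks = trimmed.reshape(n, h2, 2, w2, 2, *batch.shape[3:])
    return blocks.mean(axis=(2, 4)).astype(batch.dtype)
```

With a batch of shape (N, 160, 320, 3), `halve_batch(crop_batch(batch))` yields shape (N, 40, 125, 3). Whether this beats the per-image loop depends on N and available memory, as noted above.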
For iteration, I think appending to list and inserting into a preallocated array are both useful. I haven't seen clear evidence that one is faster than the other in all cases.
alist = []
for arr in source:
    # <process arr>
    alist.append(arr)
bigarr = np.array(alist)
versus
bigarr = np.zeros((N, ...))
for i in range(N):
    arr = source[i, ...]
    # <process arr>
    bigarr[i, ...] = arr
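For reference, here is a runnable version of both patterns. The `process` function is a hypothetical stand-in for the per-image work (it just applies the question's crop), and `via_list` and `via_prealloc` are illustrative names.

```python
import numpy as np

def process(arr):
    # Hypothetical per-image step: the crop from the question.
    return arr[40:120, 25:275]

def via_list(source):
    # Collect results in a list, then build the array once at the end.
    return np.array([process(arr) for arr in source])

def via_prealloc(source):
    # Process the first image to learn the output shape and dtype,
    # then fill a preallocated array in place.
    first = process(source[0])
    out = np.empty((len(source),) + first.shape, dtype=first.dtype)
    out[0] = first
    for i in range(1, len(source)):
        out[i] = process(source[i])
    return out
```

Both return the same array, so timing them on your own data (for example with `timeit`) is the simplest way to pick one.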
Code clarity can also suffer when trying to 'vectorize' batch operations.
Upvotes: 1
Reputation: 6004
For cropping, if you put all the images into one 3-D array, then you can crop them all in one shot (here the third dimension is the image axis):
cropped = images[top:bottom, left:right, :]
Not sure if that would be faster though: the memory cost of holding all the images in memory twice could slow it down.
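A small sketch of this layout, assuming equally sized grayscale images (the `image_list` data is made up). It also shows where the extra memory goes: `np.stack` copies the pixels into the new array, while the crop itself is a basic slice and returns a view.

```python
import numpy as np

# Hypothetical input: five grayscale images of equal size.
image_list = [np.full((160, 320), i, dtype=np.uint8) for i in range(5)]

# Stacking copies the pixel data into one (H, W, N) array.
images = np.stack(image_list, axis=-1)

top, bottom, left, right = 40, 120, 25, 275
cropped = images[top:bottom, left:right, :]  # one slice crops every image

print(cropped.shape)                       # (80, 250, 5)
print(np.shares_memory(cropped, images))   # True: the crop is a view
```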
Upvotes: 1