Brans

Reputation: 649

Efficiently crop a batch of rectangular slices from an image in NumPy or TensorFlow (batch image cropping)

I want to find an efficient way to do batch image cropping. The input image is the same for every crop. Each crop has a different offset, height, and width.

Naive code:

import numpy as np

img = np.zeros([100, 100, 3])
ofsets_x = np.array([10, 15, 18])
img_w = np.array([10, 12, 15])
ofsets_y = np.array([20, 22, 14])
img_h = np.array([14, 12, 16])

crops = []
for i in range(ofsets_x.shape[0]):
    ofset_x = ofsets_x[i]
    ofset_y = ofsets_y[i]
    w = img_w[i]
    h = img_h[i]

    # Each crop is a view into img with its own offset and size.
    crop = img[ofset_x:ofset_x + w, ofset_y:ofset_y + h, :]
    crops.append(crop)

This loop is very slow in both NumPy and TensorFlow (in TensorFlow I also resize each crop to a fixed size at the end of the loop with tf.image.resize). In TensorFlow I also tried tf.vectorized_map and tf.while_loop, but neither gave a significant speedup. All of these are more than 20x slower than C++. A crop is a simple memcpy, so it should be very fast, especially with preallocated memory.

How can I do this faster in NumPy or TensorFlow?

Upvotes: 1

Views: 739

Answers (1)

I'mahdi

Reputation: 24049

I wrote this in TensorFlow, as tagged in your question. I create a dataset with tf.data.Dataset and use map. On 5_000 images I measured 751 ms for cropping and resizing. (I ran this in Colab, which has limited RAM, so I only timed 5_000 images.) I repeat each image three times and pair each copy with an index into the offsets, so that map can select the crop box by index, crop, and then resize, with the calls running in parallel.
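To make the pairing of images and offset indexes concrete, here is a minimal NumPy-only sketch on a tiny stand-in dataset. The sizes (2 images of 2x2 pixels) and the id-tagging trick are assumptions for illustration only. The np.tile and np.repeat calls are the same ones used in the benchmark code.

```python
import numpy as np

num_imgs = 2   # tiny stand-in for the 5_000 images in the benchmark
len_ofset = 3  # number of crop boxes

# Fill each tiny "image" with its own id so the pairing is easy to read.
img = np.arange(num_imgs).reshape(num_imgs, 1, 1, 1) * np.ones((1, 2, 2, 3))

# np.tile repeats the whole stack: image ids 0, 1, 0, 1, 0, 1
tiled = np.tile(img, (len_ofset, 1, 1, 1))
# np.repeat repeats each index: crop indexes 0, 0, 1, 1, 2, 2
idx_crop = np.repeat(np.arange(len_ofset), num_imgs)

# (image id, crop index) for every dataset element
pairs = list(zip(tiled[:, 0, 0, 0].astype(int).tolist(), idx_crop.tolist()))
print(pairs)
```

Every (image, crop index) combination appears exactly once. Note that np.tile materializes len_ofset full copies of the image array in memory.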

Creating the image dataset for the benchmark:

import numpy as np
import tensorflow as tf
num_imgs = 5_000
len_ofset = 3
img = np.random.rand(num_imgs, 100, 100, 3)
img_dataset = tf.data.Dataset.from_tensor_slices((np.tile(img, (len_ofset,1,1,1)), 
                                                  np.repeat(np.arange(len_ofset), num_imgs)))


ofsets_x = np.array([10, 15, 18])
img_w = np.array([10, 12, 15])
ofsets_y = np.array([20, 22, 14])
img_h = np.array([14, 12, 16])

# convert the offsets to tensors so they can be indexed inside the mapped function
tns_ofsets_x = tf.convert_to_tensor(ofsets_x)
tns_img_w = tf.convert_to_tensor(img_w)
tns_ofsets_y = tf.convert_to_tensor(ofsets_y)
tns_img_h = tf.convert_to_tensor(img_h)

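Benchmark in Colab (assuming you want to resize the crops to (16, 16)). Note that Dataset.map is lazy, so this cell times building the pipeline and fetching the first element, not a full pass over the dataset: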

%%time
size_resize = 16
def crop_resize(img, idx_crop):
    ofset_x = tns_ofsets_x[idx_crop]
    ofset_y = tns_ofsets_y[idx_crop]
    w = tns_img_w[idx_crop]
    h = tns_img_h[idx_crop]
    img = img[ofset_x:ofset_x + w, ofset_y:ofset_y + h, :] 
    img = tf.image.resize(img, (size_resize, size_resize))
    return img

img_dataset = img_dataset.map(
    map_func=crop_resize,
    num_parallel_calls=tf.data.AUTOTUNE,
)

next(iter(img_dataset.take(1))).shape
# TensorShape([16, 16, 3])

Output:

CPU times: user 714 ms, sys: 2.07 s, total: 2.78 s
Wall time: 3.64 s
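
For the single-image case in the question, a NumPy-only alternative is to skip the Python loop and gather every crop with one fancy-indexing call. The following is a minimal sketch, not part of the benchmark above. batch_crop_resize_nearest is a hypothetical helper name. It uses nearest-neighbour sampling, so the values will differ from the bilinear default of tf.image.resize. It also keeps the question's convention of applying the x offsets and widths to the first axis.

```python
import numpy as np

def batch_crop_resize_nearest(img, off0, size0, off1, size1, out_size):
    """Crop len(off0) boxes from one image and resize each one to
    (out_size, out_size) with nearest-neighbour sampling in a single gather.

    off0/size0 apply to axis 0 of img, off1/size1 to axis 1.
    """
    # Relative sample positions at the output pixel centres, all in (0, 1).
    t = (np.arange(out_size) + 0.5) / out_size
    # Source index per crop and output pixel, shape (n_crops, out_size).
    idx0 = off0[:, None] + np.floor(t[None, :] * size0[:, None]).astype(np.intp)
    idx1 = off1[:, None] + np.floor(t[None, :] * size1[:, None]).astype(np.intp)
    # Broadcast to (n_crops, out_size, out_size) and gather all crops at once.
    return img[idx0[:, :, None], idx1[:, None, :]]

img = np.random.rand(100, 100, 3)
ofsets_x = np.array([10, 15, 18])
img_w = np.array([10, 12, 15])
ofsets_y = np.array([20, 22, 14])
img_h = np.array([14, 12, 16])

crops = batch_crop_resize_nearest(img, ofsets_x, img_w, ofsets_y, img_h, 16)
print(crops.shape)
```

Because all crops are resampled to the same size, the result is a single array of shape (n_crops, 16, 16, 3) rather than a list of differently shaped views. If bilinear interpolation is needed on the TensorFlow side, tf.image.crop_and_resize covers the batched crop-plus-resize case, with boxes given in normalized coordinates.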

Upvotes: 2
