I previously asked this question (Dynamic image cropping in Tensorflow), but after investigating the problem further, it appears I may have gone down the wrong path for what I am trying to achieve.
I thought the approach below, using tf.slice, might be a better path to try, but I can't figure out what to pass for the size parameter of the slice operation. Fundamentally, I want to decide dynamically how to crop each image, crop it, and then continue using the cropped images in my computation graph. Feel free to offer an alternative if this seems like an inefficient way to go about it.
import numpy as np
import tensorflow as tf
img1 = np.random.random([400, 600, 3])
img2 = np.random.random([400, 600, 3])
img3 = np.random.random([400, 600, 3])
images = [img1, img2, img3]
img1_crop = [100, 100, 100, 100]
img2_crop = [200, 150, 100, 100]
img3_crop = [150, 200, 100, 100]
crop_values = [img1_crop, img2_crop, img3_crop]
x = tf.placeholder(tf.float32, shape=[None, 400, 600, 3])
i = tf.placeholder(tf.int32, shape=[None, 4])
y = tf.slice(x, i, size="Not sure what to put here")
# initialize
init = tf.global_variables_initializer()
sess = tf.Session()
sess.run(init)
# run
result = sess.run(y, feed_dict={x: images, i: crop_values})
print(result)
Instead of using tf.slice (which takes a single begin and size for the whole tensor, so it cannot apply a different crop to each image in a batch), I recommend using tf.image.extract_glimpse. Here is a toy sample program that operates on a batch:
import tensorflow as tf
import numpy as np
NUM_IMAGES = 2
NUM_CHANNELS = 1
CROP_SIZE = [3, 4]
IMG_HEIGHT=10
IMG_WIDTH=10
# Fake input data, but ordered so we can look at the printed values and
# map them back. The values of the first image are:
# array([[ 0, 1, 2, 3, 4, 5, 6, 7, 8, 9],
# [10, 11, 12, 13, 14, 15, 16, 17, 18, 19],
# [20, 21, 22, 23, 24, 25, 26, 27, 28, 29],
# [30, 31, 32, 33, 34, 35, 36, 37, 38, 39],
# [40, 41, 42, 43, 44, 45, 46, 47, 48, 49],
# [50, 51, 52, 53, 54, 55, 56, 57, 58, 59],
# [60, 61, 62, 63, 64, 65, 66, 67, 68, 69],
# [70, 71, 72, 73, 74, 75, 76, 77, 78, 79],
# [80, 81, 82, 83, 84, 85, 86, 87, 88, 89],
# [90, 91, 92, 93, 94, 95, 96, 97, 98, 99]])
image1 = np.reshape(
    np.arange(NUM_IMAGES * IMG_HEIGHT * IMG_WIDTH * NUM_CHANNELS),
    [NUM_IMAGES, IMG_HEIGHT, IMG_WIDTH, NUM_CHANNELS])
# We use normalized=False so that offsets are given in pixels.
# With normalized=True and the default centered=True, offsets would
# instead range over [-1, 1], with (0, 0) at the image center.
image1_center = [0, 0]  # The center of the crop is ~ the center of the image.
image2_center = [3, 5]  # Offset down & right in the image.
img = tf.placeholder(tf.float32, shape=[NUM_IMAGES, IMG_HEIGHT, IMG_WIDTH, NUM_CHANNELS], name="img")
size = tf.placeholder(tf.int32, shape=[2], name="crop_size")
centers = tf.placeholder(tf.float32, shape=[NUM_IMAGES, 2], name="centers")
# One glimpse of shape CROP_SIZE is extracted per image in the batch.
output = tf.image.extract_glimpse(img, size, centers, normalized=False)
sess = tf.Session()
feed_dict = {
    img: image1,
    size: CROP_SIZE,
    centers: [image1_center, image2_center],
}
print(sess.run(output, feed_dict=feed_dict))
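To sanity-check crops outside the graph, a plain NumPy reference can be handy. The sketch below is only an illustration: crop_batch is a hypothetical helper, and it assumes each crop is given by its top-left corner plus a shared size, which is a different convention from the center offsets that extract_glimpse takes.

```python
import numpy as np

def crop_batch(images, top_left, crop_size):
    """Crop one fixed-size window per image (hypothetical helper).

    images:    array of shape [batch, height, width, channels]
    top_left:  one (y, x) pixel position per image (assumed convention)
    crop_size: (crop_height, crop_width), shared by all images
    """
    crop_h, crop_w = crop_size
    crops = [
        img[y:y + crop_h, x:x + crop_w, :]
        for img, (y, x) in zip(images, top_left)
    ]
    # All crops share one size, so they stack back into a single batch.
    return np.stack(crops)

# Ordered toy data in the same layout as the sample above.
images = np.arange(2 * 10 * 10 * 1).reshape(2, 10, 10, 1)
crops = crop_batch(images, top_left=[(0, 0), (3, 5)], crop_size=(3, 4))
print(crops.shape)
print(crops[1, :, :, 0])
```

Because the values are ordered, each printed number maps straight back to a row and column of its source image, which makes off-by-one mistakes easy to spot.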
If you would like to extract multiple sizes (and even multiple glimpses per image), check out tf.image.crop_and_resize.
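As a rough sketch of how pixel crops might be prepared for tf.image.crop_and_resize: that op takes boxes as normalized [y1, x1, y2, x2] rows, where a normalized coordinate y maps to pixel y * (image_height - 1). The NumPy helper below (a hypothetical name) converts crops into that form. It assumes each crop from the question means [top, left, height, width], which the question does not actually state.

```python
import numpy as np

def pixel_crops_to_boxes(crops, image_height, image_width):
    """Convert pixel crops to normalized [y1, x1, y2, x2] boxes.

    Hypothetical helper. The [top, left, height, width] layout of each
    crop is an assumption, not something the question specifies.
    """
    crops = np.asarray(crops, dtype=np.float32)
    top, left, height, width = crops.T
    # Normalized coordinate c maps to pixel c * (dimension - 1).
    y1 = top / (image_height - 1)
    x1 = left / (image_width - 1)
    # The last pixel inside the crop is at top + height - 1.
    y2 = (top + height - 1) / (image_height - 1)
    x2 = (left + width - 1) / (image_width - 1)
    return np.stack([y1, x1, y2, x2], axis=1)

# Crop values from the question, for 400x600 images.
crop_values = [[100, 100, 100, 100],
               [200, 150, 100, 100],
               [150, 200, 100, 100]]
boxes = pixel_crops_to_boxes(crop_values, image_height=400, image_width=600)
print(boxes.shape)
print(boxes)
```

These boxes would then be passed along with one batch index per box and a crop_size such as [100, 100]. Note that the index argument is named box_ind in TensorFlow 1.x and box_indices in 2.x. When crop_size matches the box's pixel size, the sampling points should fall on whole pixels, so the result is expected to match a plain slice.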
Docs: https://www.tensorflow.org/api_docs/python/image/cropping#extract_glimpse