Reputation: 101
I have large images (6000x4000) and I want to train Faster R-CNN to detect quite small objects (typically between 50 and 150 pixels). For memory reasons I crop the images to 1000x1000. Training goes fine, and when I test the model on the 1000x1000 crops the results are really good. But when I test the model on the full 6000x4000 images the results are really bad...
I guess the problem is in the region proposal step, but I don't know what I am doing wrong (the keep_aspect_ratio_resizer max_dimension is fixed to 12000)...
Thanks for your help!
Upvotes: 2
Views: 2765
Reputation: 48
What is your min_dimension? In your case it should be larger than 4000, otherwise the image will be scaled down. See:
object_detection/core/preprocessor.py
def _compute_new_dynamic_size(image, min_dimension, max_dimension):
  """Compute new dynamic shape for resize_to_range method."""
  image_shape = tf.shape(image)
  orig_height = tf.to_float(image_shape[0])
  orig_width = tf.to_float(image_shape[1])
  orig_min_dim = tf.minimum(orig_height, orig_width)
  # Calculates the larger of the possible sizes
  min_dimension = tf.constant(min_dimension, dtype=tf.float32)
  large_scale_factor = min_dimension / orig_min_dim
  # Scaling orig_(height|width) by large_scale_factor will make the smaller
  # dimension equal to min_dimension, save for floating point rounding errors.
  # For reasonably-sized images, taking the nearest integer will reliably
  # eliminate this error.
  large_height = tf.to_int32(tf.round(orig_height * large_scale_factor))
  large_width = tf.to_int32(tf.round(orig_width * large_scale_factor))
  large_size = tf.stack([large_height, large_width])
  if max_dimension:
    # Calculates the smaller of the possible sizes, use that if the larger
    # is too big.
    orig_max_dim = tf.maximum(orig_height, orig_width)
    max_dimension = tf.constant(max_dimension, dtype=tf.float32)
    small_scale_factor = max_dimension / orig_max_dim
    # Scaling orig_(height|width) by small_scale_factor will make the larger
    # dimension equal to max_dimension, save for floating point rounding
    # errors. For reasonably-sized images, taking the nearest integer will
    # reliably eliminate this error.
    small_height = tf.to_int32(tf.round(orig_height * small_scale_factor))
    small_width = tf.to_int32(tf.round(orig_width * small_scale_factor))
    small_size = tf.stack([small_height, small_width])
    new_size = tf.cond(
        tf.to_float(tf.reduce_max(large_size)) > max_dimension,
        lambda: small_size, lambda: large_size)
  else:
    new_size = large_size
  return new_size
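To see what this does to your images, here is a plain-Python sketch of the same arithmetic (my own illustration, not the API code). Assuming the common sample-config default of min_dimension=600, a 6000x4000 image gets scaled down heavily before it ever reaches the region proposal network, while min_dimension=4000 would keep it at full resolution:

import math

def compute_new_size(height, width, min_dimension, max_dimension=None):
    # Scale so the smaller side becomes min_dimension ...
    large_scale = min_dimension / min(height, width)
    large_size = (round(height * large_scale), round(width * large_scale))
    if max_dimension:
        # ... unless that would push the larger side past max_dimension,
        # in which case scale so the larger side becomes max_dimension.
        small_scale = max_dimension / max(height, width)
        small_size = (round(height * small_scale), round(width * small_scale))
        if max(large_size) > max_dimension:
            return small_size
    return large_size

# With min_dimension=600 a 6000x4000 image becomes 900x600 (height, width
# below), so a 100 px object shrinks to roughly 15 px.
print(compute_new_size(4000, 6000, min_dimension=600, max_dimension=12000))   # (600, 900)
# With min_dimension=4000 the image is kept at full resolution.
print(compute_new_size(4000, 6000, min_dimension=4000, max_dimension=12000))  # (4000, 6000)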
Upvotes: 0
Reputation: 3443
You need to keep the training images and the images you test on at roughly the same dimensions. If you are using random resizing as data augmentation, you can vary the test images by roughly that factor.
The best way to deal with this problem is to crop the large image into crops of the same dimensions as used in training, and then merge the predictions from the crops with non-maximum suppression (a rough sketch follows below).
That way, if your smallest object to detect is 50 px, you can have training images of size ~500 px.
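A minimal sketch of that tiling-plus-NMS approach. The tile size, overlap, and the detect_fn helper are my own assumptions; detect_fn is whatever inference call you use, returning [ymin, xmin, ymax, xmax] boxes and scores in the crop's pixel coordinates (a multi-class model would need per-class NMS):

import numpy as np

def nms(boxes, scores, iou_thresh=0.5):
    """Greedy non-maximum suppression on [ymin, xmin, ymax, xmax] boxes."""
    order = scores.argsort()[::-1]
    keep = []
    while order.size:
        i = order[0]
        keep.append(i)
        yy1 = np.maximum(boxes[i, 0], boxes[order[1:], 0])
        xx1 = np.maximum(boxes[i, 1], boxes[order[1:], 1])
        yy2 = np.minimum(boxes[i, 2], boxes[order[1:], 2])
        xx2 = np.minimum(boxes[i, 3], boxes[order[1:], 3])
        inter = np.maximum(0, yy2 - yy1) * np.maximum(0, xx2 - xx1)
        area_i = (boxes[i, 2] - boxes[i, 0]) * (boxes[i, 3] - boxes[i, 1])
        area_rest = (boxes[order[1:], 2] - boxes[order[1:], 0]) * \
                    (boxes[order[1:], 3] - boxes[order[1:], 1])
        iou = inter / (area_i + area_rest - inter)
        order = order[1:][iou <= iou_thresh]
    return keep

def detect_on_tiles(image, detect_fn, tile=1000, overlap=200):
    """Run detect_fn (hypothetical: crop -> (boxes, scores) in crop pixels)
    on overlapping tiles and merge the shifted detections with NMS."""
    h, w = image.shape[:2]
    step = tile - overlap
    all_boxes, all_scores = [], []
    for y in range(0, max(h - overlap, 1), step):
        for x in range(0, max(w - overlap, 1), step):
            crop = image[y:y + tile, x:x + tile]
            boxes, scores = detect_fn(crop)
            if len(boxes):
                boxes = np.asarray(boxes, dtype=np.float32)
                boxes[:, [0, 2]] += y   # shift back to full-image coordinates
                boxes[:, [1, 3]] += x
                all_boxes.append(boxes)
                all_scores.append(np.asarray(scores, dtype=np.float32))
    if not all_boxes:
        return np.zeros((0, 4)), np.zeros((0,))
    boxes = np.concatenate(all_boxes)
    scores = np.concatenate(all_scores)
    keep = nms(boxes, scores)
    return boxes[keep], scores[keep]

The overlap should be at least as large as your biggest object so that no object is only ever seen cut in half at a tile border.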
Upvotes: 3
Reputation: 1558
It looks to me like you are training on images with a different aspect ratio than what you are testing on (square vs not square) --- this could lead to a significant degradation in quality.
Though to be honest I'm a bit surprised that the results are really bad. If you are just evaluating visually, you may also need to turn down the score threshold used for visualization.
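For example, if you are drawing detections with the API's visualization utilities (an assumption; adapt to whatever visualization you use), the default min_score_thresh of 0.5 hides low-confidence detections:

import numpy as np
from object_detection.utils import visualization_utils as vis_util

# image_np, boxes, classes, scores, category_index are assumed to come from
# your existing inference code; lowering min_score_thresh from the default
# 0.5 shows the weaker detections as well.
vis_util.visualize_boxes_and_labels_on_image_array(
    image_np,
    boxes,
    classes.astype(np.int32),
    scores,
    category_index,
    use_normalized_coordinates=True,
    min_score_thresh=0.1)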
Upvotes: 1