J.Todd

Reputation: 827

Is there any negative effect for using cropped images with the TFRecord format?

The TensorFlow Object Detection API requires bounding-box fields in each TFRecord example, like so:

{
  'image/height': 1800,
  'image/width': 2400,
  'image/filename': 'image1.jpg',
  'image/source_id': 'image1.jpg',
  'image/encoded': ACTUAL_ENCODED_IMAGE_DATA_AS_BYTES,
  'image/format': 'jpeg',
  'image/object/bbox/xmin': [0.7255949630314233, 0.8845598428835489],
  'image/object/bbox/xmax': [0.9695875693160814, 1.0000000000000000],
  'image/object/bbox/ymin': [0.5820120073891626, 0.1829972290640394],
  'image/object/bbox/ymax': [1.0000000000000000, 0.9662484605911330],
  'image/object/class/text': (['Cat', 'Dog']),
  'image/object/class/label': ([1, 2])
}
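The bbox values in the record above are fractions of the image width and height, not pixel counts. A minimal sketch of that conversion, assuming the annotations start out as pixel coordinates; `normalize_box` is a hypothetical helper written for illustration, not part of the Object Detection API:

```python
def normalize_box(xmin_px, ymin_px, xmax_px, ymax_px, width, height):
    """Convert a pixel-space box to fractions of the image size.

    Hypothetical helper: it only mirrors the [0, 1] range seen in the
    'image/object/bbox/*' fields of the example record.
    """
    if width <= 0 or height <= 0:
        raise ValueError("image dimensions must be positive")
    return {
        "xmin": xmin_px / width,
        "ymin": ymin_px / height,
        "xmax": xmax_px / width,
        "ymax": ymax_px / height,
    }


# Made-up pixel box covering the lower-right quarter of a 2400x1800 image.
box = normalize_box(1200, 900, 2400, 1800, width=2400, height=1800)
print(box)  # {'xmin': 0.5, 'ymin': 0.5, 'xmax': 1.0, 'ymax': 1.0}
```

Under this convention, a box that spans a whole image comes out as xmin = ymin = 0.0 and xmax = ymax = 1.0, whatever the pixel size of the image.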

But I have a dataset of pre-cropped images (each image shows only the object to be classified). Would there be any downside in the training process if I provide pre-cropped image data with xmin and ymin of 0, and xmax and ymax equal to the cropped image size? My main concern is whether the training system might otherwise use contextual data near the annotated regions.

My question would probably be better phrased as: "Do TensorFlow models potentially use contextual details near the locations specified in TFRecord files during training?"

Upvotes: 0

Views: 149

Answers (1)

GPhilo

Reputation: 19143

No, that won't work: you definitely need real bounding box information to train an object detector. Do detectors potentially use contextual information? Maybe, but it's a learned behaviour.

You need to provide images where your objects are visible against multiple backgrounds, at multiple scales, under different illumination, etc., in order to train a network to robustly detect an object. If you only have pre-cropped images, all you can train is an image classifier; the detection part won't learn anything useful from those images.
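To illustrate why the localization part gets no signal from pre-cropped data, here is a small sketch. It assumes each crop is annotated with a single box covering the whole frame and that boxes are stored as normalized (xmin, ymin, xmax, ymax) tuples. The crop sizes and the `full_frame_box` helper are made up for the example:

```python
def full_frame_box(width, height):
    """Normalized box that spans an entire image of the given pixel size."""
    return (0 / width, 0 / height, width / width, height / height)


# Hypothetical pre-cropped images of different pixel sizes.
crop_sizes = [(120, 80), (64, 64), (300, 450)]

boxes = [full_frame_box(w, h) for w, h in crop_sizes]
distinct = set(boxes)
print(distinct)  # {(0.0, 0.0, 1.0, 1.0)}
```

Every training target is the same box, so the box regressor never sees an object at a different position or scale, and there are no background regions for the detector to learn to reject.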

Upvotes: 1
