rwallace

Reputation: 33365

Detecting fixed size objects in variable sized images

Neural networks can be trained to recognize an object, then detect occurrences of that object in an image, regardless of their position and apparent size. An example of doing this in PyTorch is at https://towardsdatascience.com/object-detection-and-tracking-in-pytorch-b3cf1a696a98

As the text observes,

Most of the code deals with resizing the image to a 416px square while maintaining its aspect ratio and padding the overflow.

So the idea is that the model always deals with 416px images, both in training and in the actual object detection. Detected objects, being only part of the image, will typically be smaller than 416px, but that's okay because the model has been trained to detect patterns in a scale-invariant way. The only thing fixed is the size in pixels of the input image.

I'm looking at a context in which it is necessary to do the reverse: train to detect patterns of a fixed size, then detect them in a variable sized image. For example, train to detect patterns 10px square, then look for them in an image that could be 500px or 1000px square, without resizing the image, but with the assurance that it is only necessary to look for 10px occurrences of the pattern.

Is there an idiomatic way to do this in PyTorch?

Upvotes: 1

Views: 920

Answers (1)

Roger Trullo

Reputation: 1584

Even if you trained your detector on fixed-size images, you can use different image sizes at inference time, because everything is convolutional in Faster R-CNN/YOLO architectures. On the other hand, if you only care about 10x10 bounding box detections, you can simply define those as your anchors. I would recommend the detectron2 framework, which is implemented in PyTorch and is easily configurable/hackable.
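
To illustrate the "everything is convolutional" point, here is a minimal sketch in plain PyTorch rather than detectron2. It assumes a small, hypothetical network (`PatchDetector`, with layer sizes picked only so that the receptive field is exactly 10x10) and a hypothetical helper `detect`. Neither comes from the question or from any library. Trained on 10x10 patches, the network outputs a single logit per patch. Applied unchanged to a larger image, it outputs one logit per 10x10 window, with no resizing.

```python
import torch
import torch.nn as nn


class PatchDetector(nn.Module):
    """Hypothetical fully convolutional scorer with a 10x10 receptive field."""

    def __init__(self, in_channels=3):
        super().__init__()
        # No padding and stride 1, so each layer shrinks the map by (kernel - 1).
        self.net = nn.Sequential(
            nn.Conv2d(in_channels, 8, kernel_size=5),  # 10x10 -> 6x6
            nn.ReLU(),
            nn.Conv2d(8, 16, kernel_size=3),           # 6x6 -> 4x4
            nn.ReLU(),
            nn.Conv2d(16, 1, kernel_size=4),           # 4x4 -> 1x1 logit
        )

    def forward(self, x):
        return self.net(x)


def detect(model, image, threshold=0.5):
    """Return (y, x) top-left corners of 10x10 windows scoring above threshold.

    `image` is a (C, H, W) tensor; H and W can be anything >= 10.
    """
    with torch.no_grad():
        scores = torch.sigmoid(model(image.unsqueeze(0)))[0, 0]
    return torch.nonzero(scores > threshold)


torch.manual_seed(0)
model = PatchDetector().eval()  # untrained here; training on 10x10 patches is omitted

with torch.no_grad():
    patch_out = model(torch.rand(4, 3, 10, 10))   # training-sized input
    image_500 = torch.rand(1, 3, 500, 500)
    map_500 = model(image_500)                    # one logit per 10x10 window
    map_1000 = model(torch.rand(1, 3, 1000, 1000))

print(patch_out.shape, map_500.shape, map_1000.shape)
```

Each entry `[y, x]` of the score map is the logit for the window whose top-left corner is at `(y, x)`, so an image of side `N` gives a map of side `N - 9`. The model above is untrained, so the output of `detect` is meaningless until it has been trained on labelled 10x10 patches. Overlapping windows around a true match will usually all fire, so some form of non-maximum suppression is typically needed afterwards.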

Upvotes: 1
