jejjejd

Reputation: 761

Resizing image and its bounding box

I have an image with bounding box in it, and I want to resize the image.

img = cv2.imread("img.jpg",3)
x_ = img.shape[0]
y_ = img.shape[1]
img = cv2.resize(img,(416,416));

Now I want to calculate the scale factor:

x_scale = ( 416 / x_)
y_scale = ( 416 / y_ )

And draw the box on the resized image. The original bounding box is (xmin, ymin, xmax, ymax) = (128, 25, 447, 375):

x = int(np.round(128*x_scale))
y = int(np.round(25*y_scale))
xmax = int(np.round(447*x_scale))
ymax = int(np.round(375*y_scale))

However using this I get:

[screenshot: resized image with the misplaced bounding box]

While the original is:

[screenshot: original image with the correct bounding box]

I don't see any flaw in this logic; what's wrong?

Whole code:

imageToPredict = cv2.imread("img.jpg",3)
print(imageToPredict.shape)

x_ = imageToPredict.shape[0]
y_ = imageToPredict.shape[1]

x_scale = 416/x_
y_scale = 416/y_
print(x_scale,y_scale)
img = cv2.resize(imageToPredict,(416,416));
img = np.array(img);


x = int(np.round(128*x_scale))
y = int(np.round(25*y_scale))
xmax= int(np.round  (447*(x_scale)))
ymax= int(np.round(375*y_scale))
Box.drawBox([[1,0, x,y,xmax,ymax]],img)

and drawBox:

def drawBox(boxes, image):
    for i in range (0, len(boxes)):
        cv2.rectangle(image,(boxes[i][2],boxes[i][3]),(boxes[i][4],boxes[i][5]),(0,0,120),3)
    cv2.imshow("img",image)
    cv2.waitKey(0)
    cv2.destroyAllWindows()

The image and the data for the bounding box are loaded separately. I am drawing the bounding box inside the image. The image does not contain the box itself.

Upvotes: 36

Views: 39591

Answers (5)

Preetom Saha Arko

Reputation: 2748

Using the imgaug library (thanks to the comment of @Ryaminal):

import imgaug as ia
from imgaug.augmentables.bbs import BoundingBox, BoundingBoxesOnImage
import cv2
import matplotlib.image

size=384  # we want to resize to 384x384
image = cv2.imread('xx.jpg')

# say the bounding box is given in (x,y,w,h) format
x=1493
y=1254
w=805
h=381

bbs = BoundingBoxesOnImage([BoundingBox(x1=x, x2=x+w, y1=y, y2=y+h)], shape=image.shape)

# Rescale image and bounding boxes
image_rescaled = ia.imresize_single_image(image, (size, size))
bbs_rescaled = bbs.on(image_rescaled)

new_x1 = round(bbs_rescaled[0].x1)
new_y1 = round(bbs_rescaled[0].y1)
new_x2 = round(bbs_rescaled[0].x2)
new_y2 = round(bbs_rescaled[0].y2)

new_w = new_x2 - new_x1
new_h = new_y2 - new_y1

matplotlib.image.imsave('rescaled.jpg', image_rescaled) # if you want to save the resized image

# Draw image before/after rescaling and with rescaled bounding boxes
image_bbs = bbs.draw_on_image(image, size=2)
image_rescaled_bbs = bbs_rescaled.draw_on_image(image_rescaled, size=2)

ia.imshow(image_rescaled_bbs)
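As a sanity check, `bbs.on` scales each corner linearly by target/original per axis, which you can reproduce by hand. A minimal sketch; the original image resolution below is an assumed example, not a value from this answer:

```python
# Hypothetical original resolution; the (x, y, w, h) box is the one from the
# answer above. bbs.on() maps each corner by size/orig_w and size/orig_h.
orig_h, orig_w = 2000, 3000
size = 384
x, y, w, h = 1493, 1254, 805, 381

sx, sy = size / orig_w, size / orig_h
new_x1, new_y1 = round(x * sx), round(y * sy)
new_x2, new_y2 = round((x + w) * sx), round((y + h) * sy)
new_w, new_h = new_x2 - new_x1, new_y2 - new_y1
print(new_x1, new_y1, new_w, new_h)
```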

Upvotes: 0

Sanjay ch

Reputation: 1

I encountered an issue with bounding-box coordinates in Angular when using TensorFlow.js and MobileNet-v2 for prediction: the coordinates were based on the resolution of the video frame, but I was displaying the video on a canvas with a fixed width and height. I resolved it by dividing the coordinates by the ratio of the original video resolution to the canvas resolution.

      const x = prediction.bbox[0] / (this.Owidth / 300);
      const y = prediction.bbox[1] / (this.Oheight / 300);
      const width = prediction.bbox[2] / (this.Owidth / 300);
      const height = prediction.bbox[3] / (this.Oheight / 300);
      // Draw the bounding box.
      ctx.strokeStyle = '#99ff00';
      ctx.lineWidth = 2;
      ctx.strokeRect(x, y, width, height);
  • this.Owidth and this.Oheight are the original resolution of the video; they are obtained by:
this.video.addEventListener(
      'loadedmetadata',
      (e: any) => {
        this.Owidth = this.video.videoWidth;
        this.Oheight = this.video.videoHeight;
        console.log(this.Owidth, this.Oheight, ' pixels ');
      },
      false
    );
  • 300 x 300 is my static canvas width and height.
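The same ratio-based correction can be sketched in plain Python; the function name and the sample resolutions here are mine, purely for illustration:

```python
def to_canvas(bbox, orig_w, orig_h, canvas_w=300, canvas_h=300):
    """Map an (x, y, w, h) box from the source video resolution onto the canvas."""
    x, y, w, h = bbox
    rx, ry = orig_w / canvas_w, orig_h / canvas_h  # resolution ratios
    return (x / rx, y / ry, w / rx, h / ry)

# e.g. a box centered in a hypothetical 1280x720 frame
print(to_canvas((640, 360, 320, 180), orig_w=1280, orig_h=720))
```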

Upvotes: 0

Aniket Maurya

Reputation: 380

Another way of doing this is to use the Chitra library:

from chitra.image import Chitra
import matplotlib.pyplot as plt

# box is the original bounding box, label its class label
image = Chitra(img_path, box, label)
# Chitra can rescale your bounding box automatically based on the new image size.
image.resize_image_with_bbox((224, 224))

print('rescaled bbox:', image.bounding_boxes)
plt.imshow(image.draw_boxes())

https://chitra.readthedocs.io/en/latest/

pip install chitra

Upvotes: 5

Italo José

Reputation: 1686

You can use resize_dataset_pascalvoc.

It's easy to use:

python3 main.py -p <IMAGES_&_XML_PATH> --output <IMAGES_&_XML> --new_x <NEW_X_SIZE> --new_y <NEW_Y_SIZE> --save_box_images <FLAG>

It resizes your whole dataset and rewrites new annotation files for the resized images.

Upvotes: -2

SergGr

Reputation: 23788

I believe there are two issues:

  1. You should swap x_ and y_, because shape[0] is actually the y-dimension (height) and shape[1] is the x-dimension (width).
  2. You should use the same coordinates on the original and the scaled image. On your original image the rectangle is (160, 35) - (555, 470), not the (128, 25) - (447, 375) you use in the code.

If I use the following code:

import cv2
import numpy as np


def drawBox(boxes, image):
    for i in range(0, len(boxes)):
        # changed color and width to make it visible
        cv2.rectangle(image, (boxes[i][2], boxes[i][3]), (boxes[i][4], boxes[i][5]), (255, 0, 0), 1)
    cv2.imshow("img", image)
    cv2.waitKey(0)
    cv2.destroyAllWindows()


def cvTest():
    # imageToPredict = cv2.imread("img.jpg", 3)
    imageToPredict = cv2.imread("49466033\\img.png", 3)
    print(imageToPredict.shape)

    # Note: flipped comparing to your original code!
    # x_ = imageToPredict.shape[0]
    # y_ = imageToPredict.shape[1]
    y_ = imageToPredict.shape[0]
    x_ = imageToPredict.shape[1]

    targetSize = 416
    x_scale = targetSize / x_
    y_scale = targetSize / y_
    print(x_scale, y_scale)
    img = cv2.resize(imageToPredict, (targetSize, targetSize));
    print(img.shape)
    img = np.array(img);

    # original frame as named values
    (origLeft, origTop, origRight, origBottom) = (160, 35, 555, 470)

    x = int(np.round(origLeft * x_scale))
    y = int(np.round(origTop * y_scale))
    xmax = int(np.round(origRight * x_scale))
    ymax = int(np.round(origBottom * y_scale))
    # Box.drawBox([[1, 0, x, y, xmax, ymax]], img)
    drawBox([[1, 0, x, y, xmax, ymax]], img)


cvTest()

and use your "original" image as "49466033\img.png",

Original image

I get the following image

Processed image

And as you can see, my thinner blue line lies exactly inside your original red line, and it stays there whatever targetSize you choose, so the scaling actually works correctly.
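The correction boils down to one small helper, sketched here with an assumed 640x480 source image (the function name is mine):

```python
def scale_box(box, orig_w, orig_h, target=416):
    """Scale (xmin, ymin, xmax, ymax) from an orig_w x orig_h image to target x target.

    Note that with OpenCV, orig_h = img.shape[0] and orig_w = img.shape[1].
    """
    xmin, ymin, xmax, ymax = box
    x_scale, y_scale = target / orig_w, target / orig_h
    return (round(xmin * x_scale), round(ymin * y_scale),
            round(xmax * x_scale), round(ymax * y_scale))

print(scale_box((160, 35, 555, 470), orig_w=640, orig_h=480))
```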

Upvotes: 31
