Reputation: 41
Hi! I am annotating image data through an online platform which generates output coordinates like this: "bbox":{"top":634,"left":523,"height":103,"width":145}. However, I want to use these annotations to train YOLO, so I have to convert them to YOLO format, like this: 4 0.838021 0.605556 0.177083 0.237037
I need help with how to do this conversion.
Upvotes: 4
Views: 8942
Reputation: 928
In order to convert a bounding box to YOLO format, you'll need the image width and the image height, because the YOLO format is normalized by the image dimensions. Check the Albumentations documentation for a great explanation.
I developed a lightweight Python library called bboxconverter, which aims to make it easy to convert bounding boxes between formats such as COCO, YOLO, and Pascal VOC. You can check my repo on GitHub for more examples, explanations, how-to guides, and tutorials.
You could do the following:
#! pip install bboxconverter #python >= 3.8
from bboxconverter.core.bbox import CWH_BBox, TLWH_BBox
IMAGE_WIDTH = 1920
IMAGE_HEIGHT = 1080
# Create a BBox object
tlwh_bbox = TLWH_BBox(x_min=523,   # "left" in the annotation
                      y_min=634,   # "top" in the annotation
                      width=145,
                      height=103,
                      image_width=IMAGE_WIDTH,
                      image_height=IMAGE_HEIGHT,
                      class_name='',
                      file_path='')
# Convert to CWH_BBox
cwh_bbox = CWH_BBox.from_TLWH(tlwh_bbox).to_dict()
print(
f"{cwh_bbox['x_center']}, {cwh_bbox['y_center']}, {cwh_bbox['width']}, {cwh_bbox['height']}"
)
For now, class_name and file_path must be specified.
For your information: if you still need to implement it yourself for your own purposes, there is a great article that demonstrates how to convert bounding boxes between different formats. Here is how you could convert from COCO (tlwh) to YOLO (cwh):
def coco_to_yolo(x1, y1, w, h, image_w, image_h):
    return [((2*x1 + w)/(2*image_w)) , ((2*y1 + h)/(2*image_h)), w/image_w, h/image_h]
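As a usage sketch, the function above can be applied to the bounding box from the question. The image size (1920 x 1080) and the class id 4 are assumptions for illustration, and `bbox_to_yolo_line` is a hypothetical helper name; substitute your real image dimensions and class index.

```python
def coco_to_yolo(x1, y1, w, h, image_w, image_h):
    """Convert a top-left/width/height box in pixels to normalized YOLO (cx, cy, w, h)."""
    return [(2 * x1 + w) / (2 * image_w),
            (2 * y1 + h) / (2 * image_h),
            w / image_w,
            h / image_h]


def bbox_to_yolo_line(bbox, image_w, image_h, class_id):
    """Format one annotation dict as a YOLO label line (hypothetical helper)."""
    cx, cy, w, h = coco_to_yolo(bbox["left"], bbox["top"],
                                bbox["width"], bbox["height"],
                                image_w, image_h)
    return f"{class_id} {cx:.6f} {cy:.6f} {w:.6f} {h:.6f}"


bbox = {"top": 634, "left": 523, "height": 103, "width": 145}
# Assumed image size and class id; replace with your own values.
print(bbox_to_yolo_line(bbox, image_w=1920, image_h=1080, class_id=4))
# prints: 4 0.310156 0.634722 0.075521 0.095370
```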
Upvotes: 0
Reputation: 1
If you want to convert a Python dictionary with the keys top, left, width, height into a list in the format [x1, y1, x2, y2], where x1, y1 are the relative coordinates of the top-left corner of the bounding box and x2, y2 are the relative coordinates of the bottom-right corner, you can use the following function:
def bbox_dict_to_list(bbox_dict, image_size):
    h = bbox_dict.get('height')
    l = bbox_dict.get('left')
    t = bbox_dict.get('top')
    w = bbox_dict.get('width')
    img_w, img_h = image_size
    x1 = l/img_w
    y1 = t/img_h
    x2 = (l+w)/img_w
    y2 = (t+h)/img_h
    return [x1, y1, x2, y2]
You must pass the bbox dictionary and the image size as a tuple (image_width, image_height) as arguments.
Example
bbox = {"top":634,"left":523,"height":103,"width":145}
bbox_dict_to_list(bbox, (1280, 720))
>> [0.40859375, 0.8805555555, 0.521875, 1.02361111111]
You can change the return order to suit your needs
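Note that YOLO label files expect the box center and size rather than two corners. A minimal sketch of that last step, assuming a relative [x1, y1, x2, y2] list like the one returned above; `corners_to_yolo` is a hypothetical helper name, and clipping to [0, 1] is an assumed policy for boxes that extend past the image (as in the example output, where y2 is greater than 1).

```python
def corners_to_yolo(x1, y1, x2, y2):
    """Convert relative corner coordinates to YOLO (x_center, y_center, width, height)."""
    # Clip to the image so that no coordinate falls outside [0, 1] (assumed policy).
    x1, y1, x2, y2 = (min(max(v, 0.0), 1.0) for v in (x1, y1, x2, y2))
    return [(x1 + x2) / 2, (y1 + y2) / 2, x2 - x1, y2 - y1]


print(corners_to_yolo(0.25, 0.5, 0.75, 1.0))
# prints: [0.5, 0.75, 0.5, 0.5]
```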
Upvotes: 0
Reputation: 6333
Here, for size you need to pass (w, h) of the image, and for box you need to pass (x, x+w, y, y+h): https://github.com/ivder/LabelMeYoloConverter/blob/master/convert.py
def convert(size, box):
    dw = 1./size[0]
    dh = 1./size[1]
    x = (box[0] + box[1])/2.0
    y = (box[2] + box[3])/2.0
    w = box[1] - box[0]
    h = box[3] - box[2]
    x = x*dw
    w = w*dw
    y = y*dh
    h = h*dh
    return (x,y,w,h)
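Because the box argument is ordered (x_min, x_max, y_min, y_max) rather than the more common (x_min, y_min, x_max, y_max), it is easy to pass values in the wrong order. A small sketch of building that tuple from the annotation dictionary in the question; `bbox_dict_to_minmax` is a hypothetical helper name, not part of the linked script.

```python
def bbox_dict_to_minmax(bbox):
    """Return (x_min, x_max, y_min, y_max) in pixels from a top/left/height/width dict."""
    x_min = bbox["left"]
    y_min = bbox["top"]
    return (x_min, x_min + bbox["width"], y_min, y_min + bbox["height"])


bbox = {"top": 634, "left": 523, "height": 103, "width": 145}
box = bbox_dict_to_minmax(bbox)
print(box)
# prints: (523, 668, 634, 737)
```

The result can then be passed as the box argument of convert, together with size = (image_width, image_height).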
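Alternatively, you can use the function below, which takes the top-left corner, the box size, and the image size:
def convert(x, y, w, h, img_w, img_h):
    # The scale factors must come from the image size, not the box size
    dw = 1.0/img_w
    dh = 1.0/img_h
    x = (2*x+w)/2.0
    y = (2*y+h)/2.0
    x = x*dw
    y = y*dh
    w = w*dw
    h = h*dh
    return (x,y,w,h)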
Each grid cell predicts B bounding boxes as well as C class probabilities. The bounding box prediction has 5 components: (x, y, w, h, confidence). The (x, y) coordinates represent the center of the box, relative to the grid cell location (remember that, if the center of the box does not fall inside the grid cell, then this cell is not responsible for it). These coordinates are normalized to fall between 0 and 1. The (w, h) box dimensions are also normalized to [0, 1], relative to the image size.
What does the coordinate output of yolo algorithm represent?
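To make the quoted description concrete, here is a minimal sketch of how an image-normalized box center maps to a grid cell and to an offset within that cell. The default grid size S = 7 is an assumption (the value used in the original YOLO paper), and `cell_and_offset` is a hypothetical helper name. This concerns how the network parametrizes its predictions; the label files discussed in this thread still store the center relative to the whole image.

```python
def cell_and_offset(x_center, y_center, grid_size=7):
    """Map an image-normalized center to (row, col) of its grid cell and the offset inside it."""
    # Scale to grid units; the integer part picks the cell, the fraction is the offset.
    gx = x_center * grid_size
    gy = y_center * grid_size
    # Clamp so that a center exactly on the right/bottom edge stays in the last cell.
    col = min(int(gx), grid_size - 1)
    row = min(int(gy), grid_size - 1)
    return (row, col), (gx - col, gy - row)


cell, offset = cell_and_offset(0.625, 0.375, grid_size=4)
print(cell, offset)
# prints: (1, 2) (0.5, 0.5)
```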
Upvotes: 5