Reputation: 101
I've created over 1200 images with labels for YOLO detection. The problem is that every image is 800x600 and all the labeled objects are in the middle of the image, so I want to crop away the rest. The images would then be something like 400x300 (cropped equally on the left, right, top and bottom), with the objects still in the middle. But how do I convert the coordinates without labeling everything all over again?
# (used labelimg for yolo)
0 0.545000 0.722500 0.042500 0.091667
1 0.518750 0.762500 0.097500 0.271667
Here's one of my label .txt files. Sorry for my bad English!
Upvotes: 2
Views: 1287
Reputation: 316
I was just working this out myself, so here is a complete explanation of why the formula at the bottom is correct.
Let's go over how these annotations are formatted.
x
0--------------->1
| .
| _________
| | . | ^
| | . | |
y|...|...* | h
| | | |
| |_______| v
| <---w--->
V
1
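Each line is 5 numbers separated by a space: n x y w h
with
n: the class index of the object
x, y: the normalized coordinates of the center of the box
w, h: the normalized width and height of the box
W and H are the width and height of the original image in pixels. A normalized value is relative to the width or height of the image, not in pixels or any other unit. It is a proportion. For example, the x value is normalized like this: x[px]/W[px] = x normalized.
A few advantages of this: all values lie in the interval [0,1], and they stay valid when the image is resized.
The y axis goes from top to bottom. Everything else is like your standard coordinate system.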
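To make the format concrete, here is a small sketch that parses one label line and converts it back to a pixel box. The helper name `yolo_to_pixels` is made up for this illustration; the label line and the 800x600 image size are taken from the question.

```python
def yolo_to_pixels(line, img_w, img_h):
    """Convert one 'n x y w h' label line to (n, left, top, right, bottom) in pixels."""
    n, x, y, w, h = line.split()
    x, y, w, h = float(x), float(y), float(w), float(h)
    # x and w are proportions of the image width, y and h of the image height
    left = (x - w / 2) * img_w
    right = (x + w / 2) * img_w
    top = (y - h / 2) * img_h      # y grows downwards, so top < bottom
    bottom = (y + h / 2) * img_h
    return int(n), left, top, right, bottom


print(yolo_to_pixels("0 0.545000 0.722500 0.042500 0.091667", 800, 600))
```

For this label the box is roughly 34x55 pixels, centered near (436, 433.5).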
Now to cropping. Let's take this picture of a tree:
W
0------>1
|⠀⢀⣴⣶⣤⣄⠀|
|⢠⣿⣿⣿⣿⣿⡆|
H |⠈⠿⠿⣯⠿⠿⠁|
| ⠀⠀⣿⠀ |⠀⠀
v ⠐⠛⠃⠀ |⠀
1--------
We will now crop to the top left quarter of the tree image.
_____
| ⣴⣶|
|⢠⣿⣿|
-----
Our new image width W' is now only half of the original W, and also H' = 0.5*H. The center of the old image is now the bottom right corner. We know the center of the old image p is at (0.5,0.5). In the new image this bottom right corner is at p' = (1,1). If we cropped so that (0.3,0.3) in the old image is the new bottom right, the new coordinate would also be (1,1). 0.5 is also ½. To get from 0.5 to 1 we need to multiply by 2, for ⅓ by 3, for ¼ by 4. We see that if we reduce the width or height to a/b of the original, we need to multiply by b/a.
But we also want to move the top left of the image, our coordinate origin O. Let's crop to the tree trunk:
O'---
H' |⠀⣿⠀|⠀⠀
|⠐⠛⠃|
----q'
W'
W is 7 characters. The new width W' is 3. H=5 and H' is 2. The old origin O is (0,0) of course, and O' is at (2,3) in characters, or normalized to the original image (2/7, 3/5), about (0.286, 0.6). O' is at (0.286, 0.6) but should be (0,0), so we reduce x and y by 0.286 and 0.6 respectively before we scale the new value. This is not very interesting because 0 times anything is 0.
Let's do another example: the bottom right of our new cropped image of the tree trunk. Let's call this point q. We know that q in the new system of the cropped image must be q' = (1,1); it's the bottom right after all.
We already measured:
W=7 W'=3 H=5 H'=2
By how much do we have to scale?
The new width is W'/W = 3/7 of the original, so we scale by the inverse, W/W' = 7/3 or about 2.33. For H: H'/H = 2/5, so we scale by H/H' = 5/2 = 2.5. Let's call these scaling factors s_w = 7/3 and s_h = 5/2.
q is at (5,5) in characters in O. Let's put our formula to the test. We moved our origin by 2 in the x/w direction and by 3 in the y/h direction; let's call this Δw=2 and Δh=3.
For q'_x we remove 2 from q_x because Δw=2. We get 5-2=3. Now we normalize 3 by dividing by W=7, so we get 3/7. Now we scale by s_w = 7/3, and yes, 7/3 times 3/7 is indeed 1. The same works for q'_y: 5-3=2, normalized 2/5, times s_h = 5/2 is 1. Now that we checked our logic we can write an algorithm.
We already have normalized values so the matter is simpler.
For a point p = (x,y) in the original we can calculate p' in the new image like this:
p' = (x', y') = ((x - Δw) * s_w, (y - Δh) * s_h)
with: Δw, Δh = the normalized position of the new origin O' in the original image, s_w = W/W', s_h = H/H'
The size of a box only needs to be scaled: w' = w * s_w, h' = h * s_h
in python:
def transpose_annot(x_c, y_c, w_c, h_c, annotations):
    # c: cropped area, given by its normalized center (x_c, y_c)
    # and its normalized size (w_c, h_c) in the original image
    # s_w: scale width
    s_w = 1 / w_c
    # s_h: scale height
    s_h = 1 / h_c
    new_annots = list()
    for annot in annotations:
        try:
            n, x, y, w, h = annot  # check if n/label is given
        except ValueError:
            n = None
            x, y, w, h = annot
        w_ = w * s_w
        h_ = h * s_h
        delta_x = x - x_c
        delta_y = y - y_c
        # center of cropping area is new center of image
        # we just scale the image accordingly
        x_ = 0.5 + delta_x * s_w
        y_ = 0.5 + delta_y * s_h
        if n is None:
            new_annots.append((x_, y_, w_, h_))
        else:
            new_annots.append((n, x_, y_, w_, h_))
        print(x_, y_, w_, h_)
    return new_annots
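For the case in the question (800x600 images cropped to the central 400x300), the same mapping can also be sketched starting from a crop box in pixels. This is only an illustration, assuming the crop is an axis-aligned rectangle inside the image; `crop_yolo_labels` and its parameters are hypothetical names, not part of labelImg or YOLO.

```python
def crop_yolo_labels(labels, img_w, img_h, left, top, crop_w, crop_h):
    """Map (n, x, y, w, h) labels of an img_w x img_h image into a crop.

    left/top is the crop's top-left corner in pixels, crop_w/crop_h its size.
    Boxes are only shifted and scaled here, not clipped or dropped.
    """
    # normalized position of the new origin O' in the old image
    d_w = left / img_w
    d_h = top / img_h
    # scale factors: old size divided by new size
    s_w = img_w / crop_w
    s_h = img_h / crop_h
    return [
        (n, (x - d_w) * s_w, (y - d_h) * s_h, w * s_w, h * s_h)
        for n, x, y, w, h in labels
    ]


labels = [(0, 0.545000, 0.722500, 0.042500, 0.091667),
          (1, 0.518750, 0.762500, 0.097500, 0.271667)]
# the central 400x300 crop of an 800x600 image starts at pixel (200, 150)
for n, x, y, w, h in crop_yolo_labels(labels, 800, 600, 200, 150, 400, 300):
    print(f"{n} {x:.6f} {y:.6f} {w:.6f} {h:.6f}")
```

With these two labels the first box ends up at about (0.59, 0.945) and sticks out over the bottom edge, and the center of the second box lands at y > 1, so labels like these still need the handling of cropped-out boxes discussed next.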
After cropping, some annotations may be cropped out completely and need to be dropped, and others may be partially cropped out and need to be adjusted.
As mentioned before all values must be in the interval [0,1].
Completely cropped-out annotations will have, in the new coordinates, x > 1 + w/2 or x < -w/2, or y > 1 + h/2 or y < -h/2.
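As a minimal sketch of that check, assuming the annotation has already been converted to the coordinates of the cropped image (the function name `is_fully_cropped_out` is my own):

```python
def is_fully_cropped_out(x, y, w, h):
    """True if a box (center x, y; size w, h) has no overlap with the [0,1] x [0,1] frame."""
    # the box lies entirely left of, right of, above or below the frame
    return (x + w / 2 <= 0 or x - w / 2 >= 1 or
            y + h / 2 <= 0 or y - h / 2 >= 1)


# center is below the frame, but the box still overlaps it
print(is_fully_cropped_out(0.5375, 1.025, 0.195, 0.543334))  # False
# entirely to the right of the frame
print(is_fully_cropped_out(1.3, 0.5, 0.2, 0.2))              # True
```

Boxes that only touch the frame edge are treated as cropped out here, since they have no visible area.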
If you want to keep or drop annotations depending on how much of their area is still visible, for example with a threshold of 1/4 or a range like [0.25, 1), it will be more complicated.
x
_________
| . |
| . |
y...|.0-*---|-------->1
| | | h
|_______|
| w
V
1
We can view this problem as calculating the intersection area between two rectangles. For convenience the function also returns the fraction of the annotation's area that is in frame.
def new_annotation_area(x, y, w, h):
    #  ________
    # | a      |
    # |     ___|______
    # |    |c  |      |
    # |____|___|   b  |
    #      |__________|
    # a is coordinate system (given)
    # b is the annotation in coordinate system
    # c is the intersection area
    a_x = 0.5
    a_y = 0.5
    a_w = 1
    a_h = 1
    a_max_x = a_x + a_w / 2
    a_min_x = a_x - a_w / 2
    b_max_x = x + w / 2
    b_min_x = x - w / 2
    # from the one dimensional case
    # how much do two lines overlap/intersect?
    # it is easy to get to the area
    # a_min_x----------a_max_x
    #          b_min_x----------b_max_x
    #          c_min_x-c_max_x
    c_min_x = max(a_min_x, b_min_x)
    c_max_x = min(a_max_x, b_max_x)
    # clamp to 0 so that boxes without overlap get zero area
    c_len_x = max(0.0, c_max_x - c_min_x)
    a_max_y = a_y + a_h / 2
    a_min_y = a_y - a_h / 2
    b_max_y = y + h / 2
    b_min_y = y - h / 2
    c_min_y = max(a_min_y, b_min_y)
    c_max_y = min(a_max_y, b_max_y)
    c_len_y = max(0.0, c_max_y - c_min_y)
    area = c_len_y * c_len_x
    c_w = c_len_x
    c_h = c_len_y
    c_x = c_min_x + 0.5 * c_w
    c_y = c_min_y + 0.5 * c_h
    # fraction of the annotation that is visible, and the clipped box
    return area / (w * h), (c_x, c_y, c_w, c_h)
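One possible way to use the visible fraction is to clip every box to the frame and drop those below a threshold. The sketch below is self-contained: it uses its own compact helper `clip_to_frame` (a hypothetical name that repeats the intersection logic in corner form), and the 0.25 threshold is just an example value.

```python
def clip_to_frame(x, y, w, h):
    """Clip a box to the [0,1] x [0,1] frame.

    Returns (visible_fraction, (x, y, w, h)) of the clipped box.
    Assumes w > 0 and h > 0.
    """
    min_x, max_x = max(0.0, x - w / 2), min(1.0, x + w / 2)
    min_y, max_y = max(0.0, y - h / 2), min(1.0, y + h / 2)
    # a negative length means there is no overlap at all
    c_w = max(0.0, max_x - min_x)
    c_h = max(0.0, max_y - min_y)
    fraction = (c_w * c_h) / (w * h)
    return fraction, (min_x + c_w / 2, min_y + c_h / 2, c_w, c_h)


def filter_labels(labels, min_visible=0.25):
    """Keep (n, x, y, w, h) labels whose visible fraction is at least min_visible."""
    kept = []
    for n, x, y, w, h in labels:
        fraction, box = clip_to_frame(x, y, w, h)
        if fraction >= min_visible:
            kept.append((n, *box))
    return kept


labels = [(0, 0.59, 0.945, 0.085, 0.183334),  # sticks out at the bottom
          (1, 1.3, 0.5, 0.2, 0.2)]            # completely outside
print(filter_labels(labels))
```

Here the first box is kept with a reduced height and the second one is dropped. Whether clipped boxes are still good training labels depends on your data, so the threshold is something to tune.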
Upvotes: 2