Reputation: 561
I am trying to draw the bounding box of some text on an image. The image is perspective-transformed with a given set of coefficients. The coordinates of the text before the transformation are known, and I want to calculate its coordinates after the transformation.
As I understand it, if I apply the perspective transformation to the text coordinates using the same coefficients that were used for the image transform, I should get the coordinates of the text after the transformation. However, the text does not appear where it is supposed to be.
Before the transformation, the smaller white box bounds the text well, because I know the coordinates of the text.
After the transformation, the smaller white box no longer bounds the text, because of some error in transforming the coordinates.
I followed the documentation reference for the coefficients of the perspective transformation and computed the coefficients for the image transformation using the following code (the code originally comes from this answer):
def find_coeffs(pa, pb):
    '''
    Find the coefficients for a perspective transform.
    parameters:
        pa : vertices in the resulting plane
        pb : vertices in the current plane
    return:
        coeffs : 8-tuple
            coefficients for PIL perspective transform
    '''
    matrix = []
    for p1, p2 in zip(pa, pb):
        matrix.append([p1[0], p1[1], 1, 0, 0, 0, -p2[0]*p1[0], -p2[0]*p1[1]])
        matrix.append([0, 0, 0, p1[0], p1[1], 1, -p2[1]*p1[0], -p2[1]*p1[1]])
    # np.float was removed in NumPy 1.24; the builtin float is equivalent
    A = np.matrix(matrix, dtype=float)
    B = np.array(pb).reshape(8)
    res = np.dot(np.linalg.inv(A.T * A) * A.T, B)
    return np.array(res).reshape(8)
My code for text bounding box transformation:
# perspective transformation
a, b, c, d, e, f, g, h = coeffs
# return two vertices defining the bounding box
new_x0 = float(a * new_x0 - b * new_y0 + c) / float(g * new_x0 + h * new_y0 + 1)
new_y0 = float(d * new_x0 + e * new_y0 + f) / float(g * new_x0 + h * new_y0 + 1)
new_x1 = float(a * new_x1 - b * new_y1 + c) / float(g * new_x1 + h * new_y1 + 1)
new_y1 = float(d * new_x1 + e * new_y1 + f) / float(g * new_x1 + h * new_y1 + 1)
I also looked at the Pillow GitHub repository, but I could not find the source code where the perspective transformation is defined.
Some more information about the math of perspective transformation: The Geometry of Perspective Drawing on the Computer.
Thanks.
Upvotes: 1
Views: 3534
Reputation: 1162
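PIL's image.transform() expects coefficients for the inverse mapping: for each pixel of the output image, they give the position in the input image to take the value from. To compute where a point of the original image ends up after the transformation, you need the coefficients of the forward mapping instead, so compute both sets. For example: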
# A1, B1 ... are points
# direct transform
coefs = find_coefs([B1, B2, B3, B4], [A1, A2, A3, A4])
# inverse transform
coefs_inv = find_coefs([A1, A2, A3, A4], [B1, B2, B3, B4])
You call image.transform() using coefs_inv, but calculate the new point using coefs, to get something like this:
img = image.transform((1500, 800),
                      method=Image.PERSPECTIVE,
                      data=coefs_inv)
a, b, c, d, e, f, g, h = coefs
old_p1 = [50, 100]
x,y = old_p1
new_x = (a * x + b * y + c) / (g * x + h * y + 1)
new_y = (d * x + e * y + f) / (g * x + h * y + 1)
new_p1 = (int(new_x),int(new_y))
old_p2 = [400, 500]
x,y = old_p2
new_x = (a * x + b * y + c) / (g * x + h * y + 1)
new_y = (d * x + e * y + f) / (g * x + h * y + 1)
new_p2 = (int(new_x),int(new_y))
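The two repeated blocks above can be folded into a small helper. The following is a minimal sketch and not part of the original snippet: `apply_coefs` and `transform_box` are hypothetical names, and the coefficients in the usage lines are made up for illustration. Because a perspective transform generally turns an axis-aligned rectangle into an arbitrary quadrilateral, the sketch maps all four corners of the box and takes their extremes, rather than mapping only two opposite corners.

```python
def apply_coefs(coefs, point):
    """Map one (x, y) point with the 8 perspective coefficients."""
    a, b, c, d, e, f, g, h = coefs
    x, y = point
    denom = g * x + h * y + 1
    return (a * x + b * y + c) / denom, (d * x + e * y + f) / denom


def transform_box(coefs, box):
    """Map an axis-aligned box (x0, y0, x1, y1) and return the
    axis-aligned box enclosing its four transformed corners."""
    x0, y0, x1, y1 = box
    corners = [(x0, y0), (x1, y0), (x1, y1), (x0, y1)]
    warped = [apply_coefs(coefs, p) for p in corners]
    xs = [p[0] for p in warped]
    ys = [p[1] for p in warped]
    return min(xs), min(ys), max(xs), max(ys)


# Made-up coefficients for illustration: a pure translation by (10, 20).
shift = (1, 0, 10, 0, 1, 20, 0, 0)
moved_point = apply_coefs(shift, (50, 100))
moved_box = transform_box(shift, (50, 100, 400, 500))
```

With the variables used here, the call would be `transform_box(coefs, (50, 100, 400, 500))`, passing the forward coefficients `coefs` and not `coefs_inv`. Both output values are computed from the untouched input `x, y`; the snippet in the question instead overwrites `new_x0` before using it to compute `new_y0`, and subtracts `b * new_y0` where the formula adds it.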
Full code below:
import os
from PIL import Image
import numpy as np
import matplotlib.pyplot as plt
def find_coefs(original_coords, warped_coords):
    # Solve for the 8 coefficients that map original_coords onto warped_coords
    matrix = []
    for p1, p2 in zip(original_coords, warped_coords):
        matrix.append([p1[0], p1[1], 1, 0, 0, 0, -p2[0]*p1[0], -p2[0]*p1[1]])
        matrix.append([0, 0, 0, p1[0], p1[1], 1, -p2[1]*p1[0], -p2[1]*p1[1]])
    # np.float was removed in NumPy 1.24; the builtin float is equivalent
    A = np.matrix(matrix, dtype=float)
    B = np.array(warped_coords).reshape(8)
    res = np.dot(np.linalg.inv(A.T * A) * A.T, B)
    return np.array(res).reshape(8)
coefs = find_coefs(
[(867,652), (1020,580), (1206,666), (1057,757)],
[(700,732), (869,754), (906,916), (712,906)]
)
coefs_inv = find_coefs(
[(700,732), (869,754), (906,916), (712,906)],
[(867,652), (1020,580), (1206,666), (1057,757)]
)
image = Image.open('sample.png')
img = image.transform((1500, 800),
                      method=Image.PERSPECTIVE,
                      data=coefs_inv)
a, b, c, d, e, f, g, h = coefs
old_p1 = [50, 100]
x,y = old_p1
new_x = (a * x + b * y + c) / (g * x + h * y + 1)
new_y = (d * x + e * y + f) / (g * x + h * y + 1)
new_p1 = (int(new_x),int(new_y))
old_p2 = [400, 500]
x,y = old_p2
new_x = (a * x + b * y + c) / (g * x + h * y + 1)
new_y = (d * x + e * y + f) / (g * x + h * y + 1)
new_p2 = (int(new_x),int(new_y))
plt.figure()
plt.imshow(image)
plt.scatter([old_p1[0], old_p2[0]],[old_p1[1], old_p2[1]] , s=150, marker='.', c='b')
plt.show()
plt.figure()
plt.imshow(img)
plt.scatter([new_p1[0], new_p2[0]],[new_p1[1], new_p2[1]] , s=150, marker='.', c='r')
plt.show()
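As a side note, the two coefficient sets describe mappings that are inverses of each other, so the forward coefficients can also be derived from the inverse ones without a second fit. The following is a minimal sketch, assuming the 8 coefficients are read as the 3x3 matrix [[a, b, c], [d, e, f], [g, h, 1]] acting on homogeneous coordinates and that the bottom-right entry of its inverse is not zero. `invert_coefs` and `apply_coefs` are hypothetical helpers, not Pillow functions, and the sample coefficients are made up.

```python
import numpy as np


def invert_coefs(coefs):
    """Return the 8 coefficients of the inverse perspective mapping."""
    a, b, c, d, e, f, g, h = coefs
    H = np.array([[a, b, c], [d, e, f], [g, h, 1.0]])
    H_inv = np.linalg.inv(H)
    H_inv /= H_inv[2, 2]  # rescale so the last entry is 1 again
    return H_inv.reshape(9)[:8]


def apply_coefs(coefs, point):
    """Map one (x, y) point with the 8 perspective coefficients."""
    a, b, c, d, e, f, g, h = coefs
    x, y = point
    denom = g * x + h * y + 1
    return (a * x + b * y + c) / denom, (d * x + e * y + f) / denom


# Made-up coefficients standing in for the set passed to image.transform().
sample_inv = (1.2, 0.1, -30.0, 0.05, 0.9, 15.0, 1e-4, 2e-4)
sample_fwd = invert_coefs(sample_inv)

# Round trip: mapping a point one way and then back returns the start point.
start = (50.0, 100.0)
there = apply_coefs(sample_inv, start)
back = apply_coefs(sample_fwd, there)
```

Up to floating-point error, `invert_coefs(coefs_inv)` should agree with the `coefs` computed by the second `find_coefs` call when both are fitted on the same four point pairs.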
Upvotes: 1