Kershaw

Reputation: 561

Image perspective transform using Pillow

I am trying to draw the bounding box of some text on an image. The image is perspective-transformed with a given set of coefficients. The coordinates of the text before the transformation are known, and I want to calculate its coordinates after the transformation.

To my understanding, if I apply the perspective transformation to the text coordinates with the same coefficients used for the image transform, I should get the coordinates of the text after the transformation. However, the box does not end up where the text is.

See the following images. Before transformation:

The smaller white box bounds the text well because I know the coordinates of the text.

After transformation:

The smaller white box no longer bounds the text because of some error in transforming the coordinates.

I followed the documentation reference for the coefficients of the perspective transformation and computed the coefficients for the image transform using the following code (the code originates from this answer):

import numpy as np

def find_coeffs(pa, pb):
    '''
    Find the coefficients for a perspective transform.

    Parameters:
        pa : vertices in the resulting plane
        pb : vertices in the current plane

    Returns:
        coeffs : 8-tuple
          coefficients for the PIL perspective transform
    '''
    matrix = []
    for p1, p2 in zip(pa, pb):
        matrix.append([p1[0], p1[1], 1, 0, 0, 0, -p2[0]*p1[0], -p2[0]*p1[1]])
        matrix.append([0, 0, 0, p1[0], p1[1], 1, -p2[1]*p1[0], -p2[1]*p1[1]])

    A = np.matrix(matrix, dtype=float)
    B = np.array(pb).reshape(8)
    res = np.dot(np.linalg.inv(A.T * A) * A.T, B)
    return np.array(res).reshape(8)

My code for text bounding box transformation:

    # perspective transformation
    a, b, c, d, e, f, g, h = coeffs
    # return two vertices defining the bounding box

    new_x0 = float(a * new_x0 - b * new_y0 + c) / float(g * new_x0 + h * new_y0 + 1)
    new_y0 = float(d * new_x0 + e * new_y0 + f) / float(g * new_x0 + h * new_y0 + 1)
    new_x1 = float(a * new_x1 - b * new_y1 + c) / float(g * new_x1 + h * new_y1 + 1)
    new_y1 = float(d * new_x1 + e * new_y1 + f) / float(g * new_x1 + h * new_y1 + 1) 

I also looked through the Pillow GitHub repository, but I could not find the source code where the perspective transformation is defined.

Some more information about the math of perspective transformation: The Geometry of Perspective Drawing on the Computer

Thanks.

Upvotes: 1

Views: 3534

Answers (1)

Leonardo Mariga

Reputation: 1162

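To compute where a point lands after the transformation, you need the coefficients of the forward mapping (source image to transformed image). Pillow's transform() expects the opposite: coefficients that map each pixel of the output image back to a position in the source image. So you need two sets of coefficients. For example: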

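# B1..B4 are corners in the original image,
# A1..A4 are their positions in the transformed image
# direct transform: maps B points onto A points
coefs = find_coefs([B1, B2, B3, B4], [A1, A2, A3, A4])

# inverse transform: maps A points back onto B points
coefs_inv = find_coefs([A1, A2, A3, A4], [B1, B2, B3, B4])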

Pass coefs_inv to image.transform(), but use coefs to calculate the new point coordinates:

img = image.transform(((1500,800)),
                      method=Image.PERSPECTIVE,
                      data=coefs_inv)

a, b, c, d, e, f, g, h = coefs

old_p1 = [50, 100]
x,y = old_p1
new_x = (a * x + b * y + c) / (g * x + h * y + 1)
new_y = (d * x + e * y + f) / (g * x + h * y + 1)
new_p1 = (int(new_x),int(new_y))

old_p2 = [400, 500]
x,y = old_p2
new_x = (a * x + b * y + c) / (g * x + h * y + 1)
new_y = (d * x + e * y + f) / (g * x + h * y + 1)
new_p2 = (int(new_x),int(new_y))
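The repeated arithmetic above can be wrapped in a small helper. The following is a minimal sketch, not part of Pillow: the names `transform_point` and `transform_box` are made up for illustration, and `find_coefs` here uses `np.linalg.solve`, which is enough when exactly four point pairs are given. Note that the helper keeps the original `x` and `y` while computing both outputs, and that a bounding box is mapped through all four of its corners, since a rectangle generally does not stay axis-aligned under a perspective transform.

```python
import numpy as np


def find_coefs(src_pts, dst_pts):
    """Return the 8 coefficients that map each point in src_pts onto dst_pts."""
    rows = []
    for (x, y), (u, v) in zip(src_pts, dst_pts):
        rows.append([x, y, 1, 0, 0, 0, -u * x, -u * y])
        rows.append([0, 0, 0, x, y, 1, -v * x, -v * y])
    A = np.array(rows, dtype=float)
    b = np.array(dst_pts, dtype=float).reshape(8)
    # Exactly four point pairs give a square 8x8 system.
    return np.linalg.solve(A, b)


def transform_point(coefs, point):
    """Apply the perspective mapping described by coefs to one (x, y) point."""
    a, b, c, d, e, f, g, h = coefs
    x, y = point  # the same input x and y are used for both outputs
    w = g * x + h * y + 1
    return (a * x + b * y + c) / w, (d * x + e * y + f) / w


def transform_box(coefs, box):
    """Map all four corners of (x0, y0, x1, y1) and return the enclosing box."""
    x0, y0, x1, y1 = box
    corners = [(x0, y0), (x1, y0), (x1, y1), (x0, y1)]
    xs, ys = zip(*(transform_point(coefs, p) for p in corners))
    return min(xs), min(ys), max(xs), max(ys)


# Corner correspondences taken from this answer's full code.
src = [(867, 652), (1020, 580), (1206, 666), (1057, 757)]
dst = [(700, 732), (869, 754), (906, 916), (712, 906)]
coefs = find_coefs(src, dst)

# Sanity check: the forward coefficients should send each source corner
# to its destination corner.
mapped = [transform_point(coefs, p) for p in src]
```

If the mapped corners do not match the destination corners, the two point lists were most likely passed to `find_coefs` in the wrong order.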


Full code below:

from PIL import Image
import numpy as np
import matplotlib.pyplot as plt


def find_coefs(original_coords, warped_coords):
    # Build the linear system whose solution maps original_coords onto warped_coords
    matrix = []
    for p1, p2 in zip(original_coords, warped_coords):
        matrix.append([p1[0], p1[1], 1, 0, 0, 0, -p2[0]*p1[0], -p2[0]*p1[1]])
        matrix.append([0, 0, 0, p1[0], p1[1], 1, -p2[1]*p1[0], -p2[1]*p1[1]])

    A = np.matrix(matrix, dtype=float)
    B = np.array(warped_coords).reshape(8)

    res = np.dot(np.linalg.inv(A.T * A) * A.T, B)
    return np.array(res).reshape(8)


coefs = find_coefs(
                  [(867,652), (1020,580), (1206,666), (1057,757)],
                  [(700,732), (869,754), (906,916), (712,906)]
                  )

coefs_inv = find_coefs(
                  [(700,732), (869,754), (906,916), (712,906)],
                  [(867,652), (1020,580), (1206,666), (1057,757)]
                  )

image = Image.open('sample.png')

img = image.transform(((1500,800)),
                      method=Image.PERSPECTIVE,
                      data=coefs_inv)

a, b, c, d, e, f, g, h = coefs

old_p1 = [50, 100]
x,y = old_p1
new_x = (a * x + b * y + c) / (g * x + h * y + 1)
new_y = (d * x + e * y + f) / (g * x + h * y + 1)
new_p1 = (int(new_x),int(new_y))

old_p2 = [400, 500]
x,y = old_p2
new_x = (a * x + b * y + c) / (g * x + h * y + 1)
new_y = (d * x + e * y + f) / (g * x + h * y + 1)
new_p2 = (int(new_x),int(new_y))



plt.figure()
plt.imshow(image)
plt.scatter([old_p1[0], old_p2[0]],[old_p1[1], old_p2[1]]  , s=150, marker='.', c='b')
plt.show()


plt.figure()
plt.imshow(img)
plt.scatter([new_p1[0], new_p2[0]],[new_p1[1], new_p2[1]]  , s=150, marker='.', c='r')

plt.show()
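As an alternative to solving the system twice, the forward coefficients can be derived from coefs_inv: treat the eight values as a 3x3 matrix with a 1 in the bottom-right corner, invert it, and rescale. This is only a sketch under that assumption. The name `invert_coefs` and the numeric coefficients below are made up for illustration and do not come from the example image.

```python
import numpy as np


def invert_coefs(coefs):
    """Return the 8 coefficients of the inverse perspective mapping."""
    a, b, c, d, e, f, g, h = coefs
    H = np.array([[a, b, c], [d, e, f], [g, h, 1.0]])
    H_inv = np.linalg.inv(H)
    # Rescale so the bottom-right entry is 1 again (assumes it is non-zero).
    H_inv = H_inv / H_inv[2, 2]
    return H_inv.reshape(9)[:8]


def transform_point(coefs, point):
    """Apply the perspective mapping described by coefs to one (x, y) point."""
    a, b, c, d, e, f, g, h = coefs
    x, y = point
    w = g * x + h * y + 1
    return (a * x + b * y + c) / w, (d * x + e * y + f) / w


# Made-up coefficients, used only for a round-trip check.
coefs_inv = (1.2, 0.1, 5.0, -0.05, 0.9, 10.0, 1e-4, 2e-4)
coefs = invert_coefs(coefs_inv)

p = (50.0, 100.0)
q = transform_point(coefs, p)         # source point -> output point
back = transform_point(coefs_inv, q)  # output point -> source point
```

The round trip should return the starting point up to floating-point error, which is a quick way to confirm that the two coefficient sets really are inverses of each other.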

Upvotes: 1
