Anil Yadav

Reputation: 149

Rotating 2D grayscale image with transformation matrix

I am new to image processing, so I am really confused about the coordinate system used with images. I have a sample image and I am trying to rotate it 45° clockwise. My transformation matrix is T = [[cos45, sin45], [-sin45, cos45]].

Here is the code:

import numpy as np
from matplotlib import pyplot as plt
from skimage import io

image = io.imread('sample_image')
img_transformed = np.zeros((image.shape), dtype=np.uint8)

trans_matrix = np.array([[np.cos(45), np.sin(45)], [-np.sin(45), np.cos(45)]])

for i, row in enumerate(image):
    for j,col in enumerate(row):
        pixel_data = image[i,j] #get the value of pixel at corresponding location
        input_coord = np.array([i, j]) #this will be my [x,y] matrix
        result = trans_matrix @ input_coord 
        i_out, j_out = result #store the resulting coordinate location

        #make sure the i and j values remain within the index range
        if (0 < int(i_out) < image.shape[0]) and (0 < int(j_out) < image.shape[1]):
            img_transformed[int(i_out)][int(j_out)] = pixel_data

plt.imshow(img_transformed, cmap='gray')

The image comes out distorted and doesn't seem right. I know that in pixel coordinates, the origin is at the top left corner (row, column). Is the rotation happening with respect to the origin at the top left corner? Is there a way to shift the origin to the center, or to any other given point?

Thank you all!

Upvotes: 0

Views: 2324

Answers (1)

Juan

Reputation: 5738

Yes, as you suspect, the rotation is happening with respect to the top left corner, which has coordinates (0, 0). (Also: the NumPy trigonometric functions use radians rather than degrees, so you need to convert your angle, e.g. with `np.radians`.) To compute a rotation about the center, you use a little trick: translate the image so that its center sits at (0, 0), then rotate it, then translate the result back. You need to compose these transformations into a single matrix, because if you apply them one after the other, you'll lose everything that lands at negative coordinates along the way.

It's much, much easier to do this using homogeneous coordinates, which add an extra "dummy" dimension to your coordinates. Here's what your code would look like in homogeneous coordinates:

import numpy as np
from matplotlib import pyplot as plt
from skimage import io

image = io.imread('sample_image')
img_transformed = np.zeros((image.shape), dtype=np.uint8)

c, s = np.cos(np.radians(45)), np.sin(np.radians(45))
rot_matrix = np.array([[c, s, 0], [-s, c, 0], [0, 0, 1]])

x, y = np.array(image.shape) // 2
# move center to (0, 0)
translate1 = np.array([[1, 0, -x], [0, 1, -y], [0, 0, 1]])
# move center back to (x, y)
translate2 = np.array([[1, 0, x], [0, 1, y], [0, 0, 1]])

# compose all three transformations together
trans_matrix = translate2 @ rot_matrix @ translate1

for i, row in enumerate(image):
    for j,col in enumerate(row):
        pixel_data = image[i,j] #get the value of pixel at corresponding location
        input_coord = np.array([i, j, 1]) #homogeneous coordinate [i, j, 1]
        result = trans_matrix @ input_coord 
        i_out, j_out, _ = result #store the resulting coordinate location

        #make sure the i and j values remain within the index range
        if (0 < int(i_out) < image.shape[0]) and (0 < int(j_out) < image.shape[1]):
            img_transformed[int(i_out)][int(j_out)] = pixel_data

plt.imshow(img_transformed, cmap='gray')
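A quick way to sanity-check the composed matrix: the center pixel is the fixed point of the rotation, so `trans_matrix @ [x, y, 1]` should give back `[x, y, 1]`. A minimal sketch (the 256×256 shape is just an assumed example, not from the question):

```python
import numpy as np

c, s = np.cos(np.radians(45)), np.sin(np.radians(45))
rot_matrix = np.array([[c, s, 0], [-s, c, 0], [0, 0, 1]])

x, y = 128, 128  # center of an assumed 256x256 image
translate1 = np.array([[1, 0, -x], [0, 1, -y], [0, 0, 1]])
translate2 = np.array([[1, 0, x], [0, 1, y], [0, 0, 1]])
trans_matrix = translate2 @ rot_matrix @ translate1

# the center should map (up to floating point) to itself
print(trans_matrix @ [x, y, 1])
```

Note the order of composition: matrices apply right-to-left, so `translate1` acts first, then the rotation, then `translate2`.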

The above should work OK, but you will probably get some black spots due to aliasing. What can happen is that no input coordinates i, j land exactly on a given output pixel, so that pixel never gets updated. Instead, what you need to do is iterate over the pixels of the output image, then use the inverse transform to find which pixel in the input image maps closest to that output pixel. Something like:

inverse_tform = np.linalg.inv(trans_matrix)

for i, j in np.ndindex(img_transformed.shape):
    i_orig, j_orig, _ = np.round(inverse_tform @ [i, j, 1]).astype(int)
    if 0 <= i_orig < image.shape[0] and 0 <= j_orig < image.shape[1]:
        img_transformed[i, j] = image[i_orig, j_orig]

Hope this helps!

Upvotes: 2
