Reputation: 11181

Transform image data in 3D

I have 2D image data that I'd like to represent as a plane in 3D, and perform various manipulations (translate, rotate, magnify). I'd like to obtain the cartesian components and color value of the pixels, after such operations.

So, what is an efficient way to:

represent row/col values of an image as cartesian values
transform those cartesian values as described above

I'm sure there are libraries that will do most of the heavy lifting (np.linalg?), but I just don't know which ones, or where I should start. Thanks.

Upvotes: 1

Answers (1)

askewchan

Reputation: 46578

You can use scipy for such things. In particular, the scipy.ndimage module can do translations, rotations, and magnification, among other transformations and mappings. These operations use interpolation when necessary to fit into the grid of rectangular arrays.
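As a rough illustration of the interpolating route, the sketch below applies scipy.ndimage.shift, rotate, and zoom to a small test array. The array and the parameter values are made up for the example, and order=0 (nearest neighbour) is chosen only so that the original pixel values are preserved.

```python
import numpy as np
from scipy import ndimage

img = np.arange(12, dtype=float).reshape(3, 4)  # toy 3x4 "image"

# Translate one row down; vacated pixels are filled with cval (0 by default).
shifted = ndimage.shift(img, shift=(1, 0), order=0)

# Rotate by 90 degrees in the plane of the two axes; reshape=True grows the
# output array so the whole rotated image fits.
rotated = ndimage.rotate(img, angle=90, reshape=True, order=0)

# Magnify by a factor of 2 along both axes.
zoomed = ndimage.zoom(img, zoom=2, order=0)

print(shifted.shape, rotated.shape, zoomed.shape)
```

Note that these functions return resampled images on a regular grid, not the transformed coordinates of each pixel.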

If you want to work directly on the coordinates of pixels without interpolation, an image library may not work. You can grab the indices of the array with np.indices and run the corresponding coordinates through any transform you'd like; the original indices will still associate each transformed point with its original pixel value. Unfortunately these transformations don't seem to be implemented in a common library, so you have to search for functions, e.g., Python - Rotation of 3D vector.
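The linked answer is not reproduced here, so the following is only a sketch of what such a helper might look like, written with Rodrigues' rotation formula. It is not necessarily the implementation from that answer; the name rotation_matrix and the choice of radians for theta are assumptions.

```python
import numpy as np

def rotation_matrix(axis, theta):
    """Return the 3x3 matrix for a counterclockwise rotation by theta
    radians about the given axis (Rodrigues' rotation formula)."""
    axis = np.asarray(axis, dtype=float)
    axis = axis / np.linalg.norm(axis)  # normalise to a unit vector
    kx, ky, kz = axis
    # Cross-product matrix K, so that K @ v equals np.cross(axis, v)
    K = np.array([[0.0, -kz,  ky],
                  [ kz, 0.0, -kx],
                  [-ky,  kx, 0.0]])
    return np.eye(3) + np.sin(theta) * K + (1.0 - np.cos(theta)) * (K @ K)

M = rotation_matrix([0, 0, 1], np.pi / 2)  # quarter turn about the z-axis
print(np.round(M, 6))
```

For a quarter turn about the z-axis this agrees, up to floating-point rounding, with the usual matrix that sends (x, y, z) to (-y, x, z).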

An example with the rotation from the linked answer:

import numpy as np

a = np.arange(12).reshape(3, 4, 1)  # a 2D image in 3D (hence the extra dim of 1)
i, j, k = np.indices(a.shape)
# x varies along columns, y along rows; the whole plane sits at z = 0.5
x, y, z = np.meshgrid(np.linspace(0, 1, 4), np.linspace(0, 1, 3), [.5], indexing='xy')

axis = [0, 0, 1]
theta = np.radians(90)  # 90 degrees; the linked helper is assumed to take radians
# M = rotation_matrix(axis, theta)
# for example, rotate around z-axis:
M = np.array([[ 0., -1.,  0.],
              [ 1.,  0.,  0.],
              [ 0.,  0.,  1.]])
# the following two lines are equivalent ways of multiplying M by each point as a vector:
# we want to sum over the last axis of M and the first axis of [x, y, z]
xp, yp, zp = np.einsum('ij,jklm->iklm', M, [x, y, z])
xp, yp, zp = np.tensordot(M, [x, y, z], axes=(-1, 0))

So now, the point that was originally at, say, i, j, k = 2, 2, 0, went from:

x[2, 2, 0], y[2, 2, 0], z[2, 2, 0]
# (0.666666, 1.0, 0.5)

to:

xp[2, 2, 0], yp[2, 2, 0], zp[2, 2, 0]
# (-1.0, 0.666666, 0.5)

And still has the color:

a[2, 2, 0]
# 10

You can see all the transformed coordinates, each in an array with the same shape as a, just by looking at xp, yp, zp.
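Translation and magnification can be handled the same way. One common approach, sketched here under assumed conventions (x is the column index, y is the row index, the plane starts at z = 0, and the numeric values are arbitrary), is to use 4x4 homogeneous matrices so that scaling, rotation, and translation combine into a single matrix product:

```python
import numpy as np

a = np.arange(12).reshape(3, 4)    # toy grayscale image
rows, cols = np.indices(a.shape)   # pixel indices

# Homogeneous points (x, y, z, 1) stacked as columns, shape (4, N)
pts = np.stack([cols.ravel(), rows.ravel(),
                np.zeros(a.size), np.ones(a.size)]).astype(float)

S = np.diag([2.0, 2.0, 2.0, 1.0])                # magnify by 2
R = np.eye(4)
R[:3, :3] = [[0, -1, 0], [1, 0, 0], [0, 0, 1]]   # 90 degrees about z
T = np.eye(4)
T[:3, 3] = [10.0, 0.0, 5.0]                      # translate by (10, 0, 5)

A = T @ R @ S              # applied right to left: scale, rotate, translate
xp, yp, zp, _ = A @ pts    # each has shape (N,)
values = a.ravel()         # values[n] belongs to (xp[n], yp[n], zp[n])
```

Because the points are flattened in the same order as a.ravel(), index n in xp, yp, zp, and values always refers to the same pixel.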

If your images are color, be careful: the image array is already 3D, with an extra axis for the color channels. Account for that axis when using indices or meshgrid, and in the subscripts or axes you pass to einsum or tensordot.
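A minimal sketch of one way to handle this, assuming an image laid out as (rows, cols, channels): build the coordinates from the two spatial axes only, and leave the channel axis alone so that each pixel keeps its whole color vector. The toy image and its values are invented for the example.

```python
import numpy as np

rgb = np.zeros((3, 4, 3), dtype=np.uint8)  # toy color image: (rows, cols, channels)
rgb[2, 2] = (255, 128, 0)

h, w = rgb.shape[:2]                        # spatial axes only
x, y, z = np.meshgrid(np.linspace(0, 1, w), np.linspace(0, 1, h), [0.5],
                      indexing='xy')

M = np.array([[0., -1., 0.],
              [1.,  0., 0.],
              [0.,  0., 1.]])               # 90 degrees about z
xp, yp, zp = np.einsum('ij,jklm->iklm', M, np.array([x, y, z]))

# The pixel at row 2, col 2 has moved, but its color is unchanged
print(xp[2, 2, 0], yp[2, 2, 0], zp[2, 2, 0], rgb[2, 2])
```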

Upvotes: 2
