Simon
Simon

Reputation: 753

pytorch affine_grid: what is the theta input?

When trying to use torch.nn.functional.affine_grid, it requires a theta affine matrix of size (N x 3 x 4) according to the documentation. I thought a general affine matrix is (N x 4 x 4). What is the supposed affine matrix format in pytorch?

An example of 3D rotation affine input would be ideal. Appreciate your help.

Upvotes: 5

Views: 3991

Answers (1)

Shai
Shai

Reputation: 114796

The dimensions you mention are applicable for the case of 3D inputs, that is you wish to apply 3D geometric transforms on the input tensor x of shape bxcxdxhxw.
A transformation to points in 3D (represented as 4-vector in homogeneous coordinates as (x, y, z, 1)) should be, in the general case, a 4x4 matrix as you noted.
However, since we restrict ourselves to homogeneous coordinates, i.e., the fourth coordinate must be 1, the 4th row of the matrix must be (0, 0, 0, 1) (see this).
Therefore, there's no need to explicitly code this last row.

To conclude, a 3D transformation composed of a 3x3 rotation R and 3d translation t is simply the 3x4 matrix:

theta = [R t]

Upvotes: 3

Related Questions