Martin Thoma

Reputation: 136277

How can I find the rotation of a document?

Take the following scanned image as an example:

[Image: the scanned document, slightly rotated]

By looking at the borders, you can clearly see that it is rotated slightly to the left. How can I detect the amount of this rotation so that I can correct it?

Edge detection would "highlight" the border, but apart from trying many rotations and building an x / y histogram for each one, I don't know how to use this information. That iterative approach also seems more computationally expensive than such a simple problem should require.

I am looking for pseudo-code or an algorithmic idea, so I didn't tag this question with a programming language. However, if you would like to give code, I prefer Python.

Upvotes: 1

Views: 323

Answers (2)

Martin Thoma

Reputation: 136277

The deskew package works pretty well for normal text documents. It does not work well for images, and it does not work for the given example either.

Upvotes: 0

user3146587

Reputation: 4320

What about the following approach (assuming the amount of rotation is small and there are indeed enough horizontal / vertical line features in the document):

  1. Extract all (parametric) lines in the document via a Hough transform,
  2. Classify lines as either close to horizontal or vertical (and simply discard the ones that can't be classified up to some tolerance),
  3. Robustly fit a rotation to minimize the deviation of the lines from their expected orientation (RANSAC variant with a rotation solver).

For 3., one could start without any RANSAC (just fit a best rotation to all the horizontal / vertical lines) and only add it if there are noticeable blunders that need to be taken care of.
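As a rough illustration of steps 2 and 3 without RANSAC, here is a minimal sketch. It assumes step 1 has already produced one normal angle per line, in radians, in the (rho, theta) convention that OpenCV's cv2.HoughLines returns (theta near 0 or pi for vertical lines, near pi/2 for horizontal ones). The function name estimate_skew and the tolerance tol_deg are made up for this example, and averaging the signed deviations is a simple small-angle stand-in for the rotation fit, not the only way to do it.

```python
import math


def estimate_skew(thetas, tol_deg=15.0):
    """Estimate the skew angle (radians) from Hough normal angles.

    thetas:  normal angle of each detected line, in radians.
    tol_deg: hypothetical tolerance; lines further than this from both
             axis directions are discarded (step 2).
    """
    quarter = math.pi / 2
    tol = math.radians(tol_deg)
    deviations = []
    for theta in thetas:
        k = round(theta / quarter)   # index of the nearest axis direction
        dev = theta - k * quarter    # signed deviation from that axis
        if abs(dev) <= tol:          # keep only near-horizontal/vertical lines
            deviations.append(dev)
    if not deviations:
        raise ValueError("no near-horizontal or near-vertical lines found")
    # Step 3 without RANSAC: average the deviations of all kept lines.
    return sum(deviations) / len(deviations)
```

Rotating the image by the negative of the returned angle (in the same angular convention) should then straighten it. If a few bad lines skew the mean, replacing it with a median is a cheap first step before reaching for RANSAC.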

To fit a rotation to the lines, parameterize each line with a unit 2D vector n and a scalar d such that a 2D point M belongs to the line if and only if n . M + d = 0. The vector n (the line's normal) gives the orientation of the line and, based on the classification, is expected to be close to a reference vector n_0 (e_x, -e_x, e_y or -e_y, depending on the classification).

A possible objective function is then F(theta) = 1/2 sum_i | ( R(theta) n^i ) . n_0^i |^2, with the estimate theta* = argmax_theta F(theta), where theta is the angle of the 2D rotation to apply and R(theta) is the corresponding 2 x 2 rotation matrix. Maximizing F finds the rotation that best aligns the rotated line vectors with the expected line vectors.

If the square is dropped from the objective function, the problem boils down to a simplified Kabsch algorithm (absolute orientation in 2D without translation or scale), which can be solved with an SVD.
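For the variant without the square, the 2D case does not even need an explicit SVD. Writing sum_i ( R(theta) n^i ) . n_0^i = A cos(theta) + B sin(theta), with A the sum of the dot products n^i . n_0^i and B the sum of the 2D cross products n^i x n_0^i, the maximum is at theta = atan2(B, A). A minimal sketch, where fit_rotation and its argument names are made up for this example and the inputs are assumed to be unit vectors already matched by the classification:

```python
import math


def fit_rotation(normals, references):
    """Angle (radians) of the rotation that best maps normals onto references.

    normals:    list of (x, y) unit line vectors n^i.
    references: list of (x, y) expected vectors n_0^i, in the same order.
    """
    a = 0.0  # sum of dot products   n^i . n_0^i
    b = 0.0  # sum of cross products n^i x n_0^i
    for (nx, ny), (rx, ry) in zip(normals, references):
        a += nx * rx + ny * ry
        b += nx * ry - ny * rx
    # A*cos(theta) + B*sin(theta) is maximized at theta = atan2(B, A).
    return math.atan2(b, a)
```

The returned angle is the rotation to apply to the measured vectors, so it has the opposite sign of the skew that produced them.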

Upvotes: 3
