Chiro Odhora

Reputation: 77

Which algorithm is used in google's tesseract-OCR for Recognition?

Which algorithm does Google's Tesseract OCR use for recognition? Is it a neural network?

Upvotes: 6

Views: 6839

Answers (2)

WY Hsu

Reputation: 1905

Based on the About section of the Tesseract GitHub repo:

Tesseract 4 adds a new neural net (LSTM) based OCR engine which is focused on line recognition

The recognition engine uses an LSTM model to extract the text.

For more information, see the Modernization Efforts section of the page How Tesseract uses LSTMs...

So yes, as of Tesseract 4 it is based on a neural network.

Upvotes: 1

msanford

Reputation: 12227

This paper in the tesseract source provides a deep overview of the technology.

Notably:

Blobs are organized into text lines, and the lines and regions are analyzed for fixed pitch or proportional text.

[...]

Recognition then proceeds as a two-pass process. In the first pass, an attempt is made to recognize each word in turn. Each word that is satisfactory is passed to an adaptive classifier as training data. The adaptive classifier then gets a chance to more accurately recognize text lower down the page.
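To make the two-pass idea concrete, here is a toy sketch in Python. It is not Tesseract's code: the names `recognize_word` and `two_pass`, the dictionary-based "classifiers", the mean-confidence acceptance rule, and the threshold are all assumptions made up for illustration. Words that pass the threshold teach the adaptive table their character shapes, and words that failed in the first pass are retried once the whole page has been seen.

```python
def recognize_word(shapes, static, adapted):
    """Classify each character shape, preferring shapes already learned on this page."""
    chars, confs = [], []
    for shape in shapes:
        if shape in adapted:
            # Hypothetical rule: a shape learned on this page is fully trusted.
            chars.append(adapted[shape])
            confs.append(1.0)
        else:
            ch, conf = static[shape]
            chars.append(ch)
            confs.append(conf)
    return "".join(chars), sum(confs) / len(confs)


def two_pass(words, static, threshold=0.8):
    """Return a (text, confidence) pair per word after two passes over the page."""
    adapted = {}  # toy adaptive classifier: shape -> character
    results = []

    # Pass 1: recognize each word in turn; satisfactory words become training data.
    for shapes in words:
        text, conf = recognize_word(shapes, static, adapted)
        results.append((text, conf))
        if conf >= threshold:
            for shape, ch in zip(shapes, text):
                adapted[shape] = ch

    # Pass 2: retry only the words that were not satisfactory the first time.
    for i, shapes in enumerate(words):
        if results[i][1] < threshold:
            results[i] = recognize_word(shapes, static, adapted)
    return results


# Hypothetical static classifier output: shape id -> (character, confidence).
static = {
    "t1": ("t", 0.9), "h1": ("h", 0.9), "e1": ("e", 0.9),
    "c1": ("c", 0.5), "a1": ("a", 0.5),
}
page = [("c1", "a1", "t1"), ("t1", "h1", "e1"), ("h1", "a1", "t1")]
results = two_pass(page, static)
```

In this toy, the first word ("cat") falls below the threshold in pass 1 because nothing has been learned yet, and only clears it in pass 2, after later words have taught the adaptive table the shapes for "t" and "a".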

[...]

Once the text lines have been found, the baselines are fitted more precisely using a quadratic spline.

[...]

The baselines are fitted by partitioning the blobs into groups with a reasonably continuous displacement for the original straight baseline. A quadratic spline is fitted to the most populous partition, (assumed to be the baseline) by a least squares fit.
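As a rough illustration of that baseline step, here is a minimal Python sketch. It simplifies a lot and is not taken from the Tesseract source: it uses a horizontal line at the median y as the "original straight baseline", splits blobs wherever the displacement jumps by more than a made-up `tolerance`, and fits a single quadratic (rather than a full spline) to the largest group by least squares. The function names are hypothetical, and the largest group needs at least three blobs with distinct x values.

```python
from statistics import median


def solve_3x3(a, b):
    """Solve a 3x3 linear system by Gaussian elimination with partial pivoting."""
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]
    for col in range(3):
        pivot = max(range(col, 3), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, 3):
            factor = m[r][col] / m[col][col]
            for c in range(col, 4):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * 3
    for r in (2, 1, 0):
        x[r] = (m[r][3] - sum(m[r][c] * x[c] for c in range(r + 1, 3))) / m[r][r]
    return x


def fit_quadratic(points):
    """Least-squares fit of y = c0 + c1*x + c2*x^2 via the normal equations."""
    s = [sum(x ** k for x, _ in points) for k in range(5)]
    t = [sum(y * x ** k for x, y in points) for k in range(3)]
    a = [[s[i + j] for j in range(3)] for i in range(3)]
    return solve_3x3(a, t)


def fit_baseline(blobs, tolerance=3.0):
    """Fit a quadratic to the most populous group of (x, y) blob bottoms."""
    blobs = sorted(blobs)
    # Assumption: the straight baseline is a horizontal line at the median y.
    straight = median(y for _, y in blobs)
    groups = [[blobs[0]]]
    for prev, cur in zip(blobs, blobs[1:]):
        # Start a new group when the displacement is no longer continuous.
        if abs((cur[1] - straight) - (prev[1] - straight)) <= tolerance:
            groups[-1].append(cur)
        else:
            groups.append([cur])
    largest = max(groups, key=len)  # assumed to be the baseline
    return fit_quadratic(largest), largest


# Ten blobs on a gently curved baseline plus two descender-like outliers.
blobs = [(x, 100 + 0.001 * x * x) for x in range(0, 100, 10)]
blobs += [(100, 120.0), (110, 121.0)]
coeffs, used = fit_baseline(blobs)
```

Here the two outlying blobs end up in their own small group, so only the ten blobs on the curve contribute to the fit.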

The paper does not explicitly state whether it uses a neural network, but given the content I would say it's likely, at least for parts of it.

For more on line-finding, see R. Smith, “A Simple and Efficient Skew Detection Algorithm via Text Row Accumulation”, Proc. of the 3rd Int. Conf. on Document Analysis and Recognition (Vol. 2), IEEE 1995, pp. 1145-1148.

Upvotes: 6
