mnn

Reputation: 2010

tesseract 2.x - using multiple fonts at the same time

I have successfully trained tesseract 2.x to recognize a few specific fonts. However, I can't get tesseract to recognize all of those fonts at the same time, i.e. when the source image contains all of them. Currently, only one set of tesseract data can be put into the tessdata folder (i.e. one set with one trained font).

I know that tesseract 3.x handles multiple fonts correctly. However, I can't upgrade, since there is no decent .NET binding for it that has the same features as the .NET binding for version 2.x.

Also, I would like to avoid doing all the preprocessing and OCR itself several times, for each font.

Upvotes: 1

Views: 1418

Answers (1)

nguyenq

Reputation: 8355

For Tesseract 2.0x, a language data pack can recognize multiple fonts. Did you cluster your training files?
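Clustering here means passing the `.tr` files of every font to `mftraining` and `cntraining` in a single invocation, so that one data pack covers all of them. A minimal sketch of what that might look like, assuming each font already has its `.tr` and `.box` files from the per-font training run. The font base names are hypothetical, and the helper only builds the command lines; it does not run them:

```python
def clustering_commands(font_stems):
    """Build Tesseract 2.x clustering commands covering several fonts at once.

    font_stems are hypothetical base names for which <stem>.tr and
    <stem>.box are assumed to exist from the per-font training runs.
    """
    tr_files = [f"{stem}.tr" for stem in font_stems]
    box_files = [f"{stem}.box" for stem in font_stems]
    return [
        # All fonts go into one mftraining / cntraining call.
        ["mftraining", *tr_files],
        ["cntraining", *tr_files],
        # The character set is extracted from all box files together.
        ["unicharset_extractor", *box_files],
    ]


if __name__ == "__main__":
    for cmd in clustering_commands(["arial", "times", "courier"]):
        print(" ".join(cmd))
```

The resulting output files still need the language prefix before they go into the tessdata folder, as with a single-font pack.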

There are a couple of excellent .NET wrappers for Tesseract 3.01. Check its AddOn page for more info.

Upvotes: 2
