tesseract 2.x - using multiple fonts at the same time

Question

I have successfully trained Tesseract 2.x to recognize a few specific fonts. However, I can't get Tesseract to recognize all of those fonts at the same time, i.e. when the source image contains all of them. Currently, only one set of Tesseract data can be put into the tessdata folder (i.e. one set with one trained font).

I know that Tesseract 3.x handles multiple fonts correctly. However, I can't upgrade, since there is no decent .NET binding for it with the same features as the .NET binding for version 2.x.

Also, I would like to avoid running all the preprocessing and the OCR itself several times, once for each font.

nguyenq · Accepted Answer

For Tesseract 2.0x, a language data pack can recognize multiple fonts. Did you cluster your training files?
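As a rough sketch of what clustering several fonts together could look like: in the Tesseract 2.x training procedure, as far as I recall it, the `.tr` files of all fonts are passed together to `mftraining` and `cntraining`, and all `.box` files to `unicharset_extractor`, so that the resulting data files cover every font. The helper name `clustering_commands` and the font base names below are hypothetical, and the exact command syntax should be checked against the training documentation for your Tesseract version.

```python
from typing import List


def clustering_commands(font_bases: List[str]) -> List[List[str]]:
    """Build the command lines that cluster several fonts into one data set.

    font_bases are hypothetical base names such as "eng.arial"; each is
    assumed to have a matching .tr and .box file from earlier training steps.
    """
    tr_files = [base + ".tr" for base in font_bases]
    box_files = [base + ".box" for base in font_bases]
    return [
        # Shape clustering over the features of ALL fonts at once.
        ["mftraining"] + tr_files,
        # Character normalization prototypes over ALL fonts at once.
        ["cntraining"] + tr_files,
        # One character set covering every font's box file.
        ["unicharset_extractor"] + box_files,
    ]


if __name__ == "__main__":
    # Hypothetical fonts; replace with the base names of your own files.
    for argv in clustering_commands(["eng.arial", "eng.times", "eng.courier"]):
        print(" ".join(argv))
```

The point of the sketch is that each tool runs once with every font's files on the same command line, rather than once per font. The output files then have to be renamed and copied into tessdata as a single language data set.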

There are a couple of excellent .NET wrappers for Tesseract 3.01. Check its AddOn page for more info.
