Reputation: 2295
I want to find out what kind of data is used to train a new language for Tesseract OCR.
Is it individual characters, or do we have to prepare specific sentences?
Could you point me to a source for this information? The wiki page doesn't make it clear to me.
Upvotes: 0
Views: 2061
Reputation: 309
Try this page. It describes the steps they took to get Tesseract to recognize Ancient Greek: http://www.eutypon.gr/eutypon/pdf/e2012-29/e29-a01.pdf
This is general information from the Tesseract team about training Tesseract: https://code.google.com/p/tesseract-ocr/wiki/TrainingTesseract3
Upvotes: 1