Tesseract OCR: how to train on Arabic digits?

Question

I'm fine-tuning Tesseract, starting from ara.traineddata, on a custom dataset of Arabic numbers that I extracted from ID images.

I ran this command:

make training MODEL_NAME=ara_handwritten_digits START_MODEL=ara TESSDATA=../tessdata/ MAX_ITERATIONS=200000 LEARNING_RATE=0.001

My dataset has 10k images, split into image files (.tif) and label files (.gt.txt).

Example: 24609052400134.gt.txt and 24609052400134.tif
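
One way to rule out problems with this layout is a quick sanity check of the ground truth. The sketch below is only an illustration: the function name `check_ground_truth` and the directory passed to it are hypothetical and not part of Tesseract or its training tools. It reports images without labels, labels without images, and which characters occur in the labels (for example, to see whether the digits are Arabic-Indic ٠-٩ or Western 0-9).

```python
import sys
from collections import Counter
from pathlib import Path


def check_ground_truth(data_dir):
    """Return (images missing labels, labels missing images, character counts)."""
    data_dir = Path(data_dir)
    # Strip the suffixes by hand because ".gt.txt" is a double extension.
    tif_stems = {p.name[: -len(".tif")] for p in data_dir.glob("*.tif")}
    gt_stems = {p.name[: -len(".gt.txt")] for p in data_dir.glob("*.gt.txt")}

    chars = Counter()
    for stem in gt_stems:
        label = (data_dir / f"{stem}.gt.txt").read_text(encoding="utf-8").strip()
        chars.update(label)  # counts every character in the label line

    return sorted(tif_stems - gt_stems), sorted(gt_stems - tif_stems), chars


if __name__ == "__main__" and len(sys.argv) > 1:
    # Assumption: the first argument is the folder holding the .tif/.gt.txt pairs.
    missing_labels, missing_images, chars = check_ground_truth(sys.argv[1])
    print("images without labels:", missing_labels)
    print("labels without images:", missing_images)
    for ch, count in chars.most_common():
        print(f"U+{ord(ch):04X} {ch!r}: {count}")
```

If the character list contains anything besides the ten expected digits, or a mix of the two digit styles, the labels may not match what is being compared at test time.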

Training finished with this output in Git Bash:

Finished! Selected model with minimal training error rate (BCER) = 0.007

But when I test the trained model on the dataset, the results are very bad.

Does anyone have any suggestions?
