Reputation: 71
I am using the tessdata_best version of the eng.traineddata file for my use case. I perform further training on the default tessdata_best eng.traineddata and then use the newly generated eng.traineddata file. Tesseract works fine when I test it on PC, but when I test it on an Android device, Tesseract initialization fails.
Sample code for Tesseract initialization:
    [DllImport(TesseractDllName)]
    private static extern IntPtr TessBaseAPICreate();

    [DllImport(TesseractDllName)]
    private static extern int TessBaseAPIInit3(IntPtr handle, string dataPath, string language);

    public bool Init(string lang, string dataPath) {
        // Create the native Tesseract handle.
        tessHandle = TessBaseAPICreate();
        if (tessHandle == IntPtr.Zero) {
            Debug.LogError("tessHandle equals IntPtr.Zero, initialization failed..!");
            return false;
        }
        // A non-zero return value means the engine could not be initialized
        // with the given data path and language.
        if (TessBaseAPIInit3(tessHandle, dataPath, lang) != 0) {
            Close();
            Debug.LogError("Initialization failed, TessBaseAPIInit3()!=0");
            return false;
        }
        return true;
    }
It fails at the check "if (TessBaseAPIInit3 (tessHandle, dataPath, lang) != 0)".
There is a solution for this problem, described in this link: a_compatible_traineddata_file_version
But I need to do some further training of Tesseract, for which only the tessdata_best version of the traineddata files can be used: tesseract_best_repo
So, how can we use a tessdata_best traineddata file on an Android device without issues?
Alternatively, if the above isn't possible, can we somehow train Tesseract with a traineddata file that isn't a tessdata_best version? Currently I get the error "eng.lstm component is not present" while running
training/combine_tessdata -e tessdata/best/eng.traineddata /tesstutorial/trainplusminus/eng.lstm
[from tesseract_docs]
Also, if I ignore this error and run Tesseract on Android anyway, the app crashes.
Thanks...
Upvotes: 1
Views: 2313
Reputation: 63
You have to check which version of Tesseract you are using. The Tesseract library on Android might not be compatible.
Try loading the Tesseract engine in LSTM-only mode and check whether it works, like below:
using (var TessarEngine = new TesseractEngine(datapath, "eng", EngineMode.LstmOnly));
Otherwise, try loading the Tesseract engine in legacy (Tesseract-only) mode and check whether it works, like below:
using (var TessarEngine = new TesseractEngine(datapath, "eng", EngineMode.TesseractOnly));
Or, if you are not sure, use both engine modes:
using (var TessarEngine = new TesseractEngine(datapath, "eng", EngineMode.TesseractAndLstm));
This should work on both Android and PC, provided you have set the correct datapath for the tessdata folder.
Upvotes: 0
Reputation: 71
The failure "TessBaseAPIInit3 (tessHandle, dataPath, lang) != 0" can arise if the traineddata file is not compatible with the Tesseract version in use. In my case, the eng.traineddata file supported only LSTM (Tesseract version 4.x). Since the Tesseract DLL for PC was Tesseract version 4, it worked on PC, but my Android libraries were Tesseract version 3.x, so it didn't run there.
So, either get a Tesseract version 4.x library for Android, or use a traineddata file that supports legacy Tesseract version 3.x.
In the tesseract_4.x git repo, there are 3 different types of traineddata files available: tessdata, tessdata_best, and tessdata_fast.
Of these, "tessdata" is both legacy and LSTM compatible, meaning it supports both Tesseract 3 and Tesseract 4. The other two support only Tesseract 4. The traineddata files available in the Tesseract 3 branch are compatible only with Tesseract 3.
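To confirm which Tesseract version the native library on a given platform actually is, one option is to ask the library itself before calling TessBaseAPIInit3. The following is only a minimal sketch: TessVersion() is part of the Tesseract C API, but the class name, the SupportsLstm helper, and the library name "tesseract" are assumptions made for illustration and should be adapted to your own wrapper.

```csharp
using System;
using System.Runtime.InteropServices;

public static class TesseractVersionCheck
{
    // Assumption: replace with the native library name your wrapper already uses.
    private const string TesseractDllName = "tesseract";

    // TessVersion() is declared in Tesseract's C API and returns a const char*.
    [DllImport(TesseractDllName)]
    private static extern IntPtr TessVersion();

    // Reads the version string reported by the loaded native library.
    public static string NativeVersion()
    {
        return Marshal.PtrToStringAnsi(TessVersion());
    }

    // Hypothetical helper: treats a major version of 4 or higher as LSTM capable.
    public static bool SupportsLstm(string version)
    {
        if (string.IsNullOrEmpty(version))
        {
            return false;
        }
        // Tolerate an optional leading "v" and take the part before the first dot.
        string major = version.Trim().TrimStart('v', 'V').Split('.')[0];
        int value;
        return int.TryParse(major, out value) && value >= 4;
    }
}
```

Logging TesseractVersionCheck.NativeVersion() on both PC and Android should show whether the two builds differ. If SupportsLstm returns false on the device, an LSTM-only traineddata file such as one from tessdata_best is not expected to load there.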
Upvotes: 1