Sandip Paul

Reputation: 1

Error in tesseract_engine_internal, Unable to find training data

I am using Windows 10. When I run library(tesseract), I get the warning message: Unable to find English training data.

I have downloaded "eng.traineddata" from https://github.com/tesseract-ocr/tessdata

When I try to run

eng <- tesseract("eng")

It displays an error:

Error in tesseract_engine_internal(datapath, language, configs, opt_names,  : 
  Unable to find training data for: eng. Please consult manual for: ?tesseract_download

Upvotes: 0

Views: 739

Answers (2)

bert-erboul Clement

Reputation: 3

With R 4.1, I had to create the folder "C:\Program Files (x86)\Tesseract-OCR" and copy into it the eng.traineddata file downloaded from https://github.com/tesseract-ocr/tessdata_best/blob/main/eng.traineddata.
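
As a rough sketch of the same idea: tesseract() also takes a datapath argument, so the engine can be pointed directly at whichever folder holds eng.traineddata. The folder path below is just the one from this answer, and has_traineddata is a hypothetical helper name used for illustration, not part of the package.

```r
# Hypothetical helper: TRUE if <lang>.traineddata exists in `dir`.
has_traineddata <- function(dir, lang = "eng") {
  file.exists(file.path(dir, paste0(lang, ".traineddata")))
}

# Assumed location; replace with the folder you copied the file into.
tessdata_dir <- "C:/Program Files (x86)/Tesseract-OCR"

if (!has_traineddata(tessdata_dir)) {
  message("eng.traineddata not found in ", tessdata_dir)
} else if (requireNamespace("tesseract", quietly = TRUE)) {
  # datapath tells the engine which folder to search for eng.traineddata
  eng <- tesseract::tesseract(language = "eng", datapath = tessdata_dir)
}
```

Checking for the file first gives a clearer message than the engine error when the path is wrong.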

Upvotes: 0

nguyenq

Reputation: 8355

You've probably used a legacy, incompatible traineddata file. You need the data from either tessdata_fast or tessdata_best instead.

https://github.com/tesseract-ocr
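
One way to avoid picking the wrong file by hand is to let the R package fetch it, as the error message itself suggests with ?tesseract_download. The snippet below is a minimal sketch, not a definitive fix: load_engine is a hypothetical wrapper name, and I am assuming the package's default download source is compatible with your installed engine (check ?tesseract_download for your version).

```r
# Hypothetical wrapper: try to build the engine, and if the training data
# is missing, download it with the package's own helper and retry once.
load_engine <- function(lang = "eng") {
  tryCatch(
    tesseract::tesseract(lang),
    error = function(e) {
      # The engine could not load <lang>.traineddata, so fetch a copy
      tesseract::tesseract_download(lang)
      tesseract::tesseract(lang)
    }
  )
}

# Usage (needs the tesseract package and internet access):
# eng <- load_engine("eng")
```

If the download succeeds but the error persists, the file is probably landing in a folder the engine does not search, in which case passing datapath to tesseract() explicitly is worth trying.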

Upvotes: 1
