How to integrate tesseract-ocr with Tika?

Question

I need to integrate tesseract-ocr with Tika so that scanned images saved as PDF can be converted to text.

Tika already has a TesseractOCRParser available.

But I cannot find any documented way to invoke it.

When I try to build Tika with the tesseract-ocr path set, I get the following test failure:

Results:

Failed tests:
  testNoConfig(org.apache.tika.parser.ocr.TesseractOCRConfigTest): Invalid default tesseractPath value expected:<[]> but was:<[/home/serendio/tesseract-ocr/]>

Tests run: 569, Failures: 1, Errors: 0, Skipped: 7

Can anyone help me out?

Or is there another way to resolve this problem?
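
For reference, here is a minimal sketch of the kind of fallback I have in mind: calling the tesseract executable directly from Java instead of going through Tika. It is not Tika's API. The class name TesseractSketch and its methods are hypothetical, and it assumes the executable is named "tesseract", lives in the directory I tried to configure, and follows the usual "tesseract <image> <outputbase> -l <lang>" command form. It also assumes the input is already an image file, so a scanned PDF would first have to be rendered to page images.

```java
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper, not part of Tika: shells out to the tesseract CLI.
public class TesseractSketch {

    // Builds "<tesseractDir>/tesseract <image> <outputBase> -l <lang>".
    // An empty tesseractDir means "resolve tesseract from the PATH".
    static List<String> buildCommand(String tesseractDir, String image,
                                     String outputBase, String lang) {
        String exe = tesseractDir.isEmpty()
                ? "tesseract"
                : Paths.get(tesseractDir, "tesseract").toString();
        List<String> cmd = new ArrayList<>();
        cmd.add(exe);
        cmd.add(image);
        cmd.add(outputBase);
        cmd.add("-l");
        cmd.add(lang);
        return cmd;
    }

    // Runs tesseract on one image and returns the recognized text.
    // tesseract writes its result to "<outputBase>.txt".
    static String ocr(String tesseractDir, Path image, String lang)
            throws IOException, InterruptedException {
        Path outputBase = Files.createTempFile("tika-ocr", "");
        Path textFile = Paths.get(outputBase.toString() + ".txt");
        try {
            Process process = new ProcessBuilder(
                    buildCommand(tesseractDir, image.toString(),
                                 outputBase.toString(), lang))
                    .inheritIO()
                    .start();
            int exit = process.waitFor();
            if (exit != 0) {
                throw new IOException("tesseract exited with code " + exit);
            }
            return new String(Files.readAllBytes(textFile), StandardCharsets.UTF_8);
        } finally {
            // Remove the temporary files whether or not OCR succeeded.
            Files.deleteIfExists(textFile);
            Files.deleteIfExists(outputBase);
        }
    }

    public static void main(String[] args) throws Exception {
        if (args.length < 2) {
            System.err.println("usage: TesseractSketch <tesseractDir> <image> [lang]");
            return;
        }
        String lang = args.length > 2 ? args[2] : "eng";
        System.out.println(ocr(args[0], Paths.get(args[1]), lang));
    }
}
```

With my setup this would be run as "java TesseractSketch /home/serendio/tesseract-ocr/ page.png eng", where page.png is a placeholder for one rendered page. I would still prefer to get the built-in parser working if that is possible.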

