Reputation: 31
When I run this in the terminal, it works perfectly:
tesseract 1.jpg outPutFileHere -l fra
But I'm trying to make it work with Tika:
import tika
import sys
from tika import parser
from tika import detector
TextImage = "1.jpg"  # the same image passed to tesseract above
tikedDocument = parser.from_file(TextImage)
With the same image, I get no results from Tika :(
Do you have any idea what's going on?
Thank you
Upvotes: 3
Views: 3284
Reputation:
You need to provide a header called "X-Tika-OCRLanguage", for example:
headers = {
    "X-Tika-OCRLanguage": "eng+nor"
}
parsed = parser.from_file(path, headers=headers)
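Applied to the French image from the question, this could look roughly like the sketch below. The helper names ocr_language_headers and parse_image are made up for illustration and are not part of tika. It also assumes the Tika server can find a Tesseract install with the requested language data, which the working "tesseract ... -l fra" call suggests is the case on the same machine.

```python
def ocr_language_headers(*languages):
    """Build the header dict that tells Tika which Tesseract language(s) to use.

    Hypothetical helper: several languages are joined with "+",
    mirroring the "eng+nor" form used in the answer.
    """
    if not languages:
        raise ValueError("at least one language code is required")
    return {"X-Tika-OCRLanguage": "+".join(languages)}


def parse_image(path, *languages):
    """Send an image to Tika with the OCR language header and return the text.

    Hypothetical wrapper around parser.from_file. The import is done lazily
    so that the header helper above can be used without tika installed.
    """
    from tika import parser

    parsed = parser.from_file(path, headers=ocr_language_headers(*languages))
    # from_file returns a dict; the extracted text is under "content"
    # and may be None when nothing was recognised.
    return parsed.get("content")
```

For the image in the question, the call would be parse_image("1.jpg", "fra"). If the result is still empty, it may be worth checking that the Java process running Tika can actually see the tesseract binary on its PATH.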
Upvotes: 3