Extracting text from an image with OCR

I'm using Pytesseract, and it works when I use English, but when I switch to Russian I get this error:

"TypeError: 'str' does not support the buffer interface"

I've tried other languages and they don't work either.

Here is my code:

from PIL import Image
from pytesseract import image_to_string

# Works with the default English data; the TypeError appears once lang="rus" is passed.
k = image_to_string(Image.open("ff.jpg"), lang="rus")
print(image_to_string(Image.open("picture.jpg"), lang="rus"))

Can someone help me solve this problem?

Upvotes: 2

Views: 1545

Answers (2)

SVJ

Reputation: 1

Put the training data file (rus.traineddata) for the required language in the tessdata folder of your Tesseract installation.
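To confirm the file is where Tesseract will look for it, a small check like the following may help. It is only a sketch: the helper name `has_traineddata` is made up for illustration, and the fallback path is an assumed example location, not necessarily where your installation keeps its data. What TESSDATA_PREFIX should point at also differs between Tesseract versions (the tessdata folder itself in newer releases, its parent in older ones), so adjust the path to match your setup.

```python
import os
from pathlib import Path


def has_traineddata(lang, tessdata_dir):
    """Return True if every language in a '+'-joined spec (e.g. "rus+eng")
    has a matching <code>.traineddata file in tessdata_dir.

    Hypothetical helper for illustration; not part of pytesseract.
    """
    tessdata = Path(tessdata_dir)
    return all(
        (tessdata / f"{code}.traineddata").is_file()
        for code in lang.split("+")
    )


if __name__ == "__main__":
    # Assumption: TESSDATA_PREFIX, if set, names the tessdata folder directly.
    # The fallback below is just an example path; replace it with your own.
    tessdata_dir = os.environ.get(
        "TESSDATA_PREFIX", "/usr/share/tesseract-ocr/tessdata"
    )
    print("rus data found:", has_traineddata("rus", tessdata_dir))
```

If this reports that the data is missing, copying rus.traineddata into that folder and rerunning the original script is the next thing to try.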

Upvotes: 0

Ajin A K

Reputation: 11

You need the Tesseract training data for the specific language. Copy the language data file onto your system. For reference, see https://github.com/tesseract-ocr/langdata

Upvotes: 1
