Waimate Wihongi

Reputation: 21

Is there a Python module that reads a PDF and converts it to text?

I mean a PDF that is a scanned image or something like that. Is there a module that converts it to text, or some other way to do it?

Edit: By the way, this isn't meant to be a duplicate. I want to know whether I can get text out of a scanned image, not a regular PDF.

Upvotes: 0

Views: 95

Answers (2)

ish

Reputation: 11

A Python wrapper for Tesseract OCR is available: https://pypi.python.org/pypi/tesserocr
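
A rough sketch of how this could look for a scanned PDF. tesserocr works on images, so the pages have to be rendered to images first. Here I assume the separate pdf2image package (which needs Poppler installed) for that step; it is not part of tesserocr. The function names `join_pages` and `ocr_scanned_pdf` and the 300 DPI default are my own choices for illustration.

```python
import sys


def join_pages(page_texts):
    """Join per-page OCR output with blank lines, dropping empty pages."""
    return "\n\n".join(t.strip() for t in page_texts if t.strip())


def ocr_scanned_pdf(pdf_path, dpi=300):
    """Render each PDF page to an image and run Tesseract OCR on it."""
    # Imported here so join_pages can be used without the OCR stack installed.
    # Assumption: pdf2image (plus Poppler) handles the PDF-to-image step.
    from pdf2image import convert_from_path
    import tesserocr

    pages = convert_from_path(pdf_path, dpi=dpi)  # list of PIL images
    return join_pages(tesserocr.image_to_text(page) for page in pages)


if __name__ == "__main__" and len(sys.argv) > 1:
    print(ocr_scanned_pdf(sys.argv[1]))
```

Saved under a name of your choosing, it can be run with the path to the scanned PDF as the only argument. If you already have the scan as an image file rather than a PDF, `tesserocr.file_to_text(path)` should be enough on its own.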

Upvotes: 1

Levi Porter

Reputation: 19

Try PDFMiner; it might suit what you need.

http://www.unixuser.org/~euske/python/pdfminer/index.html
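
A minimal sketch, assuming the pdfminer.six fork (installed with `pip install pdfminer.six`) rather than the original package linked above. PDFMiner reads the text layer of a PDF and does not do OCR, so for a pure scan it will most likely return little or nothing. The `looks_scanned` helper and its 20-character threshold are hypothetical, just a quick way to tell whether OCR is needed.

```python
import sys


def looks_scanned(extracted_text, min_chars=20):
    """Guess that a PDF is a scan if almost no text could be extracted."""
    return len(extracted_text.strip()) < min_chars


def extract_text_layer(pdf_path):
    """Return whatever embedded text the PDF contains (no OCR involved)."""
    # Assumption: pdfminer.six is installed; imported lazily on purpose.
    from pdfminer.high_level import extract_text

    return extract_text(pdf_path)


if __name__ == "__main__" and len(sys.argv) > 1:
    text = extract_text_layer(sys.argv[1])
    if looks_scanned(text):
        print("Almost no embedded text found; this PDF probably needs OCR.")
    else:
        print(text)
```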

Upvotes: 0
