grc

Reputation: 85

read PDF file as text using Python

Error

Traceback (most recent call last):
  File "C:/Users/XXX/pdf_to_text.py", line 7, in <module>
    import slate
  File "C:\Python27\lib\site-packages\slate\__init__.py", line 48, in <module>
    from slate import PDF
  File "C:\Python27\lib\site-packages\slate\slate.py", line 3, in <module>
    from pdfminer.pdfparser import PDFParser, PDFDocument
ImportError: cannot import name PDFDocument

Code:

import slate
# Raw string so the backslashes in the Windows path are taken literally
with open(r'C:\Users\XXX\XXX.pdf', 'rb') as f:
    pdf_text = slate.PDF(f)
print pdf_text

Can someone advise on how to solve this error?

I would like to read the text content of a .pdf file using Python.

Upvotes: 1

Views: 987

Answers (1)

shad0w_wa1k3r

Reputation: 13372

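You need to install the correct pdfminer version. The traceback shows that slate runs "from pdfminer.pdfparser import PDFParser, PDFDocument", and the pdfminer you have installed does not define PDFDocument in pdfminer.pdfparser, which is why you get the ImportError. Check which pdfminer version slate depends on and install that one.

You can check the currently installed version by running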

pip list
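
To confirm the diagnosis from Python itself, here is a minimal sketch that reports which module, if any, actually defines a given name. The helper locate_name is hypothetical (it is not part of slate or pdfminer), and the second candidate, pdfminer.pdfdocument, is only an assumption about where other pdfminer releases may keep the class.

```python
import importlib


def locate_name(name, module_names):
    """Return the first module in module_names that defines `name`, else None."""
    for module_name in module_names:
        try:
            module = importlib.import_module(module_name)
        except ImportError:
            # The module is not present in this installation; try the next one.
            continue
        if hasattr(module, name):
            return module_name
    return None


if __name__ == "__main__":
    # "pdfminer.pdfdocument" is an assumed alternative location, not confirmed here.
    candidates = ["pdfminer.pdfparser", "pdfminer.pdfdocument"]
    print(locate_name("PDFDocument", candidates))
```

If this prints something other than pdfminer.pdfparser, the installed pdfminer does not match what slate imports, and installing the version slate expects should be the fix.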

Upvotes: 1
