Reputation: 3432
I downloaded pdfminer. The command-line tools work perfectly, but I want to convert multiple PDF documents at once, so I am trying to use pdfminer as a library. I found this on Stack Overflow, but I can't get it to work:
from pdfminer.pdfinterp import PDFResourceManager, process_pdf
from pdfminer.converter import TextConverter
from pdfminer.layout import LAParams
from cStringIO import StringIO

def convert_pdf(path):
    rsrcmgr = PDFResourceManager()
    retstr = StringIO()
    codec = 'utf-8'
    laparams = LAParams()
    device = TextConverter(rsrcmgr, retstr, codec=codec, laparams=laparams)
    fp = file(path, 'rb')
    process_pdf(rsrcmgr, device, fp)
    fp.close()
    device.close()
    str = retstr.getvalue()
    retstr.close()
    print str

convert_pdf("/Users/gorkemyurtseven/Desktop/casino.pdf")
When I run it, I get:
Traceback (most recent call last):
  File "pdfminer.py", line 1, in <module>
    from pdfminer.pdfinterp import PDFResourceManager, process_pdf
  File "/Users/gorkemyurtseven/Desktop/pdfminer.py", line 1, in <module>
    from pdfminer.pdfinterp import PDFResourceManager, process_pdf
ImportError: No module named pdfinterp
Upvotes: 1
Views: 2969
Reputation: 1690
As explained in this post, the problem is that your file is named pdfminer.py. Rename it, and delete the __pycache__/ directory and the pdfminer.pyc file that were created:
$ rm -r __pycache__/ pdfminer.pyc
$ mv pdfminer.py mypdfminer.py
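If you want to check a folder for this kind of name clash before renaming anything, something like the following could help. It is only a rough sketch: the helper name find_local_copies is made up for illustration, and it only looks for .py/.pyc files and __pycache__ entries, not for a directory with the same name.

```python
import os

def find_local_copies(module_name, directory="."):
    """List files in `directory` that share their name with `module_name`.

    Hypothetical helper, not part of pdfminer.
    """
    found = []
    for entry in sorted(os.listdir(directory)):
        stem, ext = os.path.splitext(entry)
        # A pdfminer.py (or a legacy pdfminer.pyc) next to your script is
        # found before the installed package of the same name.
        if stem == module_name and ext in (".py", ".pyc"):
            found.append(os.path.join(directory, entry))
    # Python 3 stores compiled files in __pycache__/ with names such as
    # pdfminer.cpython-311.pyc; these are stale leftovers once the source
    # file has been renamed, and are safe to delete.
    cache_dir = os.path.join(directory, "__pycache__")
    if os.path.isdir(cache_dir):
        for entry in sorted(os.listdir(cache_dir)):
            if entry.startswith(module_name + ".") and entry.endswith(".pyc"):
                found.append(os.path.join(cache_dir, entry))
    return found
```

Calling find_local_copies("pdfminer") from the folder that holds your script should return an empty list once the rename and cleanup above are done.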
Upvotes: 0
Reputation: 12448
It seems that your script is named pdfminer, the same as the module, so the import finds your own script instead of the installed package and then cannot find pdfinterp inside it.
Another possible reason is that the pdfminer module is installed incorrectly, or that it is not the correct version for your Python distribution.
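To tell the two cases apart, you can ask Python where it would load the name from. The sketch below assumes Python 3 (importlib.util.find_spec does not exist in Python 2, where `import pdfminer; print pdfminer.__file__` gives similar information), and describe_module is just an illustrative helper name.

```python
import importlib.util

def describe_module(name):
    """Report where a top-level module name resolves, without importing it.

    Returns None if Python cannot find the name at all.
    """
    spec = importlib.util.find_spec(name)
    if spec is None:
        return None
    return {
        # Path of the file Python would load for this name.
        "origin": spec.origin,
        # Only a package has submodules; a plain script such as a local
        # pdfminer.py does not, so "pdfminer.pdfinterp" cannot be found in it.
        "is_package": spec.submodule_search_locations is not None,
    }

if __name__ == "__main__":
    print(describe_module("pdfminer"))
```

If origin points into your own folder and is_package is False, the script is shadowing the library. If the result is None, the library is probably not installed for the interpreter you are running.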
Upvotes: 2