Reputation: 13
Here is my code.
from pdfminer3.layout import LAParams
from pdfminer3.pdfpage import PDFPage
from pdfminer3.pdfinterp import PDFResourceManager
from pdfminer3.pdfinterp import PDFPageInterpreter
from pdfminer3.converter import PDFPageAggregator
from pdfminer3.converter import TextConverter
import io
resource_manager = PDFResourceManager()
fake_file_handle = io.StringIO()
converter = TextConverter(resource_manager, fake_file_handle,laparams=LAParams())
page_interpreter = PDFPageInterpreter(resource_manager, converter)
with open('/storage/emulated/0/Download/Rick-Riordan-The-Tyrants-Tomb-The-Trials-of-Apollo-4.pdf','rb') as fh:
    for page in PDFPage.get_pages(fh,
                                  caching=True,
                                  check_extractable=True):
        page_interpreter.process_page(page)
    text = fake_file_handle.getvalue()
# close open handles
converter.close()
fake_file_handle.close()
print(text)
I just want to see the image from the PDF. I am using Python 3. I would rather not import another module, so please try to give a solution that uses pdfminer3.
Upvotes: 0
Views: 235
Reputation: 146
You need to tell TextConverter where to save the images from the PDF. Try passing it an ImageWriter:
from io import StringIO
from pdfminer3.image import ImageWriter
pdfResourceManager = PDFResourceManager()
convertedText = StringIO()
layoutParams = LAParams()
# ImageWriter saves the images it is handed into this directory
imageWriter = ImageWriter('pathToSaveImages/..')
converter = TextConverter(pdfResourceManager, convertedText, codec='utf-8', laparams=layoutParams, imagewriter=imageWriter)
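Combining the loop from the question with the ImageWriter suggestion, a minimal sketch could look like the one below. It assumes that pdfminer3 ships ImageWriter in pdfminer3.image and that TextConverter accepts an imagewriter argument, as in pdfminer.six, from which pdfminer3 is forked. The function names extract_text_and_images and list_saved_images are made up for illustration. Note that ImageWriter only writes image files to disk; to actually look at them you would open the saved files in an image viewer.

```python
import io
import os


def extract_text_and_images(pdf_path, image_dir):
    """Return the text of pdf_path and save its embedded images into image_dir.

    Hypothetical helper; assumes the pdfminer3 package is installed.
    """
    # Imported inside the function so that list_saved_images below stays
    # usable on its own even where pdfminer3 is not available.
    from pdfminer3.converter import TextConverter
    from pdfminer3.image import ImageWriter  # assumed module path
    from pdfminer3.layout import LAParams
    from pdfminer3.pdfinterp import PDFPageInterpreter, PDFResourceManager
    from pdfminer3.pdfpage import PDFPage

    resource_manager = PDFResourceManager()
    output = io.StringIO()
    converter = TextConverter(
        resource_manager,
        output,
        laparams=LAParams(),
        imagewriter=ImageWriter(image_dir),  # where image files get written
    )
    interpreter = PDFPageInterpreter(resource_manager, converter)
    try:
        with open(pdf_path, "rb") as fh:
            for page in PDFPage.get_pages(fh, caching=True, check_extractable=True):
                interpreter.process_page(page)
        return output.getvalue()
    finally:
        # Close the handles even if a page fails to parse.
        converter.close()
        output.close()


def list_saved_images(image_dir):
    """Return the sorted names of the regular files found in image_dir."""
    return sorted(
        name
        for name in os.listdir(image_dir)
        if os.path.isfile(os.path.join(image_dir, name))
    )
```

After calling extract_text_and_images with your PDF path and an output directory of your choice, list_saved_images on that directory should show which image files, if any, were written. Only the standard-library io and os modules are used besides pdfminer3.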
Upvotes: 1