andril gowdhaman

Reputation: 13

How do I view images from a PDF in pdfminer3?

Here is my code.

from pdfminer3.layout import LAParams
from pdfminer3.pdfpage import PDFPage
from pdfminer3.pdfinterp import PDFResourceManager
from pdfminer3.pdfinterp import PDFPageInterpreter
from pdfminer3.converter import PDFPageAggregator
from pdfminer3.converter import TextConverter
import io

resource_manager = PDFResourceManager()
fake_file_handle = io.StringIO()
converter = TextConverter(resource_manager, fake_file_handle, laparams=LAParams())
page_interpreter = PDFPageInterpreter(resource_manager, converter)

with open('/storage/emulated/0/Download/Rick-Riordan-The-Tyrants-Tomb-The-Trials-of-Apollo-4.pdf','rb') as fh:

    for page in PDFPage.get_pages(fh,
                                  caching=True,
                                  check_extractable=True):
        page_interpreter.process_page(page)

    text = fake_file_handle.getvalue()

# close open handles
converter.close()
fake_file_handle.close()

print(text)

I just want to view the images. I'm using Python 3. I'd rather not import another module, so please give a solution that uses only pdfminer3.

Upvotes: 0

Views: 235

Answers (1)

adambogdan1993

Reputation: 146

You need to tell TextConverter where to save the images from the PDF. Try adding an ImageWriter:

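from io import StringIO
from pdfminer3.image import ImageWriter

pdfResourceManager = PDFResourceManager()
convertedText = StringIO()
layoutParams = LAParams()
# Images found while processing pages are written to this directory.
imageWriter = ImageWriter('pathToSaveImages/..')
converter = TextConverter(pdfResourceManager, convertedText, codec='utf-8', laparams=layoutParams, imagewriter=imageWriter)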
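To tie this to the code in the question, here is a minimal sketch of the whole flow. It assumes that pdfminer3 exposes ImageWriter in pdfminer3.image and that ImageWriter saves each image into the directory it is given; the function names extract_text_and_images and list_extracted_images are hypothetical helpers made up for illustration, not part of pdfminer3.

```python
import io
import os


def list_extracted_images(out_dir):
    """Return the sorted names of the regular files found in out_dir."""
    if not os.path.isdir(out_dir):
        return []
    return sorted(
        name for name in os.listdir(out_dir)
        if os.path.isfile(os.path.join(out_dir, name))
    )


def extract_text_and_images(pdf_path, out_dir):
    """Process every page; return (text, names of files now in out_dir)."""
    # Imported inside the function so the listing helper above stays
    # usable on its own, even where pdfminer3 is not installed.
    from pdfminer3.converter import TextConverter
    from pdfminer3.image import ImageWriter
    from pdfminer3.layout import LAParams
    from pdfminer3.pdfinterp import PDFPageInterpreter, PDFResourceManager
    from pdfminer3.pdfpage import PDFPage

    resource_manager = PDFResourceManager()
    text_buffer = io.StringIO()
    # Assumption: ImageWriter writes each image it receives into out_dir.
    image_writer = ImageWriter(out_dir)
    converter = TextConverter(resource_manager, text_buffer,
                              laparams=LAParams(), imagewriter=image_writer)
    interpreter = PDFPageInterpreter(resource_manager, converter)

    with open(pdf_path, 'rb') as fh:
        for page in PDFPage.get_pages(fh, caching=True,
                                      check_extractable=True):
            interpreter.process_page(page)

    text = text_buffer.getvalue()
    # Close open handles, as in the question's code.
    converter.close()
    text_buffer.close()
    return text, list_extracted_images(out_dir)
```

For example, text, images = extract_text_and_images('book.pdf', 'images') (both names are placeholders) should leave any extracted images in the images directory and return their file names. As far as I know pdfminer3 only saves the images to disk and does not display them, so you would open the saved files in an image viewer.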

Upvotes: 1
