simbr

Reputation: 75

Why does pyinstaller create a huge exe file when using it with pdfminer modules in Python 3.6?

I am trying to create an exe file with pyinstaller (in Python 3.6) from a script that uses pdfminer modules, but the resulting exe file is huge, around 240 MB. In contrast, when I use pyinstaller in Python 2.7 with a similar script, the resulting exe file is only around 10 MB.

What am I doing wrong?

I create the exe file with the following command: pyinstaller.exe --onefile {filename/path}

My code:

import io  # needed for io.StringIO below

from pdfminer.pdfinterp import PDFResourceManager, PDFPageInterpreter
from pdfminer.converter import TextConverter
from pdfminer.layout import LAParams
from pdfminer.pdfpage import PDFPage

...

def convert_pdf_to_txt(path):
    rsrcmgr = PDFResourceManager()
    retstr = io.StringIO()
    #codec = 'windows-1250'
    laparams = LAParams()
    device = TextConverter(rsrcmgr, retstr, laparams=laparams)
    fp = open(path, 'rb')
    # reply = s.get(path, stream=True, verify= False)
    # fp = StringIO()
    # fp.write(reply.content)
    # fp.seek(0)
    interpreter = PDFPageInterpreter(rsrcmgr, device)
    password = ""
    maxpages = 0
    caching = True
    pagenos = set()

    for page in PDFPage.get_pages(fp, pagenos, maxpages=maxpages, password=password, caching=caching, check_extractable=True):
        interpreter.process_page(page)

    text = retstr.getvalue()

    fp.close()
    device.close()
    retstr.close()
    return text

...

Upvotes: 0

Views: 145

Answers (1)

simbr

Reputation: 75

I found the following solution:

  1. Created a virtual environment.
  2. Installed only the required packages via pip (i.e. pdfminer and pyinstaller).

The result is now an exe file that is only 7.8 MB.
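
For anyone who wants to script the two steps above, here is a minimal sketch using the standard-library venv module. The helper names (venv_executable, build_commands, build_exe) and the environment directory name are my own placeholders, not part of pyinstaller or pdfminer. It assumes the packages install cleanly from PyPI under the names given in the answer and that pyinstaller can be run as `python -m PyInstaller`.

```python
import subprocess
import sys
import venv
from pathlib import Path


def venv_executable(env_dir, name):
    """Return the path of an executable inside the virtual environment."""
    # Windows venvs use "Scripts" and ".exe"; other platforms use "bin".
    bin_dir = "Scripts" if sys.platform == "win32" else "bin"
    suffix = ".exe" if sys.platform == "win32" else ""
    return Path(env_dir) / bin_dir / (name + suffix)


def build_commands(env_dir, script):
    """List the commands to run, in order, without executing them."""
    python = str(venv_executable(env_dir, "python"))
    return [
        # Step 2 of the answer: install only what the script needs.
        [python, "-m", "pip", "install", "pdfminer", "pyinstaller"],
        # Same build as in the question, run with the environment's interpreter.
        [python, "-m", "PyInstaller", "--onefile", str(script)],
    ]


def build_exe(env_dir, script):
    """Create a fresh virtual environment and build the exe inside it."""
    venv.create(env_dir, with_pip=True)  # step 1 of the answer
    for cmd in build_commands(env_dir, script):
        subprocess.run(cmd, check=True)


# Hypothetical usage (creates the environment and downloads packages):
# build_exe("build_env", "myscript.py")
```

Running the build with the environment's own interpreter is the point: pyinstaller then only sees the packages installed in that environment.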

Upvotes: 1
