wittn

Reputation: 310

'OSError: [Errno 22] Invalid argument' when merging files using PyPDF2

I am trying to merge some PDF files using Python, more specifically PyPDF2. That should be easy enough, but for some reason I get an error which I simply do not understand.

While searching for a solution, I found that other people had this problem as well. However, none of the posted solutions worked for me.

My code for merging the files:

from PyPDF2 import PdfFileMerger

def merge(self, work_files, destination_file):
    pdf_merger = PdfFileMerger()

    for pdf in work_files:
        pdf_merger.append(pdf)
        #also tried the following with the same results:
        #with open(pdf, 'wb') as fileobj:
            #merger.append(fileobj)

    with open(destination_file, 'wb') as fileobj:
        pdf_merger.write(fileobj)

Here, work_files is a list of paths to the PDFs to merge, and destination_file is the path where the merged PDF is supposed to be saved.

This produces the following error (full stack trace provided as requested):

Traceback (most recent call last):
  File "main.py", line 9, in <module>
    merger.append(fileobj)
  File "/home/user/.local/lib/python3.8/site-packages/PyPDF2/merger.py", line 203, in append
    self.merge(len(self.pages), fileobj, bookmark, pages, import_bookmarks)
  File "/home/user/.local/lib/python3.8/site-packages/PyPDF2/merger.py", line 133, in merge
    pdfr = PdfFileReader(fileobj, strict=self.strict)
  File "/home/user/.local/lib/python3.8/site-packages/PyPDF2/pdf.py", line 1084, in __init__
    self.read(stream)
  File "/home/user/.local/lib/python3.8/site-packages/PyPDF2/pdf.py", line 1689, in read
    stream.seek(-1, 2)
OSError: [Errno 22] Invalid argument

I have tried different ways of passing in the paths: relative paths, absolute paths, and parsing them into another file. None of them worked.

I am using Python 3.8 on Ubuntu 20.04.

I would be thankful for any help.

Upvotes: 0

Views: 1265

Answers (2)

wittn

Reputation: 310

After trying out other ways to merge the PDF files, I embarrassingly realized that my test files were actually damaged and could not even be read by the system. Problem solved.
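
For anyone who lands here with the same traceback: the failing call is stream.seek(-1, 2), which seeks to one byte before the end of the file. On a zero-byte file that position is negative, and the operating system rejects it with errno 22 (EINVAL). The sketch below assumes that an empty file is the kind of damage involved, which the answer above does not confirm. Note that opening an input PDF with mode 'wb' truncates it to zero bytes, so that alone can leave a file in this state. find_unreadable_pdfs is a hypothetical helper, not part of PyPDF2, and its header check is only a cheap sanity test, not a full validation.

```python
import errno
import os
import tempfile


def seek_error_on_empty_file():
    """Return the errno raised by seek(-1, 2) on a zero-byte file."""
    with tempfile.TemporaryDirectory() as tmp_dir:
        path = os.path.join(tmp_dir, "empty.pdf")
        open(path, "wb").close()  # create a zero-byte file
        with open(path, "rb") as stream:
            try:
                stream.seek(-1, 2)  # same call as in the traceback
            except OSError as exc:
                return exc.errno
    return None


def find_unreadable_pdfs(paths):
    """Return the paths that are missing, empty, or lack a %PDF- header.

    Hypothetical pre-check to run before handing paths to a merger.
    """
    bad = []
    for path in paths:
        if not os.path.isfile(path) or os.path.getsize(path) == 0:
            bad.append(path)
            continue
        with open(path, "rb") as stream:
            if stream.read(5) != b"%PDF-":
                bad.append(path)
    return bad
```

Running find_unreadable_pdfs(work_files) before the merge loop would point at the offending inputs instead of failing deep inside the library.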

Upvotes: 0

CtrlMj

Reputation: 119

If work_files is only a list of paths, you are passing plain strings to the append method, one at a time. According to the PdfFileMerger documentation, you need to pass file objects to the append method.

fileobj – A File Object or an object that supports the standard read and seek methods similar to a File Object. Could also be a string representing a path to a PDF file

Sorry, I overlooked the last part of the documentation, but have you actually tried passing file objects? Also, maybe try collecting your file names with glob.glob('*.pdf'). If you could post the full stack trace of the error, that would be helpful too.
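
A minimal sketch of both suggestions combined, in case it helps. merge_pdfs is a hypothetical helper; it assumes only that the merger object exposes append(fileobj) and write(fileobj), as PdfFileMerger does in the question. The inputs are opened in 'rb' mode (opening them with 'wb' would truncate them) and are kept open until after write, on the assumption that the merger may still read from them at that point.

```python
import glob
import os


def merge_pdfs(merger, source_dir, destination_file):
    """Append every *.pdf in source_dir to merger, then write the result.

    Returns the list of input paths in the order they were appended.
    """
    paths = sorted(glob.glob(os.path.join(source_dir, "*.pdf")))
    handles = []
    try:
        for path in paths:
            handle = open(path, "rb")  # read mode: never 'wb' for inputs
            handles.append(handle)
            merger.append(handle)
        with open(destination_file, "wb") as out:
            merger.write(out)
    finally:
        # Close the inputs only after the merged file has been written.
        for handle in handles:
            handle.close()
    return paths
```

With PyPDF2 installed, a call would look like merge_pdfs(PdfFileMerger(), "input", "merged.pdf"), where "input" is a placeholder directory name. Keep destination_file outside source_dir, otherwise an earlier output could be picked up as an input on the next run.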

Upvotes: 0
