AlliDeacon

Reputation: 1495

PyPDF2 IOError: [Errno 22] Invalid argument on PyPdfFileReader Python 2.7

Goal: open a file, encrypt it, and write the encrypted file.
I am trying to use the PyPDF2 module to accomplish this. I have verified that "input" is a file object. I have researched this error, and it seems to translate to "file not found". I believe it is linked somehow to the file or the file path, but I am unsure how to debug or troubleshoot it. I am getting the following error:

Traceback (most recent call last):
  File "CommissionSecurity.py", line 52, in <module>
    inputStream = PyPDF2.PdfFileReader(input)
  File "build\bdist.win-amd64\egg\PyPDF2\pdf.py", line 1065, in __init__
  File "build\bdist.win-amd64\egg\PyPDF2\pdf.py", line 1660, in read
IOError: [Errno 22] Invalid argument

Below is the relevant code. I'm not sure how to correct this issue because I'm not really sure what the issue is. Any guidance is appreciated.

for ID in FileDict:
        if ID in EmailDict : 
            path = "C:\\Apps\\CorVu\\DATA\\Reports\\AlliD\\Monthly Commission Reports\\Output\\pdcom1\\"
            #print os.listdir(path)
            file = os.path.join(path + FileDict[ID])

            with open(file, 'rb') as input:
                print type(input)
                inputStream = PyPDF2.PdfFileReader(input)
                output = PyPDF2.PdfFileWriter()
                output = inputStream.encrypt(EmailDict[ID][1])
            with open(file, 'wb') as outputStream:
                output.write(outputStream)  
        else : continue

Upvotes: 1

Views: 6908

Answers (4)

Ahmed I. Elsayed

Reputation: 2110

Late, but you may be opening an invalid PDF file, or an empty file that is merely named x.pdf and that you assume is a PDF file.

Upvotes: -1

ZeevhY Org.

Reputation: 365

This error is raised when the PDF file is empty. My PDF file was empty, and that is why I got the error. I first filled my PDF file with some data and then read it using PyPDF2.PdfFileReader.

That solved my problem.
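A quick way to rule this cause in or out is to inspect the file before handing it to PyPDF2. The sketch below uses only the standard library; the helper name looks_like_pdf is hypothetical, and the check is deliberately strict (it expects the %PDF- marker at the very start of the file). Errno 22 would be consistent with the library seeking backwards from the end of a zero-length stream, although that is an assumption about PyPDF2 internals rather than something confirmed here.

```python
import os


def looks_like_pdf(path):
    """Return True if the file is non-empty and starts with the PDF marker."""
    # A zero-byte file cannot be a PDF, whatever its extension says.
    if os.path.getsize(path) == 0:
        return False
    # Well-formed PDF files begin with the bytes "%PDF-".
    with open(path, 'rb') as handle:
        return handle.read(5) == b'%PDF-'
```

Calling this on each path before PdfFileReader makes an empty or mislabeled file show up as a plain False instead of an IOError deep inside the library.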

Upvotes: 0

AlliDeacon

Reputation: 1495

Using open(file, 'rb') was causing the issue because PdfFileReader() does that automagically. I just removed the with statement, and that corrected the problem.

with open(file, 'rb') as input:
    inputStream = PyPDF2.PdfFileReader(input)

Upvotes: 1

user707650

Reputation:

I think your problem might be caused by the fact that you use the same filename to both open and write to the file, opening it twice:

with open(file, 'rb') as input :
    with open(file, 'wb') as outputStream :

The w mode truncates the file, so the second line truncates the input.
I'm not sure what your intention is, because you can't really read from the beginning of the file and overwrite it at the same time. Even if you try to write to the end of the file, you'll have to position the file pointer somewhere. So create an extra output file that has a different name; you can always rename that output file to your input file after both files are closed, thus overwriting your input file.
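The write-to-another-name-then-rename idea can be sketched with the standard library alone. The helper rewrite_via_temp and its process callback are hypothetical names, not part of PyPDF2. The sketch assumes Python 3.3 or later for os.replace; on Python 2.7 under Windows, os.rename refuses to overwrite an existing file, so the original would have to be removed first.

```python
import os
import tempfile


def rewrite_via_temp(path, process):
    """Run process(source, target), then replace `path` with the new file."""
    # Create the temporary file next to the original so the final rename
    # stays on the same drive.
    directory = os.path.dirname(os.path.abspath(path))
    fd, tmp_path = tempfile.mkstemp(dir=directory, suffix='.tmp')
    try:
        with os.fdopen(fd, 'wb') as target:
            with open(path, 'rb') as source:
                process(source, target)
        # Both files are closed here, so the original can be overwritten.
        os.replace(tmp_path, path)
    except Exception:
        # Leave the original untouched and discard the partial output.
        os.remove(tmp_path)
        raise
```

Because the original is only replaced after process has finished, a failure halfway through leaves the input file intact instead of truncated.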

Or you could first read the complete file into memory, then write to it:

with open(file, 'rb') as input:
    inputStream = PyPDF2.PdfFileReader(input)
    output = PyPDF2.PdfFileWriter()
    output = input.encrypt(EmailDict[ID][1])
with open(file, 'wb') as outputStream:
    output.write(outputStream)  

Notes:

  • you assign inputStream, but never use it
  • you assign PdfFileWriter() to output, and then assign something else to output in the next line. Hence, you never used the result from the first output = line.

Please check carefully what you're doing, because it feels like there are several other problems with your code.
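For reference, here is one way the two notes above could be addressed, as a sketch rather than a definitive fix. It assumes a legacy PyPDF2 release (before 3.0) that still provides PdfFileReader, PdfFileWriter, getNumPages, getPage and addPage. In that API, encrypt belongs to the writer, so the pages are copied from the reader into the writer before encrypting. The function name encrypt_pdf_in_place is hypothetical.

```python
import io


def encrypt_pdf_in_place(path, password):
    """Encrypt the PDF at `path` with `password`, overwriting the original."""
    # Imported here on the assumption that a legacy PyPDF2 (< 3.0) is
    # installed; newer releases renamed these classes.
    from PyPDF2 import PdfFileReader, PdfFileWriter

    # Read the whole file into memory so the reader no longer depends on
    # the on-disk file once it is reopened for writing.
    with open(path, 'rb') as source:
        buffered = io.BytesIO(source.read())

    reader = PdfFileReader(buffered)
    writer = PdfFileWriter()
    # Copy every page across, then encrypt on the writer.
    for index in range(reader.getNumPages()):
        writer.addPage(reader.getPage(index))
    writer.encrypt(password)

    # Render to memory first; the original is only truncated after the
    # encrypted document has been produced successfully.
    result = io.BytesIO()
    writer.write(result)
    with open(path, 'wb') as target:
        target.write(result.getvalue())
```

In the loop from the question, this would be called as encrypt_pdf_in_place(file, EmailDict[ID][1]).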


Alternatively, here are some other tips that may help:

The documentation suggests that you can also pass the filename as the first argument to PdfFileReader:

stream – A File object or an object that supports the standard read and seek methods similar to a File object. Could also be a string representing a path to a PDF file.

So try:

inputStream = PyPDF2.PdfFileReader(file)

You can also try to set the strict argument to False:

strict (bool) – Determines whether user should be warned of all problems and also causes some correctable problems to be fatal. Defaults to True.

For example:

inputStream = PyPDF2.PdfFileReader(file, strict=False)

Upvotes: 3
