Reputation: 3
I need to remove the first page of multiple PDF files in a directory. I am a beginner-level Python user and have cobbled together the following code from bits and pieces of other code. However, I cannot get it to work. Does anything jump out at anyone?
from PyPDF2 import PdfFileWriter, PdfFileReader
import os, sys
directory_name = 'emma'
for filename in directory_name:
    print 'name: %s' % filename
    output_file = PdfFileWriter()
    input_handle = open(filename+'.pdf', 'rb')
    input_file = PdfFileReader(input_handle)
    num_pages = input_file.getNumPages()
    print "document has %s pages \n" % num_pages
    for i in xrange(1, num_pages):
        output_file.addPage(input_file.getPage(i))
        print 'added page %s \n' % i
    output_stream = file(filename+'-stripped.pdf','wb')
    output_file.write(output_stream)
    output_stream.close()
    input_handle.close()
Error message:
input_handle = open(filename+'.pdf', 'rb')
IOError: [Errno 2] No such file or directory: 'a.pdf'
Upvotes: 0
Views: 2518
Reputation: 1
I adapted the code to Python 3, just in case somebody wants to use it:
from PyPDF2 import PdfWriter, PdfReader
import os, glob

os.chdir(r'data_path')  # directory containing the PDFs
filename_lst = glob.glob('*.pdf')
print('number of files: {}'.format(len(filename_lst)))
save_path = '...'  # if you want to save the results somewhere else
for filename in filename_lst:
    print('name: {}'.format(filename))
    output_file = PdfWriter()
    with open(filename, 'rb') as input_handle:
        input_file = PdfReader(input_handle)
        num_pages = len(input_file.pages)
        print("document has {} pages \n".format(num_pages))
        # start at index 1 so the first page (index 0) is skipped
        for i in range(1, num_pages):
            output_file.add_page(input_file.pages[i])
        # os.path.join adds the path separator that save_path + filename omits
        with open(os.path.join(save_path, filename), 'wb') as output_stream:
            output_file.write(output_stream)
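One thing to watch: if the output path ends up identical to the input path (for example, when save_path is left empty), opening it with 'wb' can truncate the file while it is still being read. A minimal sketch of one way around that, reusing the '-stripped.pdf' suffix from the question. The helper name stripped_path and its save_dir parameter are hypothetical, not part of PyPDF2:

```python
import os

def stripped_path(filename, save_dir=''):
    """Build an output path such as 'report-stripped.pdf' for 'report.pdf'.

    Hypothetical helper: save_dir is an optional target directory.
    """
    # Split 'report.pdf' into ('report', '.pdf'), dropping any leading folders.
    base, ext = os.path.splitext(os.path.basename(filename))
    return os.path.join(save_dir, base + '-stripped' + ext)

print(stripped_path('report.pdf'))
```

The result could then be passed to open(..., 'wb') in place of the original filename, so the input file is never overwritten.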
Upvotes: 0
Reputation: 4392
Your code iterates over "emma" and tries to open e.pdf, m.pdf (twice), a.pdf. Your error on a.pdf means the first two actually exist, which is interesting enough on its own.
But to your problem: iterating over a string yields its individual characters, so you need to use os.listdir or glob to actually get the filenames within the directory.
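To illustrate the difference, here is a small sketch (Python 3 syntax) that assumes the PDFs sit directly inside a folder named 'emma', as in the question. The helper name list_pdfs is hypothetical:

```python
import glob
import os

directory_name = 'emma'

# Iterating over a string yields its characters, not directory entries.
print(list(directory_name))

def list_pdfs(directory):
    """Return the sorted paths of the .pdf files directly inside `directory`."""
    return sorted(glob.glob(os.path.join(directory, '*.pdf')))

# Each item is now a real path such as 'emma/some_file.pdf'.
for path in list_pdfs(directory_name):
    print('name: %s' % path)
```

Because glob already returns names ending in .pdf, the loop body should open each path as-is rather than appending '.pdf' again.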
Upvotes: 1