Reputation: 43
I am trying to use PdfFileMerger() from PyPDF2 to merge PDF files (see the code below).
from PyPDF2 import PdfFileMerger, PdfFileReader
[...]
merger = PdfFileMerger()
if (some condition):
    merger.append(PdfFileReader(open(filename1, 'rb')))
    merger.append(PdfFileReader(open(filename2, 'rb')))
if (test for non-zero file size):
    merger.write("output.pdf")
However, my merge commands are subject to certain conditions and it could turn out that no merged pdf file is generated. I would like to know how to determine the page count after performing merges using PdfFileMerger(). If nothing else, I would like to know if the number of pages is non-zero. Maintaining a counter to do this would be cumbersome because I am performing the merges across several functions and would prefer a more elegant solution.
Upvotes: 2
Views: 3005
Reputation: 28
I'm in more or less the same situation as you, so I'll explain my solution. I'm not opening the PDFs from disk with PdfFileReader(open('filename.pdf', 'rb')); instead I pass the PDFs' contents as an array of bytes (pdfs_content_array) for the merge. Then I prepare the merger and my output. I don't want to save the generated file locally, so I use BytesIO to hold the merged content.
calc_page_sum is needed to compare the page counts. The most important part is calc_page_sum += PdfFileReader(bytes_content).getNumPages(): I open the bytes content with PdfFileReader and get its number of pages. Then I append it to the merger with merger.append(bytes_content), which in my code is submitted through self.application.cpupool.
Finally I write the merge into my bytes output and compare its page count with calc_page_sum. That's it.
from PyPDF2 import PdfFileMerger, PdfFileReader
import io
[...]
def merge_the_pdfs(self, pdfs_content_array, output_file):
    merger = PdfFileMerger()
    output = io.BytesIO()
    calc_page_sum = 0
    for content in pdfs_content_array:
        bytes_content = io.BytesIO(content)
        # Count the pages of each input before appending it
        calc_page_sum += PdfFileReader(bytes_content).getNumPages()
        yield self.application.cpupool.submit(merger.append, bytes_content)
    merger.write(output)
    # The merged page count must equal the sum of the input page counts
    if calc_page_sum != PdfFileReader(output).getNumPages():
        return None
    return output.getvalue()
Hope this will help!
2nd Version:
from PyPDF2 import PdfFileMerger, PdfFileReader
import io
import sys
filename1 = 'test.pdf'
filename2 = 'test1.pdf'
merger = PdfFileMerger()
output = io.BytesIO()
calc_page_sum = 0
filesarray = [filename1, filename2]
for singlefile in filesarray:
    # PdfFileReader accepts a path directly; no 'rb' argument is needed
    calc_page_sum += PdfFileReader(singlefile).getNumPages()
    merger.append(PdfFileReader(singlefile))
merger.write(output)
print(calc_page_sum)
print(PdfFileReader(output).getNumPages())
if calc_page_sum == PdfFileReader(output).getNumPages():
    print("It worked")
    # Save the already-merged bytes instead of calling merger.write() a second time
    with open("merging-test.pdf", "wb") as merged_file:
        merged_file.write(output.getvalue())
    sys.exit()
print("Didn't work")
sys.exit()
Upvotes: 1
Reputation: 31
Maybe you can try using
if len(merger.pages) > 0:
for your condition
if (test for non-zero file size):
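As a rough sketch of how that check could be wrapped up so it can be called from any of your functions without keeping a separate counter: the helper names merged_page_count and write_if_not_empty are made up for illustration, and the code only assumes that the merger object has a pages list and a write() method, as PdfFileMerger does.

```python
def merged_page_count(merger):
    """Return how many pages have been appended to the merger so far.

    Assumption: the merger exposes a `pages` list, as PyPDF2's
    PdfFileMerger does.
    """
    return len(merger.pages)


def write_if_not_empty(merger, path):
    """Write the merged PDF to `path` only if at least one page was appended.

    Returns True when a file was written and False when the merger was empty.
    (Hypothetical helper, not part of PyPDF2.)
    """
    if merged_page_count(merger) == 0:
        return False
    merger.write(path)
    return True
```

With this, the last two lines of the question's snippet would become write_if_not_empty(merger, "output.pdf"), and the return value tells you whether anything was actually merged.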
Upvotes: 3