Human

Reputation: 866

How can I merge pdf files together and take only the first page from each file?

I am using qpdf to merge all PDF files in a directory, and I would like to merge only the first page of each input file. According to the qpdf documentation on page selection, this should be possible. I have tried a couple of variants without luck:

qpdf --empty --pages *.pdf 1-1 -- "output.pdf"
qpdf --empty --pages *.pdf 1 -- "output.pdf"

What can I do?

Upvotes: 3

Views: 7181

Answers (3)

user1767316

Reputation: 3651

For more flexibility, list each input file explicitly, followed by its own page selection:

qpdf --empty --pages pdf1.pdf 1 pdf2.pdf 1 pdf3.pdf 1 -- "output.pdf"

You can change the order of the merged pages by reordering the input files:

qpdf --empty --pages pdf3.pdf 1 pdf1.pdf 1 pdf2.pdf 1 -- "output.pdf"

If needed, you can also repeat an input file and use more advanced page ranges:

qpdf --empty --pages pdf3.pdf 1 pdf1.pdf 2-5,7 pdf3.pdf 6,8-10 pdf2.pdf 11 -- "output.pdf"
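If the list of files and ranges is generated by a script, the same command line can be assembled from (file, page range) pairs. This is only a minimal sketch: `build_pages_command` is a hypothetical helper name, not part of qpdf, and it only builds the argument list without running anything.

```python
def build_pages_command(selections, output):
    """Build a qpdf argument list from (input file, page range) pairs.

    Hypothetical helper for illustration; it does not invoke qpdf.
    """
    cmd = ["qpdf", "--empty", "--pages"]
    for filename, page_range in selections:
        # Each input file is immediately followed by its own page range.
        cmd += [filename, page_range]
    # "--" terminates the page selection; the output file comes last.
    cmd += ["--", output]
    return cmd


cmd = build_pages_command(
    [
        ("pdf3.pdf", "1"),
        ("pdf1.pdf", "2-5,7"),
        ("pdf3.pdf", "6,8-10"),  # the same input may appear more than once
        ("pdf2.pdf", "11"),
    ],
    "output.pdf",
)
print(" ".join(cmd))
```

The resulting list can be passed to `subprocess.run(cmd, check=True)`, assuming qpdf is installed and on the PATH.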

Upvotes: 0

Jordan

Reputation: 1

The following Python script worked well for me.

import os
from PyPDF2 import PdfWriter, PdfReader

OUTPUT = "CombinedFirstPages.pdf"

# Collect all PDF files in the current directory, skipping the output file
# so that re-running the script does not merge a previous result.
pdf_files = [
    name for name in os.listdir(".")
    if name.lower().endswith(".pdf") and name != OUTPUT
]
pdf_files.sort(key=str.lower)

# Take the first page from each PDF
pdf_writer = PdfWriter()
for filename in pdf_files:
    reader = PdfReader(filename)
    pdf_writer.add_page(reader.pages[0])

# Write the combined first pages to a single file
with open(OUTPUT, "wb") as fp:
    pdf_writer.write(fp)

Upvotes: -1

Human

Reputation: 866

As explained in this qpdf issue, the shell expands *.pdf in the command qpdf --empty --pages *.pdf 1 -- "output.pdf" before qpdf ever sees it. In other words, it replaces *.pdf with the list of PDF files in the current directory. Assuming you have the following PDF files in the current directory:

  • file1.pdf
  • file2.pdf
  • file3.pdf

the command becomes:

qpdf --empty --pages file1.pdf file2.pdf file3.pdf 1 -- "output.pdf"

so the page selector is applied only to the last PDF. On a Mac or Linux, you can script the command to add a 1 after each PDF filename, which takes the first page of each file and puts them all together like so:

qpdf --empty --pages $(for i in *.pdf; do echo $i 1; done) -- output.pdf
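Because the shell splits the loop's output on whitespace, the one-liner above breaks on filenames that contain spaces. As a rough alternative, the same argument list can be built in Python and passed to qpdf without going through the shell. This is a sketch under assumptions: `first_page_command` and `merge_first_pages` are hypothetical names, qpdf is assumed to be on the PATH, and the output file is skipped in case it already exists in the directory.

```python
import glob
import subprocess


def first_page_command(pattern="*.pdf", output="output.pdf"):
    """Return the qpdf argument list that takes page 1 of every matching file."""
    # Sort for a stable order and skip a previously written output file.
    files = sorted(f for f in glob.glob(pattern) if f != output)
    cmd = ["qpdf", "--empty", "--pages"]
    for f in files:
        # Each filename stays a single argument, even if it contains spaces.
        cmd += [f, "1"]
    return cmd + ["--", output]


def merge_first_pages(pattern="*.pdf", output="output.pdf"):
    """Run qpdf with the generated arguments (assumes qpdf is installed)."""
    subprocess.run(first_page_command(pattern, output), check=True)
```

Calling `merge_first_pages()` from the directory that holds the PDFs should then produce output.pdf with one page per input file.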

Upvotes: 6
