ceiling cat

Reputation: 5701

How can I programmatically remove a page from a PDF document on a Mac?

I have a bunch of PDF documents and all of them contain a title page that I want to remove.

Is there a way to programmatically remove them?

Most of the PDF utilities I found can only combine documents, not remove pages. In the print dialog I can choose to print from page 2 to the end and then print to a file, but I can't find any way to access this function programmatically.

Upvotes: 8

Views: 4809

Answers (3)

Kurt Pfeifle

Reputation: 90203

Just for the record: you can also use Ghostscript:

gs \
  -o removed-page-1-from-input.pdf \
  -sDEVICE=pdfwrite \
  -dFirstPage=2 \
  /path/to/input.pdf

However, pdftk is the better tool for that job (and was already recommended to you).

Also, this Ghostscript command line could change some of the properties of your input.pdf because it essentially re-distills it. That may or may not be what you want. To control individual aspects of this behavior (or to suppress some of them), a more complicated command line with more parameters is required.

pdftk will re-use the original PDF objects for each page as-is.


Update

Ghostscript has the additional parameter of -dLastPage too. Together with -dFirstPage this allows for the extraction of page ranges.
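
As a minimal sketch of such a range extraction, the two parameters could be wrapped in a small shell function. The function name extract_range and the GS override variable are hypothetical names introduced here for illustration; only -o, -sDEVICE=pdfwrite, -dFirstPage and -dLastPage come from the answer itself.

```shell
#!/bin/sh
# Extract pages FIRST..LAST of a PDF with Ghostscript's pdfwrite device.
# extract_range and the GS variable are hypothetical names for this sketch;
# GS only lets you point at a different gs binary (default: gs on PATH).
extract_range() {
  first=$1
  last=$2
  in_pdf=$3
  out_pdf=$4

  # Refuse an inverted range instead of handing it to Ghostscript.
  if [ "$first" -gt "$last" ]; then
    echo "first page ($first) is after last page ($last)" >&2
    return 1
  fi

  "${GS:-gs}" -o "$out_pdf" -sDEVICE=pdfwrite \
    -dFirstPage="$first" -dLastPage="$last" "$in_pdf"
}
```

For example, extract_range 3 5 in.pdf pages-3-to-5.pdf would write pages 3 through 5 to a new file. As noted above, the output is re-distilled, so some properties may differ from the input.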

The newest versions sport a new parameter, -sPageList. This could be used like this:

-sPageList="1, 5-10, 12-"

to extract pages 1, 5-10 and 12-last from the input document. However, I've not (yet) personally tested this new feature and I'm not sure how reliably it works.
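
For the original question (dropping a single page), the page list could be generated rather than typed by hand. The helper below is only a sketch: pagelist_without is a hypothetical name, it writes the list without spaces, and it does not handle removing the last page, since the page count is not known to the script. Like the parameter itself, it has not been tested against a particular Ghostscript version.

```shell
#!/bin/sh
# Print a -sPageList value that selects every page except page N.
# pagelist_without is a hypothetical helper name for this sketch.
pagelist_without() {
  n=$1

  # Accept only plain positive integers.
  case $n in
    ''|*[!0-9]*) echo "page number must be a positive integer" >&2; return 1 ;;
  esac
  if [ "$n" -lt 1 ]; then
    echo "page number must be a positive integer" >&2
    return 1
  fi

  if [ "$n" -eq 1 ]; then
    printf '2-\n'          # drop the title page: everything from page 2 on
  elif [ "$n" -eq 2 ]; then
    printf '1,3-\n'        # page 1, then page 3 to the end
  else
    printf '1-%d,%d-\n' "$((n - 1))" "$((n + 1))"
  fi
}

# Possible use (requires a Ghostscript version that supports -sPageList):
#   gs -o out.pdf -sDEVICE=pdfwrite -sPageList="$(pagelist_without 8)" in.pdf
```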

For older versions of Ghostscript (as well as the most recent one), it should work to feed the same input PDF multiple times, with different parameters, to the same Ghostscript call in order to extract non-contiguous page selections from a document. You could even combine pages from different documents this way:

gs \
  -o selected-pages.pdf \
  -sDEVICE=pdfwrite     \
  -dFirstPage=2         \
  -dLastPage=2          \
   in1.pdf              \
                        \
  -dFirstPage=10        \
  -dLastPage=15         \
   in1.pdf              \
                        \
  -dFirstPage=1         \
  -dLastPage=1          \
   in1.pdf              \
                        \
  -dFirstPage=4         \
  -dLastPage=6          \
   in2.pdf

Caveats: Combining pages from different documents which use non-embedded fonts or identical font names but different encodings and/or different subsets (with identical fontname-prefixes) may lead to a faulty PDF in the result.

Upvotes: 7

JWWalker

Reputation: 22707

-[PDFDocument removePageAtIndex:] looks like it should make this possible. By the way, Preview.app can remove a page, but it isn't scriptable, so that's not a programmatic solution.

Upvotes: 0

Benoit

Reputation: 79165

Use pdftk.

To remove page 8:

pdftk in.pdf cat 1-7 9-end output out.pdf
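
Since the question is about a whole batch of files that all start with a title page, the same cat operation could be run in a loop with the range 2-end. This is a sketch under assumptions: strip_title_pages, the two directory arguments and the PDFTK override variable are hypothetical names, and pdftk is assumed to be installed and on the PATH.

```shell
#!/bin/sh
# Remove the first page from every PDF in SRC_DIR, writing results to OUT_DIR.
# strip_title_pages and the PDFTK variable are hypothetical names for this
# sketch; PDFTK only lets you point at a different pdftk binary.
strip_title_pages() {
  src_dir=$1
  out_dir=$2

  mkdir -p "$out_dir" || return 1

  for f in "$src_dir"/*.pdf; do
    # If the glob matched nothing, the literal pattern is left over: skip it.
    [ -e "$f" ] || continue

    # Keep pages 2 through the last page; report failures but keep going.
    "${PDFTK:-pdftk}" "$f" cat 2-end output "$out_dir/$(basename "$f")" \
      || echo "failed: $f" >&2
  done
}
```

Calling strip_title_pages ./pdfs ./stripped leaves the originals untouched. A document that consists of the title page only has no page 2, so expect pdftk to report an error for such a file.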

Upvotes: 11
