tonix

Reputation: 6959

When converting the first page of a PDF into an image using Ghostscript, I sometimes get "extra" space. Why?

I am building a simple script which converts the first page of a PDF into an image using Ghostscript. Here is the command I use:

gs -q -o output.png -sDEVICE=pngalpha -dLastPage=1 input.pdf 

This works beautifully with some PDFs, e.g. if I convert the first page of a PDF that looks like this:

[image: first page of a PDF that converts correctly]

I get this first page as an image without any problems.

But I have noticed that this is not the case for the first pages of some other PDFs, like the following:

[image: first page of a PDF that does not convert correctly]

With the same gs command, the resulting .png image looks like this:

[image: resulting .png with extra white space on the left]

The problem is that I get this extra white space on the left side of the image when I convert that page. Why does Ghostscript do this? Where does that extra blank space come from?

Upvotes: 5

Views: 2302

Answers (1)

Kurt Pfeifle

Reputation: 90315

Most likely, your PDFs do not use identical values for /MediaBox and /CropBox. For details about these page-related technical terms, see the illustration of the PDF page boxes in the German Wikipedia.

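In other words: the /CropBox values (if given) for a PDF page determine which (smaller) part of the overall page content (which lies inside the /MediaBox) the PDF viewer should make visible to the user (or send to the printer). By default, Ghostscript renders the full /MediaBox, so anything outside the /CropBox shows up as extra space in the image.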

Solution

To find out the box values for all pages of your book(s) (here, up to the first 1000 pages), run this command:

pdfinfo -f 1 -l 1000 -box my.pdf

To see these values just for the first page, run:

pdfinfo -l 1 -box my.pdf

For Ghostscript to give the results you want, add -dUseCropBox to your command line:

gs -q -o output.png -sDEVICE=pngalpha -dLastPage=1 -dUseCropBox input.pdf 
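
If you want the script to decide by itself, the two steps above can be combined. What follows is only a rough sketch, not a tested tool: the function names boxes_differ and convert_first_page are made up for this example, and the parsing assumes that pdfinfo prints each page's MediaBox line before its CropBox line, with the four coordinates as the last four fields.

```shell
#!/bin/sh
# Sketch: add -dUseCropBox only when the first page's boxes differ.
# boxes_differ and convert_first_page are hypothetical helper names.

# Reads `pdfinfo -box` output on stdin.
# Succeeds (exit 0) if any CropBox differs from the MediaBox printed before it.
# Assumption: the four box coordinates are the last four fields of each line.
boxes_differ() {
    awk '
        /MediaBox:/ { media = $(NF-3) " " $(NF-2) " " $(NF-1) " " $NF }
        /CropBox:/  { crop  = $(NF-3) " " $(NF-2) " " $(NF-1) " " $NF
                      if (crop != media) differ = 1 }
        END { exit (differ ? 0 : 1) }
    '
}

# Usage: convert_first_page input.pdf output.png
convert_first_page() {
    if pdfinfo -f 1 -l 1 -box "$1" | boxes_differ; then
        # Boxes differ: render only the visible (cropped) area.
        gs -q -o "$2" -sDEVICE=pngalpha -dLastPage=1 -dUseCropBox "$1"
    else
        # Boxes match: the original command is enough.
        gs -q -o "$2" -sDEVICE=pngalpha -dLastPage=1 "$1"
    fi
}
```

After sourcing this, you would call convert_first_page input.pdf output.png. In practice it is probably simpler to always pass -dUseCropBox, since it should make no difference when both boxes are identical; the check is mainly useful if you want to log which PDFs are affected.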

Upvotes: 7
