Reputation: 6959
I am building a simple script which converts the first page of a PDF into an image using Ghostscript. Here is the command I use:
gs -q -o output.png -sDEVICE=pngalpha -dLastPage=1 input.pdf
This works beautifully with some PDFs, e.g. if I convert the first page of a PDF that looks like this:
I actually get this first page as an image and there aren't any problems.
But I have noticed that with some first pages of other PDFs, like the following:
With the same gs command, after the conversion, the .png image looks like this:
The problem is that the converted image has extra white space on the left. Why does Ghostscript do this? Where does that extra blank space come from?
Upvotes: 5
Views: 2302
Reputation: 90315
Most likely, your PDFs do not use identical values for /MediaBox and for /CropBox. For details about these technical terms related to a page, see this illustration from the German Wikipedia:
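In other words: the /CropBox values (if given) for a PDF page determine which (smaller) part of the overall page content (which lies inside the /MediaBox) the PDF viewer should make visible to the user (or send to the printer). By default Ghostscript renders the whole /MediaBox, so any area outside the /CropBox shows up in the image as extra blank space.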
To see the box values for all the pages of your PDF(s), run this command:
pdfinfo -f 1 -l 1000 -box my.pdf
To see these values just for the first page, run
pdfinfo -l 1 -box my.pdf
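To make the comparison easier to read, here is a minimal sketch that filters the pdfinfo output and reports how far the /CropBox is inset from the /MediaBox on each side, in PDF points (1/72 inch). The function name box_margins is hypothetical, and the sketch assumes the box lines look like "MediaBox: x1 y1 x2 y2", optionally prefixed with "Page N".

```shell
# Hypothetical helper: reads `pdfinfo -box` output on stdin and prints, per page,
# the distance between the MediaBox and CropBox edges in PDF points.
box_margins() {
  awk '
    {
      # Locate the label anywhere on the line, so both "MediaBox: ..." and
      # "Page    1 MediaBox: ..." are handled.
      for (i = 1; i <= NF; i++) {
        if ($i == "MediaBox:" || $i == "CropBox:") {
          page = ($1 == "Page") ? $2 + 0 : 1
          for (k = 1; k <= 4; k++) box[page, $i, k] = $(i + k) + 0
          have[page, $i] = 1
          if (page > max) max = page
        }
      }
    }
    END {
      for (p = 1; p <= max; p++) {
        # Skip pages for which one of the two boxes was not reported.
        if (!((p, "MediaBox:") in have) || !((p, "CropBox:") in have)) continue
        printf "page %d: left=%g bottom=%g right=%g top=%g\n", p,
          box[p, "CropBox:", 1] - box[p, "MediaBox:", 1],
          box[p, "CropBox:", 2] - box[p, "MediaBox:", 2],
          box[p, "MediaBox:", 3] - box[p, "CropBox:", 3],
          box[p, "MediaBox:", 4] - box[p, "CropBox:", 4]
      }
    }
  '
}
```

Usage: pdfinfo -f 1 -l 1 -box my.pdf | box_margins. A non-zero "left" value on the first page would match the extra blank strip on the left of the rendered image.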
For Ghostscript to give the results you want, add -dUseCropBox to your command line:
gs -q -o output.png -sDEVICE=pngalpha -dLastPage=1 -dUseCropBox input.pdf
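Since the goal is a small conversion script, the command can be wrapped in a shell function. This is only a sketch: the function name first_page_png and its argument order are made up for illustration, while the gs flags are exactly the ones shown above.

```shell
# Hypothetical wrapper: render the first page of a PDF to a PNG,
# cropped to the page's CropBox.
first_page_png() {
  if [ "$#" -ne 2 ]; then
    echo "usage: first_page_png input.pdf output.png" >&2
    return 2
  fi
  # -dUseCropBox makes Ghostscript render the CropBox area instead of the MediaBox.
  gs -q -o "$2" -sDEVICE=pngalpha -dLastPage=1 -dUseCropBox "$1"
}
```

Usage: first_page_png input.pdf output.png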
Upvotes: 7