Reputation: 1724
I use the pdf-to-image package for Yii2, which is based on the Imagick library, to convert each page of a PDF to an image. I also need to get the width and height, or the format, of a particular PDF page. Is there any way to do that?
Upvotes: 0
Views: 3763
Reputation: 1724
How about this approach?
With Imagick I can easily get an image from a single page of a PDF file:
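$RESOLUTION = 300; // render resolution in pixels per inch
$myurl = 'filename.pdf['.$pagenumber.']'; // page index is zero-based
$image = new Imagick();
$image->setResolution( $RESOLUTION , $RESOLUTION ); // must be set before the PDF is read
$image->readImage($myurl);
$image->setImageFormat( "png" );
$image->writeImage('newfilename.png');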
Now I have an image of the PDF page. I know the resolution (pixels per inch) and I can get the image's width and height in pixels, so simple arithmetic gives the width and height of the PDF page in inches:
$imageWidth = $image->getImageWidth();
$imageHeight = $image->getImageHeight();
$pdfPageWidth = $imageWidth / $RESOLUTION;
$pdfPageHeight = $imageHeight / $RESOLUTION;
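As a rough sketch of that arithmetic, here is a small standalone helper that also converts the result to millimeters. The function name pageSizeFromPixels and the pixel values (2480 x 3508, roughly what an A4 page gives at 300 pixels per inch) are my own illustrative assumptions, not something taken from Imagick or the pdf-to-image package:

```php
<?php
// Hypothetical helper: convert a rendered page's pixel size back to its physical size.
function pageSizeFromPixels(int $widthPx, int $heightPx, float $dpi): array
{
    $widthIn  = $widthPx / $dpi;   // pixels / (pixels per inch) = inches
    $heightIn = $heightPx / $dpi;

    return [
        'width_in'  => $widthIn,
        'height_in' => $heightIn,
        'width_mm'  => $widthIn * 25.4,   // 1 inch = 25.4 mm
        'height_mm' => $heightIn * 25.4,
    ];
}

// Assumed example values: an A4 page rendered at 300 pixels per inch.
$size = pageSizeFromPixels(2480, 3508, 300);
printf(
    "%.2f x %.2f in, %.1f x %.1f mm\n",
    $size['width_in'],
    $size['height_in'],
    $size['width_mm'],
    $size['height_mm']
);
// prints: 8.27 x 11.69 in, 210.0 x 297.0 mm
```

Because the renderer rounds to whole pixels, the result is only accurate to about one pixel, so compare against known paper sizes with a small tolerance.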
Upvotes: 0
Reputation: 99
Imagick is a native PHP extension for creating and modifying images through the ImageMagick API. So it does not retrieve information about the PDF itself, only about the rendered images:
Imagick::getNumberImages — Returns the number of images in the object.
$pdf->getNumberOfPages(); // returns the number of images, which equals the number of pages in the PDF. This method comes from the pdf-to-image package.
A PDF describes the content and appearance of one or more pages. It also contains a definition of the physical size of those pages. That page size definition is not as straightforward as you might think. There can in fact be up to 5 different definitions in a PDF that relate to the size of its pages. These are called the boundary boxes or page boxes.
The MediaBox is used to specify the width and height of the page. For the average user, this probably equals the actual page size.
Each page in a PDF can have different sizes for the various page boxes.
A PDF always has a MediaBox definition. All the other page boxes do not necessarily have to be present in regular PDF files.
The MediaBox is the largest page box in a PDF. The other page boxes can equal the size of the MediaBox but they are not expected to be larger (The latter is explicitly required in the PDF/X-4 requirements). If they are larger, the PDF viewer will use the values of the MediaBox.
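To make the clipping rule concrete, here is a minimal sketch of how a viewer might limit another page box to the MediaBox. The function name clipToMediaBox is hypothetical, the boxes are assumed to be [left, bottom, right, top] arrays in points, and treating a missing box as equal to the MediaBox is a simplification on my part:

```php
<?php
// Hypothetical helper: reduce a page box to its intersection with the MediaBox.
// Boxes are [left, bottom, right, top] in points (assumed layout).
function clipToMediaBox(?array $box, array $mediaBox): array
{
    if ($box === null) {
        // Simplification: a box that is absent falls back to the MediaBox.
        return $mediaBox;
    }

    return [
        max($box[0], $mediaBox[0]), // left edge cannot lie outside the MediaBox
        max($box[1], $mediaBox[1]), // bottom edge
        min($box[2], $mediaBox[2]), // right edge
        min($box[3], $mediaBox[3]), // top edge
    ];
}

$mediaBox = [0.0, 0.0, 595.28, 841.89];       // A4 in points
$tooLarge = [-10.0, -10.0, 605.28, 851.89];   // extends past the MediaBox on every side
$smaller  = [10.0, 10.0, 585.28, 831.89];     // lies inside the MediaBox

$clippedLarge   = clipToMediaBox($tooLarge, $mediaBox);
$clippedSmaller = clipToMediaBox($smaller, $mediaBox);
$fallback       = clipToMediaBox(null, $mediaBox);
```

Here $clippedLarge ends up equal to the MediaBox, while $clippedSmaller is returned unchanged.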
With ImageMagick's identify tool you should be able to retrieve the exact "HiResBoundingBox" value (which corresponds to the MediaBox value in the PDF).
The test document is A4 (210 mm x 297 mm), which is 595.28 pt x 841.89 pt, and has four pages.
The unit of these values is PostScript points (where 72 pt == 1 inch).
$pdf = "1.pdf";
// escapeshellarg() keeps file names with spaces or shell metacharacters safe
$output = shell_exec('identify -format "%[pdf:HiResBoundingBox]" ' . escapeshellarg($pdf));
echo $output;
This prints the following string (one entry per page, concatenated without separators):
595.28x841.89+0+0595.28x841.89+0+0595.28x841.89+0+0595.28x841.89+0+0
With a regular expression you can extract the width (595.28 pt) and height (841.89 pt) of each page and convert them to millimeters.
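A possible sketch of that parsing step follows. The function name parsePageSizes is hypothetical, and the pattern assumes every entry ends in "+0+0" (page boxes starting at the origin), as in the output shown above. Since the entries are concatenated without a separator, a more general pattern would be ambiguous; appending \n to the -format string so that each page lands on its own line would be a more robust alternative.

```php
<?php
// Hypothetical helper (not part of Imagick or pdf-to-image).
// Assumes each entry has the form WIDTHxHEIGHT+0+0, with sizes in points.
function parsePageSizes(string $identifyOutput): array
{
    preg_match_all(
        '/(\d+(?:\.\d+)?)x(\d+(?:\.\d+)?)\+0\+0/',
        $identifyOutput,
        $matches,
        PREG_SET_ORDER
    );

    $pages = [];
    foreach ($matches as $m) {
        $widthPt  = (float) $m[1];
        $heightPt = (float) $m[2];
        $pages[] = [
            'width_pt'  => $widthPt,
            'height_pt' => $heightPt,
            // 72 pt = 1 inch, 1 inch = 25.4 mm
            'width_mm'  => round($widthPt / 72 * 25.4, 1),
            'height_mm' => round($heightPt / 72 * 25.4, 1),
        ];
    }

    return $pages;
}

// The sample output from the four-page A4 test document.
$output = '595.28x841.89+0+0595.28x841.89+0+0595.28x841.89+0+0595.28x841.89+0+0';
$pages = parsePageSizes($output);

foreach ($pages as $i => $page) {
    printf("Page %d: %.1f mm x %.1f mm\n", $i + 1, $page['width_mm'], $page['height_mm']);
}
```

For the sample string this yields four pages of 210.0 mm x 297.0 mm each, which matches A4.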
Upvotes: 1