Brant Messenger

Reputation: 1451

How to extract text from a PDF page using Zend_Pdf

Can anyone help with extracting the text from a page in a PDF?

<?php
$pdf = Zend_Pdf::load('example.pdf');
$page = $pdf->pages[0]; // pages live in the public $pages array (0-based)

I assumed a page method would exist for this, but I could not find anything that lets me extract the contents.

For example, I was hoping for something like: $page->getContents(); $page->toString(); $page->extractText();

...Help!!!! This is driving me crazy!

Upvotes: 2

Views: 4327

Answers (2)

Cal Jacobson

Reputation: 2407

I agree with Andy that this does not appear to be supported. As an alternative, take a look at Shaun Farrell's solution to extracting text from a PDF for use with Zend_Search_Lucene. He uses XPDF, which might also meet your needs.
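
As a rough illustration of the Xpdf route (not necessarily how Shaun Farrell's solution does it), the sketch below shells out to pdftotext, the command-line tool that ships with Xpdf. It assumes pdftotext is installed and on the PATH. The function names buildPdfToTextCommand() and extractPageText() are hypothetical helpers made up for this example, not part of Zend_Pdf or Xpdf. Note that pdftotext numbers pages from 1, whereas Zend_Pdf indexes pages from 0.

```php
<?php
// Hypothetical helper: builds the shell command for Xpdf's pdftotext.
// -f and -l select the first and last page to convert (1-based);
// a trailing "-" tells pdftotext to write the text to stdout.
function buildPdfToTextCommand($pdfPath, $pageNumber)
{
    $page = (int) $pageNumber;
    return sprintf(
        'pdftotext -f %d -l %d %s -',
        $page,
        $page,
        escapeshellarg($pdfPath)
    );
}

// Hypothetical helper: returns the text of one page,
// or null if pdftotext exits with a non-zero status.
function extractPageText($pdfPath, $pageNumber)
{
    $output = array();
    $status = 0;
    exec(buildPdfToTextCommand($pdfPath, $pageNumber), $output, $status);
    return $status === 0 ? implode("\n", $output) : null;
}
```

Usage would look something like $text = extractPageText('example.pdf', 1); for the first page. Scanned PDFs that contain only images will come back empty, since pdftotext does no OCR.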

Upvotes: 2

Andy

Reputation: 17771

From the manual it doesn't appear that this functionality is supported. Also, new text is written using the drawText() function, which appears to write images, not plain "decodable" text.

Upvotes: 0
