sonubreeze

Reputation: 115

How to extract text from a PDF using Ghostscript in PHP

I am trying to extract text from a PDF using the following command, but it isn't working and returns null.

$text = shell_exec(gs -q -sDEVICE=txtwrite -dBATCH -dNOPAUSE -dFirstPage='.(int)$page_number.' -dLastPage='.(int)($page_number+1).' -sOutputFile=textfilename.txt exemple.pdf');

Upvotes: 0

Views: 1208

Answers (1)

miken32

Reputation: 42701

The argument to shell_exec() has to be a string, and yours is missing its opening quote. You also need to escape values before passing them to the command. Finally, set the output file to %stdout: you want the data to go to STDOUT, where PHP can read it, not into a text file.

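// Cast to int and escape each value before it reaches the shell.
$first_page = escapeshellarg((int)$page_number);
$last_page = escapeshellarg((int)$page_number + 1);
// %stdout sends the extracted text to STDOUT so shell_exec() can return it.
$text = shell_exec("gs -q -sDEVICE=txtwrite -dBATCH -dNOPAUSE -dFirstPage=$first_page -dLastPage=$last_page -sOutputFile=%stdout exemple.pdf");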
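
If you need this in more than one place, it could be wrapped up roughly like this. This is only a sketch: the function names build_gs_text_command() and extract_pdf_text() are made up for illustration, and it assumes gs is on the PATH of the user PHP runs as. The file name is escaped too, in case it ever comes from user input.

```php
<?php
// Hypothetical helper: builds the Ghostscript command line for a page range.
function build_gs_text_command(string $pdf_path, int $first_page, int $last_page): string
{
    // %d forces the page numbers to integers; %% yields a literal % for %stdout.
    return sprintf(
        'gs -q -sDEVICE=txtwrite -dBATCH -dNOPAUSE -dFirstPage=%d -dLastPage=%d -sOutputFile=%%stdout %s',
        $first_page,
        $last_page,
        escapeshellarg($pdf_path)
    );
}

// Hypothetical wrapper: returns the extracted text, or null if nothing came back.
function extract_pdf_text(string $pdf_path, int $first_page, int $last_page): ?string
{
    $output = shell_exec(build_gs_text_command($pdf_path, $first_page, $last_page));
    // shell_exec() gives null or false on failure or when there is no output.
    return is_string($output) ? $output : null;
}
```

You would then call it as $text = extract_pdf_text('exemple.pdf', $page_number, $page_number + 1); and check for null before using the result.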

Upvotes: 1
