Reputation: 107347
Is it possible to use grep or another command and/or regex to search for a particular pattern within a PDF file?
Upvotes: 1
Views: 2764
Reputation: 22478
Short: yes (use the flag -a to make grep treat binary files as text; -b only prints byte offsets).
But chances are high that you will not find what you are looking for. PDF files are usually binary, and their text is typically compressed and heavily encoded, sometimes to the point that not even Acrobat Reader can copy sensible text out of them.
Upvotes: 1
Reputation: 40944
If you have the pdftotext utility installed, you can use the following command to search through the text of a PDF file:
pdftotext myfile.pdf - | grep 'pattern'
You have to use some utility (such as pdftotext) to convert the PDF file to text before feeding it into grep (otherwise grep would have a hard time making sense out of the raw PDF file), but any utility that does this should work.
On Ubuntu and Debian, pdftotext is part of the poppler-utils package.
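If you need to search several PDF files at once, the same pipeline can be wrapped in a small shell function. This is only a minimal sketch: the name search_pdfs is made up for illustration, and it assumes pdftotext is on your PATH. It prefixes each match with the file name, much like grep does when given multiple files.

```shell
#!/bin/sh
# search_pdfs PATTERN FILE...
# Hypothetical helper (not part of poppler-utils): converts each PDF to
# text with pdftotext and prints "file:matching line" for every match.
search_pdfs() {
    pattern=$1
    shift
    for f in "$@"; do
        # "-" makes pdftotext write the extracted text to stdout.
        # Conversion errors are hidden so one bad file does not clutter the output.
        pdftotext "$f" - 2>/dev/null |
            grep -e "$pattern" |
            while IFS= read -r line; do
                printf '%s:%s\n' "$f" "$line"
            done
    done
}
```

For example, search_pdfs 'pattern' *.pdf would scan every PDF in the current directory. Files whose text cannot be extracted (scanned images, for instance) simply produce no matches.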
Upvotes: 3