Kasravnd

Reputation: 107347

Search for a word inside a PDF from the Linux terminal without any extra application

Is it possible to use grep, another command, and/or a regex to search for a particular pattern within a PDF file?

Upvotes: 1

Views: 2764

Answers (3)

Jongware

Reputation: 22478

Short: yes (use the -a flag so that grep treats binary files as text; -b only prints byte offsets).

But chances are high that you will not find what you are looking for. PDF files are usually binary, compressed, and heavily encoded, sometimes to the point that not even Acrobat Reader can copy sensible text out of them.
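
A minimal sketch of such a raw search, assuming GNU grep. The helper name raw_pdf_count is hypothetical, and 'pattern' and file.pdf are placeholders. It only counts matching lines, so no binary data is dumped to the terminal:

```shell
# Hypothetical helper: count the lines of a raw file that match a pattern.
# $1 = pattern, $2 = path to the PDF
raw_pdf_count() {
    # -a: process a binary file as if it were text
    # -c: print only the number of matching lines
    grep -a -c -e "$1" -- "$2"
}

# Usage: raw_pdf_count 'pattern' file.pdf
```

A count of 0 does not prove the word is absent; it may simply sit inside a compressed stream.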

Upvotes: 1

R.Sicart

Reputation: 681

Try:

strings file.pdf | grep 'pattern'

Upvotes: 2

Freyja

Reputation: 40944

If you have the pdftotext utility installed, you can use the following command to search through the text of a PDF file:

pdftotext myfile.pdf - | grep 'pattern'

grep has a hard time making sense of a raw PDF file, so you have to convert the file to text before feeding it into grep. pdftotext is one option, but any utility that does this conversion should work.

On Ubuntu and Debian, pdftotext is part of the poppler-utils package.
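
To search several PDFs at once, something like the following sketch may help. It assumes pdftotext is installed and that your grep supports --label (GNU grep does). The function name pdf_grep is hypothetical:

```shell
# Hypothetical helper: search the extracted text of several PDFs.
# $1 = pattern, remaining arguments = PDF files
# On Ubuntu/Debian, pdftotext can be installed with: sudo apt install poppler-utils
pdf_grep() {
    local pattern=$1
    shift
    for f in "$@"; do
        # "-" makes pdftotext write the extracted text to stdout.
        # -H with --label prints the PDF name instead of "(standard input)";
        # -n adds the line number within the extracted text.
        pdftotext "$f" - 2>/dev/null |
            grep -H --label="$f" -n -e "$pattern"
    done
}

# Usage: pdf_grep 'pattern' *.pdf
```

Each match is printed as file:line:text, where the line number refers to the extracted text, not to a page of the PDF.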

Upvotes: 3
