Tatui1969

Reputation: 51

How to search contents of multiple pdf files and return the pdf's file name?

I did a search here and found this one:

find /path -name '*.pdf' -exec pdftotext {} - \; | grep "your query"

However, it returns the blocks of text inside the pdf files that contain "your query". Is there a way to return the file name instead?

Upvotes: 2

Views: 1004

Answers (2)

dirkgently

Reputation: 111130

As suggested by Neil, you can use the -l option. If you need the count of matches too, you can try this:

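find /path -name '*.pdf' -exec sh -c 'pdftotext "$1" - | grep -H -c --label="$1" "your query"' sh {} \;

Each pdf is converted and searched separately, because a single grep at the end of the pipe only ever sees one combined stream and would report "(standard input)". The --label option (GNU grep) gives grep the file name to use for that stream, the -H option prints that name, and the -c option prints the count of matching lines. Files without a match are listed with a count of 0. You can strip the count out later of course.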

Upvotes: 2

Neil

Reputation: 55392

This lists all the files whose text conversion matches your query:

find /path -name '*.pdf' -exec sh -c 'pdftotext "$1" - | grep --label="$1" -l "your query"' sh {} \;
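
If you would rather have this as a reusable function, the same idea can be written as a loop. This is only a minimal sketch: it assumes pdftotext (from poppler-utils) is on your PATH, the function name pdf_grep_names is made up for illustration, and it will not cope with file names that contain newlines.

```shell
# Hypothetical helper: print the path of every pdf under DIR whose
# extracted text contains a line matching QUERY (a basic regex for grep).
# Usage: pdf_grep_names /path "your query"
pdf_grep_names() {
    dir=$1
    query=$2
    # One path per line; file names containing newlines are not supported.
    find "$dir" -type f -name '*.pdf' -print |
    while IFS= read -r f; do
        # Convert to text on stdout; grep -q only sets the exit status.
        if pdftotext "$f" - 2>/dev/null | grep -q -e "$query"; then
            printf '%s\n' "$f"
        fi
    done
}
```

Because grep -q stops at the first match, each file is printed at most once, and files that pdftotext cannot read are silently skipped.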

Upvotes: 2
