Zez

Reputation: 135

Find files whose content matches a line from a text file

I have a text file, accessions.txt (a subset is shown below):

KRO94967.1
KRO95967.1
KRO96427.1
KRO94221.1
KRO94121.1
KRO94145.1
WP_088442850.1
WP_088252850.1
WP_088643726.1
WP_088739685.1
WP_088283155.1
WP_088939404.1

And I have a directory with multiple files (*.align).

I want to find the filenames (*.align) whose content matches any line in my accessions.txt text file.

I know that find . -exec grep -H 'STRING' {} + works to find specific strings (e.g. replacing STRING with WP_088939404.1 returns every filename where the string WP_088939404.1 is present).

Is there a way to replace STRING with "all strings inside my text file"?

Or is there another (better) way to do this?

I would like to avoid writing a loop that reads the content of every file once per accession, as there are too many files.

Many thanks!

Upvotes: 2

Views: 1831

Answers (2)

choroba

Reputation: 242208

grep can take a list of patterns to match with -f.

grep -lFf accessions.txt directory/*.align

-l prints only the names of the files that contain a match, and -F tells grep to interpret the lines as fixed strings, not regex patterns.
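To illustrate why -F matters here, the following is a minimal sketch using made-up demo files in a temporary directory (the file names exact.align and lookalike.align are hypothetical). Without -F, the dot in an accession is a regex wildcard, so a near-miss ID can match.

```shell
# Hypothetical demo data in a throwaway directory
tmp=$(mktemp -d)
printf 'KRO94967.1\n' > "$tmp/accessions.txt"
printf 'KRO94967.1\n' > "$tmp/exact.align"
printf 'KRO9496721\n' > "$tmp/lookalike.align"   # "2" where the "." should be

# Without -F the "." matches any character, so both files are listed
regex=$(cd "$tmp" && grep -lf accessions.txt *.align)

# With -F the "." is a literal dot, so only the exact ID matches
fixed=$(cd "$tmp" && grep -lFf accessions.txt *.align)

echo "regex: $regex"
echo "fixed: $fixed"
rm -rf "$tmp"
```

Fixed-string matching is also usually faster than regex matching when the pattern list is long.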

Sometimes, -w is also needed to prevent matching inside words, e.g. abcd might match not only abcd, but also xabcd or abcdy. Sometimes, preprocessing the input list is needed to prevent unwanted matching if the rules are more complex.
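The effect of -w can be sketched with two made-up files (a.align and b.align are hypothetical, as is the FASTA-like header format). One contains the accession as a whole word, the other contains it only inside a longer ID.

```shell
# Hypothetical demo data in a throwaway directory
tmp=$(mktemp -d)
printf 'KRO94967.1\n' > "$tmp/accessions.txt"
printf '>KRO94967.1 some protein\n' > "$tmp/a.align"
printf '>XKRO94967.10 other protein\n' > "$tmp/b.align"   # ID embedded in a longer one

# Substring match: both files are reported
plain=$(cd "$tmp" && grep -lFf accessions.txt *.align)

# Whole-word match: only the file with the exact ID is reported
word=$(cd "$tmp" && grep -lwFf accessions.txt *.align)

echo "plain: $plain"
echo "word:  $word"
rm -rf "$tmp"
```

Whether -w is appropriate depends on what surrounds the accessions in the real *.align files, which the question does not show.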

Upvotes: 1

oguz ismail

Reputation: 50805

You're looking for grep's -f option.

find . -name '*.align' -exec grep -Fxqf accessions.txt {} \; -print
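In the command above, -F treats each line of accessions.txt as a fixed string, -x requires a whole line of the searched file to equal an accession, -q suppresses grep's output, and -print makes find print the file name only when grep succeeds. The sketch below uses made-up files (whole.align and embedded.align are hypothetical) to show how -x changes the result. If the accessions appear inside longer lines, -x may need to be dropped.

```shell
# Hypothetical demo data in a throwaway directory
tmp=$(mktemp -d)
printf 'WP_088939404.1\n' > "$tmp/accessions.txt"
printf 'WP_088939404.1\nMKVLA\n' > "$tmp/whole.align"
printf '>WP_088939404.1 hypothetical protein\nMKVLA\n' > "$tmp/embedded.align"

# With -x: the accession must be an entire line of the file
strict=$(cd "$tmp" && find . -name '*.align' -exec grep -Fxqf accessions.txt {} \; -print)

# Without -x: the accession may appear anywhere in a line (sorted for stable order)
loose=$(cd "$tmp" && find . -name '*.align' -exec grep -Fqf accessions.txt {} \; -print | sort)

echo "strict: $strict"
echo "loose:  $loose"
rm -rf "$tmp"
```

Because -q lets grep stop at the first match, each file is read only as far as needed.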

Upvotes: 1
