Gina

Reputation: 23

grep using a list to find matches in a file, and print only the first occurrence for each string in the list

I have a file, for example "queries.txt", that contains newline-separated strings. I want to use this list to find matches in a second file, "biglist.txt".

"biglist.txt" may have multiple matches for each string in "queries.txt". I want to return only the first hit for each query and write this to another file.

grep -m 1 -wf queries.txt biglist.txt > output

gives me only one line in output. I expected the output to have the same number of lines as queries.txt.

Any suggestions? Many thanks! I spent a few minutes searching past questions but did not find one that covered exactly this case.

Upvotes: 2

Views: 5971

Answers (3)

ps3udonym

Reputation: 1

I might not fully understand your question, but it sounds like something like this might work.

while IFS= read -r word; do grep -m 1 -- "$word" biglist.txt | tee -a output.txt; done < queries.txt

Upvotes: 0

kbshimmyo

Reputation: 579

An alternative method without xargs (which is still worth learning). Quoting "$target" and reading with IFS= read -r keeps lines of queries.txt that contain spaces intact:

while IFS= read -r target; do grep -m 1 -- "$target" biglist.txt; done < queries.txt > outr
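
To make the loop approach concrete, here is a minimal sketch run on made-up sample data. The scratch directory, the file contents, and the added -w and -F flags are my assumptions, not part of the question: -w mirrors the whole-word matching the asker used, and -F treats each query as a literal string rather than a regular expression. The query is quoted and read with IFS= read -r, so a line containing spaces is passed to grep as one pattern.

```shell
# Hypothetical sample data in a scratch directory (names and contents are illustrative).
workdir=$(mktemp -d)
printf '%s\n' 'apple' 'red pear' > "$workdir/queries.txt"
printf '%s\n' 'apple pie' 'apple tart' 'one red pear' 'two red pear' > "$workdir/biglist.txt"

# IFS= and -r keep each query line intact (spaces and backslashes included).
# -F: fixed string, -w: whole words only, -m 1: stop after the first matching line.
# "--" protects against queries that begin with a dash.
while IFS= read -r target; do
    grep -m 1 -w -F -- "$target" "$workdir/biglist.txt"
done < "$workdir/queries.txt" > "$workdir/outr"

cat "$workdir/outr"
```

With this data, outr should end up with one line per query, in the order the queries appear in queries.txt. A query that matches nothing contributes no line, so the output can be shorter than queries.txt.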

Upvotes: 1

Floris

Reputation: 46375

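With -f, all the patterns go to a single grep call, so -m 1 stops after the first line that matches any of them. If you want to "reset the counter" for each search string, you could do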

cat queries.txt | xargs -I{} grep -m 1 -w {} biglist.txt > output

This uses xargs to call grep once for each line of queries.txt, which should do the trick for you.

Explanation:

cat queries.txt   - produce one "search word" per line
xargs -I{}        - take the input one line at a time, and insert it at {}
grep -m 1 -w      - find only one match of a whole word
{}                - this is where xargs inserts the search term (once per call)
biglist.txt       - the file to be searched
> output          - the file where the result is to be written
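
The difference between the two invocations is easier to see on a small example. The following is only a sketch on invented data: the scratch directory, the file contents, and the single_output file name are assumptions for illustration. It runs the original single-call command and the per-query xargs version side by side.

```shell
# Hypothetical sample data in a scratch directory (names and contents are illustrative).
demo=$(mktemp -d)
printf '%s\n' 'cat' 'dog' > "$demo/queries.txt"
printf '%s\n' 'dog one' 'cat one' 'category' 'cat two' 'dog two' > "$demo/biglist.txt"

# Single grep call: -m 1 applies to the whole run, so it stops after
# the first line that matches any of the patterns.
grep -m 1 -wf "$demo/queries.txt" "$demo/biglist.txt" > "$demo/single_output"

# One grep call per query line: -m 1 now applies to each query separately.
xargs -I{} grep -m 1 -w {} "$demo/biglist.txt" < "$demo/queries.txt" > "$demo/output"

cat "$demo/single_output"
cat "$demo/output"
```

Here single_output holds one line, while output holds the first hit for each query in the order the queries are listed. Because of -w, the line "category" is not counted as a match for "cat". Two caveats: a query with no match produces no output line, and xargs gives quotes and backslashes in its input special treatment, so queries containing those characters may need the while-read style of loop instead.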

Upvotes: 7
