Reputation: 47

Bash Unix search for a list of words in multiple files

I have a list of words I need to check in more than one hundred text files.

The list of words is in a file named word2search.txt.

This text file contains N words:

Word1
Word2
Word3
Word4
Word5
Word6
Wordn

So far I've written this bash script:

#!/bin/bash

listOfWord2Find=/home/mobaxterm/MyDocuments/word2search.txt

while IFS= read -r listOfWord2Find
do
    echo "$listOfWord2Find"
    grep -l -R "$listOfWord2Find" /home/mobaxterm/MyDocuments/txt/*.txt
    echo "================================================================="
done <"$listOfWord2Find"

The result is not satisfactory; I can hardly make use of it:

Word1
/home/mobaxterm/MyDocuments/txt/new 6.txt
/home/mobaxterm/MyDocuments/txt/file1.txt
/home/mobaxterm/MyDocuments/txt/file2.txt
/home/mobaxterm/MyDocuments/txt/file3.txt
=================================================================
Word2
/home/mobaxterm/MyDocuments/txt/new 6.txt
/home/mobaxterm/MyDocuments/txt/file1.txt
=================================================================
Word3
/home/mobaxterm/MyDocuments/txt/new 6.txt
/home/mobaxterm/MyDocuments/txt/file4.txt
/home/mobaxterm/MyDocuments/txt/file5.txt
/home/mobaxterm/MyDocuments/txt/file1.txt
=================================================================
Word4
/home/mobaxterm/MyDocuments/txt/new 6.txt
/home/mobaxterm/MyDocuments/txt/file1.txt
=================================================================
Word5
/home/mobaxterm/MyDocuments/txt/new 6.txt
=================================================================

This is what I want to see:

/home/mobaxterm/MyDocuments/txt/file1.txt : Word1, Word2, Word3, Word4
/home/mobaxterm/MyDocuments/txt/file2.txt : Word1
/home/mobaxterm/MyDocuments/txt/file3.txt : Word1
/home/mobaxterm/MyDocuments/txt/file4.txt : Word3
/home/mobaxterm/MyDocuments/txt/file5.txt : Word3
/home/mobaxterm/MyDocuments/txt/new 6.txt : Word1, Word2, Word3, Word4, Word5, Word6

I do not understand why my script doesn't show me Word6 (there are files which contain Word6). It stops at Word5. To work around this, I've added an extra line, blablabla, at the end of the list (a word I'm sure will not be found).

If you can help me on this subject :) Thank you.

Upvotes: 0

Answers (3)

hek2mgl

Reputation: 158090

Just grep:

grep -f list.txt input.*.txt

-f FILENAME makes grep read its search patterns from a file, one pattern per line.

If you want to display the filename along with the match, pass -H in addition to that:

grep -Hf list.txt input.*.txt
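To get from that output to the "file : Word1, Word2" layout asked for in the question, the matches can be grouped per file. The following is only a sketch under some assumptions: grep supports -o and -w (GNU and BSD grep do; they are not in POSIX), and neither the file names nor the words contain a colon or a newline. The function name words_per_file is made up for this example.

```shell
#!/bin/bash
# words_per_file is a hypothetical helper name used for illustration only.
# Usage: words_per_file WORDLIST FILE...
words_per_file() {
  local wordlist=$1
  shift
  # -o print only the matched text, -H always prefix the file name,
  # -F treat each list entry as a fixed string, -w match whole words only,
  # -f read the patterns from the list file.
  grep -oHFwf "$wordlist" -- "$@" |
  awk '{
    match($0, /:[^:]*$/)               # the last colon separates file name and match
    file = substr($0, 1, RSTART - 1)
    word = substr($0, RSTART + 1)
    if (seen[file, word]++) next       # report each word once per file
    if (!(file in words)) { order[++n] = file; words[file] = word }
    else words[file] = (words[file] ", " word)
  }
  END { for (i = 1; i <= n; i++) print order[i] " : " words[order[i]] }'
}
```

It could be called as words_per_file word2search.txt /home/mobaxterm/MyDocuments/txt/*.txt. Files without any match are not listed, and the words appear in the order they are first found in each file, not in the order of the list.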

Upvotes: 0

Dudi Boy

Reputation: 4890

Another, more elegant approach is to search for all the words in each file, one file at a time.

Use grep's multi-pattern option -f, --file=FILE, and print only the matched text (rather than the whole matching lines) with -o, --only-matching.

Then pipe the resulting words through a few commands to turn them into a comma-separated list.

Like this:

script.sh

#!/bin/bash

# "$@" (quoted) keeps file names that contain spaces, such as "new 6.txt", intact
for currFile in "$@"; do
  matched_words_list=$(grep --only-matching --file="$WORDS_LIST" -- "$currFile" | sort -u | awk -v ORS=', ' 1 | sed 's/, $//')
  printf "%s : %s\n" "$currFile" "$matched_words_list"
done
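The tail of the pipeline in the script can be tried on its own to see what each stage contributes. This is a small sketch; the function name join_words is hypothetical and only wraps the same commands.

```shell
#!/bin/bash
# join_words is a hypothetical name; it reads words on stdin, one per line.
join_words() {
  # sort -u      : sort the words and drop duplicates
  # awk ORS=', ' : print every line followed by ", " instead of a newline
  # sed          : remove the ", " left over after the last word
  sort -u | awk -v ORS=', ' 1 | sed 's/, $//'
}

printf 'word2\nword1\nword2\n' | join_words
echo
```

With the three input lines shown, the function prints word1, word2. An empty input yields an empty string, which is why a file without matches shows nothing after the colon.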

script.sh output

The words list file is passed in the environment variable WORDS_LIST.

The files to inspect are passed as arguments, here input.*.txt:

export WORDS_LIST=./words.txt; ./script.sh input.*.txt
input.1.txt : word1, word2
input.2.txt : word4
input.3.txt :
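Since grep reads the list itself through --file, this approach does not depend on how a shell loop reads the file. A likely (but unconfirmed) reason a while read loop skips the last word is that the list file does not end with a newline: read then fills the variable but returns a non-zero status, so the loop body is not run for that line. If a read loop is kept, a common idiom is sketched below; the function name each_word is made up for this example.

```shell
#!/bin/bash
# each_word is a hypothetical helper: it prints every line of the file given
# as $1, including a final line that has no trailing newline.
each_word() {
  local word
  # read returns non-zero at end of file even when it stored a partial last
  # line in "word", so the extra test keeps that line from being dropped.
  while IFS= read -r word || [ -n "$word" ]; do
    printf '%s\n' "$word"
  done < "$1"
}
```

Using a loop variable with a different name from the variable holding the file path also makes such a script easier to follow.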

Explanation:

using words.txt:

word2
word1
word5
word4

using input.1.txt:

word1
word2
word3
word3
word1
word3