Reputation: 49
I have two files, A.txt and B.txt, each containing a list, as shown below.
File A.txt
hello
hi
ko
File B.txt
fine
No
And how
why
Now I want to check whether a line in another file, C.txt, contains words from both A.txt and B.txt.
I am using this grep command:
grep -iof A.txt C.txt| grep B.txt
C.txt contains sentences with words from A.txt and B.txt:
Hello I am fine
I am not fine
why ko is and how?
The command doesn't show any output.
What I want is this: if a word from A.txt and a word from B.txt are both present in the same sentence, the output should be
Hello fine
why ko and how
That is, I want to print only the matching words from both files when they occur together in a line of C.txt, instead of printing the whole line from C.txt.
Upvotes: 1
Views: 1500
Reputation: 289495
You probably want to say:
$ grep -if B <(grep -if A C)
Hello I am fine
why ko is and how?
This uses -f to read the patterns from a file. The input to search can be a regular file or one you create on the fly with the process substitution <( ... ).
First, grep -if A C selects the lines in C that contain any of the patterns in A:
$ grep -if A C
Hello I am fine # "Hello" highlighted
why ko is and how? # "ko" highlighted
Then its output is matched against the patterns in B:
$ grep -if B <(grep -if A C)
Hello I am fine # "fine" highlighted
why ko is and how? # "and how" highlighted
Depending on your needs, you may want to add -F, -w and -i.
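To illustrate why -w can matter with short patterns such as "hi" from A.txt, here is a small sketch. The sample text is made up for the illustration and is not from the question.

```shell
#!/bin/sh
# Made-up sample input: "this" contains "hi" as a substring.
sample='this is it
hi there'

# Without -w, "hi" matches anywhere, including inside "this".
loose=$(printf '%s\n' "$sample" | grep -i 'hi')

# With -w, "hi" must stand alone as a whole word.
strict=$(printf '%s\n' "$sample" | grep -iw 'hi')

printf 'without -w:\n%s\n' "$loose"
printf 'with -w:\n%s\n' "$strict"
```

Without -w both lines are selected; with -w only "hi there" is.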
From man grep:
-f FILE, --file=FILE
    Obtain patterns from FILE, one per line. The empty file
    contains zero patterns, and therefore matches nothing. (-f is
    specified by POSIX.)
-F, --fixed-strings
    Interpret PATTERN as a list of fixed strings, separated by
    newlines, any of which is to be matched. (-F is specified by
    POSIX.)
-i, --ignore-case
    Ignore case distinctions in both the PATTERN and the input
    files. (-i is specified by POSIX.)
-w, --word-regexp
    Select only those lines containing matches that form whole
    words. The test is that the matching substring must either be
    at the beginning of the line, or preceded by a non-word
    constituent character. Similarly, it must be either at the end
    of the line or followed by a non-word constituent character.
    Word-constituent characters are letters, digits, and the
    underscore.
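The commands above print whole lines. If you only want the matched words, as the question asks, one possible approach is to test each line against both lists and then print the matches with -o. This is a rough sketch, not the only way to do it: both_match is a hypothetical helper name, the sample files are recreated in a scratch directory, and it assumes a grep that supports -o and -w (GNU and BSD grep do) and pattern files without blank lines.

```shell
#!/bin/sh
# Recreate the sample files from the question in a scratch directory.
dir=$(mktemp -d)
printf '%s\n' 'hello' 'hi' 'ko' > "$dir/A.txt"
printf '%s\n' 'fine' 'No' 'And how' 'why' > "$dir/B.txt"
printf '%s\n' 'Hello I am fine' 'I am not fine' 'why ko is and how?' > "$dir/C.txt"

# both_match (hypothetical helper): $1 and $2 are pattern files, $3 is the input.
both_match() {
    while IFS= read -r line; do
        # Keep the line only if it matches a pattern from each list.
        if printf '%s\n' "$line" | grep -qiwFf "$1" &&
           printf '%s\n' "$line" | grep -qiwFf "$2"; then
            # -o prints each match on its own line; paste joins them with spaces.
            printf '%s\n' "$line" | grep -oiwF -f "$1" -f "$2" | paste -sd ' ' -
        fi
    done < "$3"
}

result=$(both_match "$dir/A.txt" "$dir/B.txt" "$dir/C.txt")
printf '%s\n' "$result"
rm -rf "$dir"
```

With the sample files this should print "Hello fine" and "why ko and how". It runs grep several times per line, so it is only sensible for small inputs.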
Upvotes: 3