lebowski

Reputation: 1051

Using NOT operator in an IF with multiple grep commands

I am new to Shell scripting, and am writing a Korn shell script.

My aim is to search for each line of fileA.txt in 4 separate files (let's call them fileA.txt, fileB.txt, fileC.txt and fileD.txt). For each line of fileA.txt that was found in none of the four files, I need to print "not found" to a separate file.

So I came up with the following if statement. I am trying to combine the 4 grep commands using &&, and applying a logical NOT (!), since I only need the lines that were found in none of the 4 files.

for i in $(<fileA.txt);
do
    if !((grep -q $i fileB.txt) && (grep -q $i fileB.txt) && (grep -q $i fileC.txt) && (grep -q $i fileD.txt)); then
        print "$i not found in either of 4 files"
    fi
done

I know there's something definitely wrong with the syntax, but being a beginner in shell scripting, I can't figure it out.

Upvotes: 0

Views: 134

Answers (2)

dave_thompson_085

Reputation: 39000

It doesn't answer the question you asked, and thus violates SO policy, but there's a way to solve your actual problem with awk in one pass that I can't fit in a reasonable comment:

 awk 'FNR==NR{a[$0];next} {for(p in a)if($0~p){delete a[p]}} \
   END{for(p in a)print "notfound: ",p}' patternfile data1 data2 data3 etc

The notfound: prefix is just for clarity; you can change or omit it as desired.
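To see the one-pass approach on concrete data, here is a small self-contained run. The file names and contents are made up for illustration; the awk program is the one above, and the output goes through sort only to make the order predictable.

```shell
# Hypothetical sample data in a scratch directory.
dir=$(mktemp -d)
printf '%s\n' apple banana cherry > "$dir/patternfile"
printf '%s\n' 'apple pie' 'plum'  > "$dir/data1"
printf '%s\n' 'dried cherry'      > "$dir/data2"

# First file: remember every pattern. Later files: drop each pattern
# that matches (as a regex) anywhere in the current line.
notfound=$(awk 'FNR==NR{a[$0];next} {for(p in a)if($0~p){delete a[p]}}
  END{for(p in a)print "notfound: ",p}' \
  "$dir/patternfile" "$dir/data1" "$dir/data2" | sort)

printf '%s\n' "$notfound"
rm -rf "$dir"
```

With this data only banana should be reported, since apple and cherry each occur somewhere in a data file.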

The output values (patterns that were not found in any data file) are not necessarily in the same order as they were in patternfile; if you care about that:

 awk 'FNR==NR{a[$0]=FNR;next} {for(p in a)if($0~p){delete a[p]}} \
   END{for(p in a)print a[p],p}' patternfile data1 data2 data3 etc | sort -k1n | cut -d' ' -f2-
 # or in GNU awk v4+ only
 awk 'FNR==NR{a[$0]=FNR;next} {for(p in a)if($0~p){delete a[p]}} \
   END{PROCINFO["sorted_in"]="@val_num_asc";for(p in a)print p}' patternfile data1 data2 data3 etc 

Your question is also ambiguous about 'lines'; do you mean each line in patternfile should occur as a line in one of the data files, or can it occur within a line but not necessarily the whole line? Also, are the values in the patternfile only data characters or are any of them special characters that match something different in the data? For example with grep defaults as you posted (or awk with ~ as I have above) if patternfile contains a line boojum.. that item will be considered found if a data file contains any of the following lines:

 boojum..
 boojumXY
 the snark was a boojum!!

OTOH the patternfile line ^abc will match:

 abc
 abcdefghi

but will NOT match:

 ^abc

You can get full-line match in grep with option -x, literal (non-regex) match with -F, or both. These can also be achieved in awk but differently.
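The difference between these modes can be checked with a short experiment. This is only a sketch: the sample data is hypothetical, and the awk lines show one possible way to get literal matching (index() for a substring, == for a whole line), which is an assumption about what "differently" would look like in practice.

```shell
dir=$(mktemp -d)
printf '%s\n' 'boojumXY' 'the snark was a boojum!!' > "$dir/data"

# Default grep: "boojum.." is a regex, so the dots match any two characters.
grep -q 'boojum..' "$dir/data" && regex=found || regex=missing

# -F: treat the pattern as a literal string; no line contains "boojum..".
grep -qF 'boojum..' "$dir/data" && literal=found || literal=missing

# -x: the regex must match the whole line; "boojumXY" qualifies.
grep -qx 'boojum..' "$dir/data" && whole=found || whole=missing

# awk: index() does a literal substring search (like -F),
# and $0 == p compares the whole line literally (like -F -x).
awk_literal=$(awk -v p='boojum..' 'index($0, p) { n++ } END { print n + 0 }' "$dir/data")
awk_exact=$(awk -v p='boojumXY' '$0 == p { n++ } END { print n + 0 }' "$dir/data")

printf '%s %s %s %s %s\n' "$regex" "$literal" "$whole" "$awk_literal" "$awk_exact"
rm -rf "$dir"
```

Note that awk -v interprets backslash escapes in the assigned value, so patterns containing backslashes would need extra care.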

Upvotes: 2

chepner

Reputation: 532093

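You don't need the parentheses. In fact, you don't need separate calls to grep at all: given several files, a single grep succeeds if the pattern matches in any of them, and that is the condition you want to negate. (Chaining the calls with && instead tests for a match in every file.)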

while IFS= read -r line; do
  if ! grep -q "$line" fileB.txt fileC.txt fileD.txt; then
    print "$line not found in any of the 3 files"
  fi
done < fileA.txt
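A runnable version of that loop on made-up data might look like the following. The file contents are hypothetical, printf stands in for ksh's print so the sketch also runs in other POSIX shells, and -e is added only so that a line starting with a dash is not taken as an option.

```shell
dir=$(mktemp -d)
printf '%s\n' alpha beta gamma > "$dir/fileA.txt"
printf '%s\n' 'alpha one'      > "$dir/fileB.txt"
printf '%s\n' 'two gamma'      > "$dir/fileC.txt"
printf '%s\n' 'nothing here'   > "$dir/fileD.txt"

missing=$(
  while IFS= read -r line; do
    # grep succeeds if the pattern matches in at least one of the files,
    # so the negation is true only when it matches in none of them.
    if ! grep -q -e "$line" "$dir/fileB.txt" "$dir/fileC.txt" "$dir/fileD.txt"; then
      printf '%s not found in any of the 3 files\n' "$line"
    fi
  done < "$dir/fileA.txt"
)
printf '%s\n' "$missing"
rm -rf "$dir"
```

Here only beta is reported, because alpha occurs in fileB.txt and gamma in fileC.txt.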

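If you only need a single yes/no answer for the whole file, you don't even need the loop: the -f option reads the patterns from a file, one per line. Note that this tests whether any line of fileA.txt matches, so it cannot report which individual lines were missing.

if ! grep -q -f fileA.txt fileB.txt fileC.txt fileD.txt; then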
   ...
fi
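As a quick check of what the -f form reports, the sketch below uses hypothetical file contents in which one pattern is present and one is absent. The exit status covers the whole pattern list, so it reflects whether any pattern matched, not which ones.

```shell
dir=$(mktemp -d)
printf '%s\n' alpha beta  > "$dir/fileA.txt"   # patterns, one per line
printf '%s\n' 'alpha one' > "$dir/fileB.txt"   # contains alpha but not beta

# One exit status for the whole pattern list: 0 if any pattern matched.
if grep -q -f "$dir/fileA.txt" "$dir/fileB.txt"; then
  any=yes
else
  any=no
fi
printf 'any pattern matched: %s\n' "$any"
rm -rf "$dir"
```

Because alpha matches, the test succeeds even though beta is missing, which is why a per-line report still needs the loop or the awk approach.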

Upvotes: 1
