Partial string search between two files using AWK

Question

I have been trying to re-write an egrep command using awk to improve performance but haven't been successful. The egrep command performs a simple case insensitive search of the records in file1 against (partial matches in) file2. Below is the command and sample output.

file1 contains:

Abc
xyz
123
blah
hh
a,b

file2 contains:

abc de
xyz
123
456
blah
test1
abdc
abc,def,123
kite
a,b,c

Original command : egrep -i -f file1 file2

Original (egrep) command output :

$ egrep -i -f file1 file2
abc de
xyz
123
blah
abc,def,123
a,b,c

I would like to use AWK to rewrite the command to do the same operation. I have tried the below but it is performing a full record match and not partial like grep does.

Modified command in awk : awk 'NR==FNR{a[tolower($0)];next} tolower($0) in a' file1 file2

Modified command (awk) output:

$ awk 'NR==FNR{a[tolower($0)];next} tolower($0) in a' file1 file2
xyz
123
blah

This excludes the records which had partial matches for the string "abc". Any help to fix the awk command please? Thanks in advance.

oguz ismail · Accepted Answer

Use index like this for a partial literal match:

awk '
NR == FNR {
  needles[tolower($0)]
  next
}
{
  haystack = tolower($0)
  for (needle in needles) {
    if (index(haystack, needle)) {
      print
      break
    }
  }
}' file1 file2

Partial string search between two files using AWK

Answers (2)

Related Questions