Amith Kotian

Reputation: 440

Print non-matching strings when comparing two files with awk/grep/sed

There are two files, file1 and file2, that contain some of the same data. file1 also has some extra data that is not present in file2. I am trying to print that extra data.

The awk solution below prints only the data that matches between file1 and file2. I need the opposite: the data that is present in file1 but not in file2.

awk 'NR==FNR{patts[$1]=$2;next}{for (i in patts) if (($0 ~ i) && ($0 ~ patts[i])) print}' file2 file1 

file1

papaya
apple
Moosumbi
mango
jackfruit
kiwi
orange
strawberry
banana
grapes
dates

file2

apple
mango
kiwi
strawberry

Expected result:

papaya
Moosumbi
jackfruit
orange
banana
grapes
dates

Upvotes: 0

Views: 72

Answers (2)

Pierre François

Reputation: 6073

The diff command is made for that purpose. Just issue:

diff --changed-group-format='%<' --unchanged-group-format='' file1 file2

and you will get the expected result:

papaya
Moosumbi
jackfruit
orange
banana
grapes
dates

Upvotes: 1

Amith Kotian

Reputation: 440

This worked for me:

awk -F, 'FNR==NR {f2[$1];next} !($0 in f2)' file2 file1

The condition FNR==NR is true only while the first file is being read: FNR resets to 1 at the first line of each file, whereas NR keeps increasing across files.
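To see the two counters side by side, here is a minimal sketch. The scratch directory, the shortened sample files, and the variable names workdir and counters are only assumptions for illustration.

```shell
# Scratch directory so that no existing files are touched (illustrative name).
workdir=$(mktemp -d)

# Shortened versions of the sample files, just for this demonstration.
printf '%s\n' apple mango > "$workdir/file2"
printf '%s\n' papaya apple Moosumbi > "$workdir/file1"

# Print NR, FNR and whether FNR == NR holds for every input line.
counters=$(awk '{ print NR, FNR, (FNR == NR ? "first file" : "later file") }' \
    "$workdir/file2" "$workdir/file1")
printf '%s\n' "$counters"
```

The two lines of file2 are reported as "first file". From the third input line onwards NR is larger than FNR, so the lines of file1 are reported as "later file".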

f2[$1] on its own only creates the array element indexed by $1; no value needs to be assigned to it. During the pass over the first file (file2), the first comma-separated field of every line is remembered as an index of the array f2. The sample lines contain no commas, so $1 is the whole line. The pass over the second file (file1) just needs to check that each line being read does not exist as an index in the array f2, which is what the condition !($0 in f2) does. If the condition is true, the line being read from file1 is printed.

The condition FNR==NR compares the same two operands as NR==FNR, so it behaves in the same way.
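As a quick check, here is a sketch that rebuilds the sample files from the question in a scratch directory and runs the same two-pass idea. It is a variation, not the exact command above: it indexes the array by $0 and drops -F, since the sample lines have no commas. The names workdir, seen and result are only illustrative.

```shell
# Scratch directory so that no existing files are overwritten (illustrative name).
workdir=$(mktemp -d)

# Recreate the sample data from the question.
printf '%s\n' papaya apple Moosumbi mango jackfruit kiwi orange \
    strawberry banana grapes dates > "$workdir/file1"
printf '%s\n' apple mango kiwi strawberry > "$workdir/file2"

# Pass 1 (file2): remember every line as an index of the array "seen".
# Pass 2 (file1): print only the lines that were never remembered.
result=$(awk 'FNR == NR { seen[$0]; next } !($0 in seen)' \
    "$workdir/file2" "$workdir/file1")
printf '%s\n' "$result"
```

With this data the output is the seven lines from the expected result, in the same order as they appear in file1.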

Upvotes: 0
