Agathe

Reputation: 313

awk to compare multiple columns in 2 files

I would like to compare multiple columns from two files and NOT print the lines that match my criteria. For example:

file1

apple  green  4
orange  red  5
apple  yellow 6
apple  yellow 8
grape  green 5

file2

apple  yellow 7
grape  green 10

output

apple  green  4
orange  red  5
apple  yellow 8

I want to remove lines from file1 where $1 and $2 match $1 and $2 of a line in file2 AND $3 in file1 is smaller than $3 in file2. So far I can only do the first part, that is, remove lines where $1 and $2 from file1 match $1 and $2 from file2 (fields are separated by tabs):

awk -F '\t' 'FNR == NR {a[$1FS$2]=$1; next} !($1FS$2 in a)' file2 file1

Could you help me apply the last condition?

Many thanks in advance!

Upvotes: 0

Views: 2098

Answers (2)

kvantour

Reputation: 26471

What you are after is this:

awk '(NR==FNR){a[$1,$2]=$3; next} !((($1,$2) in a) && $3 < a[$1,$2])' file2 file1
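
For readers unfamiliar with the two-file idiom, the same logic can be spelled out as a commented script. This is only a sketch: it assumes tab-separated input and at most one line per ($1,$2) pair in file2, the array name `limit` is my own, and the sample files from the question are recreated in a temporary directory just to make it runnable.

```shell
# Recreate the sample data from the question (tab-separated, as the asker states).
dir=$(mktemp -d)
printf 'apple\tgreen\t4\norange\tred\t5\napple\tyellow\t6\napple\tyellow\t8\ngrape\tgreen\t5\n' > "$dir/file1"
printf 'apple\tyellow\t7\ngrape\tgreen\t10\n' > "$dir/file2"

result=$(awk -F '\t' '
    # First file on the command line (file2): remember $3 for each ($1,$2) pair.
    NR == FNR { limit[$1, $2] = $3; next }
    # Second file (file1): drop a line only if its pair is known
    # and its $3 is smaller than the stored value.
    (($1, $2) in limit) && ($3 + 0 < limit[$1, $2] + 0) { next }
    # Everything else is printed unchanged.
    { print }
' "$dir/file2" "$dir/file1")

printf '%s\n' "$result"
rm -rf "$dir"
```

On the sample data this should print the three lines listed under "output" in the question.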

Upvotes: 3

Sundeep

Reputation: 23667

Store the 3rd field value while building the array, then use it for the comparison:

$ awk -F '\t' 'FNR==NR{a[$1FS$2]=$3; next} !(($1FS$2 in a) && $3 < a[$1FS$2])' f2 f1
apple   green   4
orange  red 5
apple   yellow  8

Better written as:

awk -F '\t' '{k = $1FS$2} FNR==NR{a[k]=$3; next} !((k in a) && $3 < a[k])' f2 f1
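
One caveat with the comparison on $3: awk compares two values numerically only when both look like numbers. If a field carries stray characters, for example a trailing carriage return from a file saved with CRLF line endings, the comparison may silently fall back to string order. The sketch below assumes that scenario (the CRLF input is hypothetical, not from the question) and forces a numeric comparison by adding 0. It removes lines whose $3 is smaller than the value in f2, as the question asks.

```shell
dir=$(mktemp -d)
# Hypothetical input: f2 was saved with Windows (CRLF) line endings.
printf 'apple\tyellow\t7\r\ngrape\tgreen\t10\r\n' > "$dir/f2"
printf 'apple\tgreen\t4\norange\tred\t5\napple\tyellow\t6\napple\tyellow\t8\ngrape\tgreen\t5\n' > "$dir/f1"

# Adding 0 converts both operands to numbers, so "10\r" is read as 10.
robust=$(awk -F '\t' '{k = $1 FS $2}
    FNR == NR {a[k] = $3 + 0; next}
    !((k in a) && $3 + 0 < a[k])' "$dir/f2" "$dir/f1")

printf '%s\n' "$robust"
rm -rf "$dir"
```

If the inputs are known to be clean, the plain comparison is enough and the `+ 0` can be dropped.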

Upvotes: 3
