Reputation: 313
I would like to compare multiple columns from 2 files and NOT print lines matching my criteria. An example of this would be:
file1
apple green 4
orange red 5
apple yellow 6
apple yellow 8
grape green 5
file2
apple yellow 7
grape green 10
output
apple green 4
orange red 5
apple yellow 8
I want to remove lines where $1 and $2 from file1 correspond to $1 and $2 from file2 AND when $3 from file1 is smaller than $3 from file2.
So far I can only do the first part of the job, that is, remove lines where $1 and $2 from file1 correspond to $1 and $2 from file2 (fields are separated by tabs):
awk -F '\t' 'FNR == NR {a[$1FS$2]=$1; next} !($1FS$2 in a)' file2 file1
Could you help me apply the last condition?
Many thanks in advance!
Upvotes: 0
Views: 2098
Reputation: 26471
What you are after is this:
awk -F '\t' '(NR==FNR){a[$1,$2]=$3; next} !((($1,$2) in a) && $3 < a[$1,$2])' file2 file1
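For readers unfamiliar with the comma subscript used above: awk joins the parts of `a[$1,$2]` with the built-in `SUBSEP` variable, and `($1,$2) in a` tests for that same joined key. The snippet below is only an illustrative sketch with hard-coded values taken from the question's file2; the variable names `by_hand` and `by_comma` are made up for the demonstration.

```shell
result=$(awk 'BEGIN {
    a["apple", "yellow"] = 7                  # comma subscript: stored under "apple" SUBSEP "yellow"
    k = "apple" SUBSEP "yellow"               # the same key, built by hand
    by_hand  = (k in a)                       # membership test with the hand-built key
    by_comma = (("apple", "yellow") in a)     # membership test with the comma form
    print by_hand, by_comma, a[k]
}')
printf '%s\n' "$result"
```

Both membership tests succeed and the stored value is retrieved, so the comma form is just shorthand for concatenating the fields with a separator that is unlikely to appear in the data.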
Upvotes: 3
Reputation: 23667
Store the 3rd field value while building the array, then use it for the comparison:
$ awk -F '\t' 'FNR==NR{a[$1FS$2]=$3; next} !(($1FS$2 in a) && $3 < a[$1FS$2])' f2 f1
apple green 4
orange red 5
apple yellow 8
Better written as:
awk -F '\t' '{k = $1FS$2} FNR==NR{a[k]=$3; next} !((k in a) && $3 < a[k])' f2 f1
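To try this end to end, here is a minimal sketch that recreates the question's sample data as tab-separated files and applies the question's criterion (drop a file1 line when its first two fields occur in file2 and its third field is smaller than file2's). The scratch directory from `mktemp -d` and the `+ 0` coercions are my own additions, not part of the answers above; the `+ 0` is only a defensive way to force a numeric comparison and should not be needed when the fields already look like numbers.

```shell
# Recreate the sample data from the question as tab-separated files.
tmp=$(mktemp -d)
printf 'apple\tgreen\t4\norange\tred\t5\napple\tyellow\t6\napple\tyellow\t8\ngrape\tgreen\t5\n' > "$tmp/file1"
printf 'apple\tyellow\t7\ngrape\tgreen\t10\n' > "$tmp/file2"

result=$(awk -F '\t' '
    { k = $1 FS $2 }                   # key built from the first two fields
    FNR == NR { a[k] = $3; next }      # first file (file2): remember its third field
    !((k in a) && $3 + 0 < a[k] + 0)   # second file (file1): drop matching keys with a smaller value
' "$tmp/file2" "$tmp/file1")

printf '%s\n' "$result"
rm -rf "$tmp"
```

With the sample data this should print the three lines the question lists as the desired output. Note that file2 is passed first, so `FNR == NR` is true only while it is being read.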
Upvotes: 3