Reputation: 2115
I have following two files, I need to join them based on the value of three columns, and if not matched print NA in the unmatched column.
cat f1
AAA 0 node4 Activated Unreachable down
AAA 1 node3 Activated Pingable cool
cat f2
AAA 0 node3 XYZ Active
Currently I am getting incorrect output using:
awk 'NR==FNR{a[$1]=$1;b[$2]=$2;c[$3]=$3;next} $1 in a && $2 in b && $3 in c{print $0}' f1 f2
AAA 0 node3 XYZ Active
Desired output:
AAA 0 node4 Activated Unreachable down NA
AAA 1 node3 Activated Pingable cool Active
Upvotes: 1
Views: 130
Reputation: 92854
awk approach:
awk 'NR==FNR{a[$1,$3]=$5; next}{$7="NA";if(($1,$3) in a){$7=a[$1,$3]} print}' f2 f1
The output:
AAA 0 node4 Activated Unreachable down NA
AAA 1 node3 Activated Pingable cool Active
a[$1,$3]=$5
- stores the value of the fifth field $5
in the second file f2
using combination of first $1
and third $3
fields as array key
$7="NA";
- initiates an additional seventh field $7
with default value "NA"
Upvotes: 1
Reputation: 85550
Use an Awk
logic as below,
awk 'FNR==NR{hash[$1FS$3]=$NF; next}{for(i in hash) if (match(i,$1FS$3)) { $(NF+1)=hash[i] } else { $(NF+1)="NA" } }1' f2 f1
which produces output as you needed.
AAA 0 node4 Activated Unreachable down NA
AAA 1 node3 Activated Pingable cool Active
The idea is to parse the second file first to store the status, index by the node value into the array hash
. Then on the first file, a loop on all indices and if the value in $3
on f1
matches the hashed value, print the status accordingly and it not found just print NA
.
Upvotes: 1