Alula
Alula

Reputation: 91

How can I match two files based on different columns using awk?

I need to match two files for each lines based on $5 and $6 from file 1, and $1 &$2 of file two and to print $3 of file2 in file 1 when it matches.

File 1

7   81  1   47  13412   2013-09-27      18:39:34    
7   82  2   10  27723   2010-08-20      18:46:10        
7   83  1   67  27030   2010-05-23      18:51:54        
7   84  2   17  13412   2013-09-28      18:54:19  

File 2

13412   2013-09-27   565  
27723   2010-08-20   582.333  
27030   2010-05-23   514  
13412   2013-09-28   565  
27723   2010-08-21   591.25  
27030   2010-05-24   527  
13412   2013-09-29   561.5  

I have tried the following code but it just print the first file without matching

awk ‘NR==FNR {a[$1,$2] = $3; next}   $0 = $0 FS a[$1,$2]’ file2 file1  

Upvotes: 1

Views: 54

Answers (1)

Kent
Kent

Reputation: 195229

This awk one-liner may give you a hand:

awk 'NR==FNR{a[$1 FS $2]=$3;next}
    {k=$5 FS $6}$0 = $0 ((k in a)? FS a[k]:"")' f2 f1

With given inputs, it outputs:

7 81 1 47 13412 2013-09-27 18:39:34 565
7 82 2 10 27723 2010-08-20 18:46:10 582.333
7 83 1 67 27030 2010-05-23 18:51:54 514
7 84 2 17 13412 2013-09-28 18:54:19 565

Your problem

  • You can use $x,$y to indicate a key, but when you processed file1, you should use $5,$6 instead of $1,$2

  • If the corresponding entry was not found in file2, you will anyway concatenate a FS in output. I don't know if it is desired.

Short explanation from @Olivier Dulac (comment below):

{k=...} (defition of k, occuring on each line of f1) on a separate line from the $0 = ... (redefinition of $0 by including at its end the value of a[k], if it exists) part, that is a "alwayus true" condition that then prints the new $0.

Upvotes: 2

Related Questions