OXXO
OXXO

Reputation: 724

Merge files where some columns matched

Match columns 1,2,3 in both files, if they are equal then.

For files where columns match write value of column 4 in file1 into file2 If there is not match then write NA

file1

31431 37150 100 10100
31431 37201 100 12100
31431 37471 100 14100

file2

31431 37150 100 14100
31431 37131 100 14100
31431 37201 100 14100
31431 37478 100 14100
31431 37471 100 14100

Desired output:

31431 37150 100 14100 10100
31431 37131 100 14100 NA
31431 37201 100 14100 12100
31431 37478 100 14100 NA
31431 37471 100 14100 14100

I tried

awk '
FNR==NR{
  a[$1 $2 $3]=$4
  next
}
($1 in a){
  $1=a[$1]
  found=1
}
{
  $0=found==1?$0",":$0",NA"
  sub(/^...../,"&,")
  $1=$1
  found=""
}
1
' FS=" " file1 FS=" " OFS="," file2

Upvotes: 2

Views: 90

Answers (2)

RavinderSingh13
RavinderSingh13

Reputation: 133428

Could you please try following.

awk 'FNR==NR{a[$1,$2,$3]=$NF;next} {print $0,($1,$2,$3) in a?a[$1,$2,$3]:"NA"}' Input_file1  Input_file2

OR with creating a variable for fields as per Ed sir's comment.

awk '{var=$1 OFS $2 OFS $3} FNR==NR{a[var]=$NF;next} {print $0,var in a?a[var]:"NA"}' Input_file1  Input_file2

Output will be as follows.

31431 37150 100 14100 10100
31431 37131 100 14100 NA
31431 37201 100 14100 12100
31431 37478 100 14100 NA
31431 37471 100 14100 14100

Explanation: Adding explanation for above code now.

awk '
{
  var=$1 OFS $2 OFS $3              ##Creating a variable named var whose value is first, second ansd third field of current lines of Input_file1 and Input_file2.
}
FNR==NR{                            ##Checking condition FNR==NR which will be TRUE when first Input_file1 is being read.
  a[var]=$NF                        ##Creating an array named a whose index is variable var and value is $NF of curent line.
  next                              ##next keyword will skip all further lines from here.
}
{
  print $0,var in a?a[var]:"NA"     ##Printing current line value and along with that printing either value of a[var] or NA based upon if var present in array a then print a[var] else print NA.
}'  Input_file1  Input_file2        ##Mentioning Input_file names here.

Upvotes: 3

karakfa
karakfa

Reputation: 67467

$ awk '      {k=$1 FS $2 FS $3} 
     NR==FNR {a[k]=$4; next} 
             {$(NF+1)=k in a?a[k]:"NA"}1' file1 file2

31431 37150 100 14100 10100
31431 37131 100 14100 NA
31431 37201 100 14100 12100
31431 37478 100 14100 NA
31431 37471 100 14100 14100

Upvotes: 3

Related Questions