clear.choi

Reputation: 855

Combine (Merge) Multiple Rows using AWK

Hi, I have an AWK command that combines two files that share the same key.

awk -v OFS='\t' '
NR==1   { print $0, "Column4", "Column5"; next }
NR==FNR { a[$1]=$0; next}
$1 in a { print a[$1], $2, $3 }
' $1 $2 > $3

This returns only one row per key from each file. For example, given the files below:

File 1

Key    Column1  Column2  Column3  
Test1    500     400     200               
Test1    499     400     200               
Test1    499     399     200               
Test1    498     100     100               
Test2    600     200     150               
Test2    600     199     150               
Test2    599     199     100               

File 2

Test1    Good     Good                    
Test2    Good     Good

Then the result will be

Key    Column1  Column2  Column3  Column4  Column5
Test1    500     400     200       Good      Good   
Test2    600     200     150       Good      Good

But I want every row to be combined, like below.

Key    Column1  Column2  Column3  Column4  Column5
Test1    500     400     200       Good      Good            
Test1    499     400     200       Good      Good             
Test1    499     399     200       Good      Good             
Test1    498     100     100       Good      Good             
Test2    600     200     150       Good      Good             
Test2    600     199     150       Good      Good              
Test2    599     199     100       Good      Good           

Does anyone have an idea for a simple change to the logic using AWK? Thank you!

Upvotes: 0

Views: 667

Answers (1)

Wintermute

Reputation: 44043

I think you're looking for

join file1 file2
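
As a rough sketch of what that does with the sample data from the question (the scratch directory and the `join_out` variable are made up for illustration): join expects both inputs to be sorted on the join field, which the sample files already are, and by default it drops lines whose key has no partner in the other file, so the `Key` header line of File 1 does not appear in the output. Output fields are separated by single spaces by default.

```shell
# Hypothetical scratch directory holding the question's sample data.
tmp=$(mktemp -d)

cat > "$tmp/file1" <<'EOF'
Key    Column1  Column2  Column3
Test1    500     400     200
Test1    499     400     200
Test1    499     399     200
Test1    498     100     100
Test2    600     200     150
Test2    600     199     150
Test2    599     199     100
EOF

cat > "$tmp/file2" <<'EOF'
Test1    Good     Good
Test2    Good     Good
EOF

# LC_ALL=C pins the collation order join uses when checking sortedness.
# Every file1 row with a matching key in file2 is paired with that file2 row.
join_out=$(LC_ALL=C join "$tmp/file1" "$tmp/file2")
printf '%s\n' "$join_out"

rm -rf "$tmp"
```

If the header line or tab-separated output is needed, the awk approach below may be the simpler route.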

If you insist on doing it with awk, a good way would be to process the files the other way around, so that you have the parts you want to add ready when you process the main file:

awk -v OFS='\t' '
  FNR == NR { a[$1] = $2 OFS $3; next }
  { $1 = $1 }
  FNR ==  1 { print $0, "Column4", "Column5" }
  FNR !=  1 { print $0, a[$1] }
  ' "$2" "$1" > "$3"

EDIT: @EtanReisner suggested the addition of { $1 = $1 }. Its purpose is to force awk to rebuild the line from its fields, so that input data separated by a mixture of whitespace characters comes out uniformly separated by OFS (a tab in this case). If the data is already tab-separated, it is not necessary (but doesn't hurt).
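
For a quick check, here is a minimal sketch that runs the awk script above on the sample data from the question. It assumes the data is written to scratch files named file1 and file2 (hypothetical names used in place of the script's "$1" and "$2"), and it captures the result in a variable instead of writing to "$3".

```shell
# Hypothetical scratch directory holding the question's sample data.
tmp=$(mktemp -d)

cat > "$tmp/file1" <<'EOF'
Key    Column1  Column2  Column3
Test1    500     400     200
Test1    499     400     200
Test1    499     399     200
Test1    498     100     100
Test2    600     200     150
Test2    600     199     150
Test2    599     199     100
EOF

cat > "$tmp/file2" <<'EOF'
Test1    Good     Good
Test2    Good     Good
EOF

# file2 is read first so the extra columns are ready for every file1 row.
merged=$(awk -v OFS='\t' '
  FNR == NR { a[$1] = $2 OFS $3; next }      # first file: remember columns per key
  { $1 = $1 }                                # rebuild the record using OFS
  FNR ==  1 { print $0, "Column4", "Column5" }
  FNR !=  1 { print $0, a[$1] }
  ' "$tmp/file2" "$tmp/file1")
printf '%s\n' "$merged"

# The effect of { $1 = $1 } in isolation: mixed whitespace becomes tabs.
rebuilt=$(printf 'a   b \t c\n' | awk -v OFS='\t' '{ $1 = $1 } { print }')

rm -rf "$tmp"
```

The output has the header plus all seven data rows, each followed by the two columns looked up from file2.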

Upvotes: 2
