user8450148
user8450148

Reputation:

How to merge two files with the same column value in bash

I have 2 csv files, these are their contents.

file1(23 fields)

data11,data12,ID1,data14...

data21,data22,ID2,data24...

data31,data32,ID3,data34...

file2 (22 fields)

ID1,value12,value13,...

ID1,value22,value23,...

ID1,value32,value33,...

ID2,value42,value43,... 

ID3,value52,value53,...

The output should be...

OUTPUT:

 data11,data12,ID1,data14,...,value12,value13

 data11,data12,ID1,data14,...,value22,value23

 data11,data12,ID1,data14,...,value32,value33

 data21,data22,ID2,data24,...,value42,value43

 data31,data32,ID3,data34,...,value52,value53

Can anyone help me to get this output using awk or any bash built-ins? Thanks!

Upvotes: 0

Views: 465

Answers (2)

chason
chason

Reputation: 3654

Sorry, my fault to misunderstand ur problem, try the following cmd, it should be what u want:

for line1 in `cat file1`;do id=`echo $line1|awk -F ',' '{print $3}'`;\
awk -v id=$id -v line1=$line1 -F ',' '($1==id){print line1","$0}' file2;done

the output of this cmd is

data11,data12,ID1,data14...,ID1,value12,value13,...
data11,data12,ID1,data14...,ID1,value22,value23,...
data11,data12,ID1,data14...,ID1,value32,value33,...
data21,data22,ID2,data24...,ID2,value42,value43,... 
data31,data32,ID3,data34...,ID3,value52,value53,...

and if u don't want the repeated column of ID*, u can do this like

for line1 in `cat file1`;do id=`echo $line1|awk -F ',' '{print $3}'`;\
awk -v id=$id -v line1=$line1 -F ',' '($1==id){printf "%s",line1;\
for(i=2;i<NF;i++) printf ",%s",$i;print ","$NF}' file2;done

it won't print ID* in file2

data11,data12,ID1,data14...,value12,value13,...
data11,data12,ID1,data14...,value22,value23,...
data11,data12,ID1,data14...,value32,value33,...
data21,data22,ID2,data24...,value42,value43,... 
data31,data32,ID3,data34...,value52,value53,...

----------wrong answer prior to updated---------

https://www.computerhope.com/unix/upaste.htm HI,

u can use paste cmd to join related lines of different files

please use man paste cmd for detailed usage

Upvotes: 0

Kaushik Nayak
Kaushik Nayak

Reputation: 31656

You can use join ..Specify the column order required for output after -o eg: 1.1 refers 1st column of 1st (file1) file. It is also required to pre-sort the input files

join -t "," -1 3 -2 1  -o 1.1,1.2,1.3,1.4,2.2,2.3 
<( sort -t "," -k3 /tmp/file1 ) <( sort -t "," -k1 /tmp/file2 ) 

Upvotes: 3

Related Questions