Amy
Amy

Reputation: 109

How to join two files based on one column in AWK

I have two files both with millions of records in (not the same amount of records in each)

File A

46901208,(SPUR OFF) FFERM LLIDIART WERDD,WRECSAM
85315934,A4077 FROM CRUGHYWEL ROAD,CRUGHYWEL
46402576,A4085 COFNOD WEDI CAU,BEDDGELERT
46400001,A4085 CYLCHFAN CAEATHRO,CAEATHRO

File B

46408008,1
46400005,1
46400001,4
46402576,3

I want to merge the two files based on matching $1, the ones that don't match I don't want them being output.

Desired output:

46402576,A4085 COFNOD WEDI CAU,BEDDGELERT,3
46400001,A4085 CYLCHFAN CAEATHRO,CAEATHRO,4

How can I do this? Thanks

Upvotes: 1

Views: 6146

Answers (2)

luoluo
luoluo

Reputation: 5533

Try this

awk -F"," 'BEGIN{OFS=","} {if (NR==FNR) {a[$1]=$2; next} if ($1 in a) {print $1, $2, $3, a[$1]}}' f2 f1

UPDATE

Use awk -F"," -f your_awk.awk f2 f1 format. In your_awk.awk file:

BEGIN{OFS=","} {if (NR==FNR) {a[$1]=$2; next} if ($1 in a) {print $1, $2, $3, a[$1]}}

Upvotes: 2

Slava Semushin
Slava Semushin

Reputation: 15214

I know that you have asked about awk but there is special command to do that and it's called join:

1) If your files have already sorted:

$ join -j 1 -t, A B
46400001,A4085 CYLCHFAN CAEATHRO,CAEATHRO,4
46402576,A4085 COFNOD WEDI CAU,BEDDGELERT,3

Here we are joining files A and B by 1st field and use comma as separator of the fields.

2) If your files unsorted and you want to do all the things in one shot with bash:

$ join -j 1 -t, <(sort A) <(sort B)
46400001,A4085 CYLCHFAN CAEATHRO,CAEATHRO,4
46402576,A4085 COFNOD WEDI CAU,BEDDGELERT,3

Upvotes: 5

Related Questions