Reputation: 109
I have two files both with millions of records in (not the same amount of records in each)
File A
46901208,(SPUR OFF) FFERM LLIDIART WERDD,WRECSAM
85315934,A4077 FROM CRUGHYWEL ROAD,CRUGHYWEL
46402576,A4085 COFNOD WEDI CAU,BEDDGELERT
46400001,A4085 CYLCHFAN CAEATHRO,CAEATHRO
File B
46408008,1
46400005,1
46400001,4
46402576,3
I want to merge the two files based on matching $1, the ones that don't match I don't want them being output.
Desired output:
46402576,A4085 COFNOD WEDI CAU,BEDDGELERT,3
46400001,A4085 CYLCHFAN CAEATHRO,CAEATHRO,4
How can I do this? Thanks
Upvotes: 1
Views: 6146
Reputation: 5533
Try this
awk -F"," 'BEGIN{OFS=","} {if (NR==FNR) {a[$1]=$2; next} if ($1 in a) {print $1, $2, $3, a[$1]}}' f2 f1
UPDATE
Use awk -F"," -f your_awk.awk f2 f1
format. In your_awk.awk
file:
BEGIN{OFS=","} {if (NR==FNR) {a[$1]=$2; next} if ($1 in a) {print $1, $2, $3, a[$1]}}
Upvotes: 2
Reputation: 15214
I know that you have asked about awk
but there is special command to do that and it's called join
:
1) If your files have already sorted:
$ join -j 1 -t, A B
46400001,A4085 CYLCHFAN CAEATHRO,CAEATHRO,4
46402576,A4085 COFNOD WEDI CAU,BEDDGELERT,3
Here we are joining files A
and B
by 1st field and use comma as separator of the fields.
2) If your files unsorted and you want to do all the things in one shot with bash
:
$ join -j 1 -t, <(sort A) <(sort B)
46400001,A4085 CYLCHFAN CAEATHRO,CAEATHRO,4
46402576,A4085 COFNOD WEDI CAU,BEDDGELERT,3
Upvotes: 5