Reputation: 765
I have 2 files File1 and File2. File1 has some values separated by "|". For example,
A|a
C|c
F|f
File2 also has some values separated by "|". For example,
a|1
b|2
c|3
d|4
e|5
Means 2nd column in File1 is resembled with 1st column of File2.
I have to create 3rd file File3 with expected output
A|a|1
C|c|3
I tried to take each record in loop and searched for that in File2 using "awk".
It worked, but the problem is both File1 and File2 are having more than 5 million records.
I need an optimized solution.
Upvotes: 1
Views: 97
Reputation: 204628
This is what join
was created to do:
$ join -t '|' -o '1.1,1.2,2.2' -1 2 -2 1 file1 file2
A|a|1
C|c|3
man join
for more details and pay particular attention to the files needing to be sorted on the join fields (i.e. 2nd field for file1 and 1st field for file2), as your posted sample input is.
Upvotes: 1
Reputation: 14979
You can use this awk
,
awk -F'|' 'NR==FNR{a[$2]=$1;next} $1 in a { print a[$1],$1,$2 }' OFS="|" file1 file2 > file3
More clearer way:
awk 'BEGIN{ OFS=FS="|";} NR==FNR{a[$2]=$1;next} $1 in a { print a[$1],$1,$2 }' file1 file2 > file3
As per @Kent suggestion:
If your file2
have more than two columns that you want it in file3
then,
awk 'BEGIN{ OFS=FS="|";} NR==FNR{a[$2]=$1;next} $1 in a { print a[$1],$0 }' file1 file2 > file3
Here,
FS
- Field Separator
OFS
- Output Field Separator
Upvotes: 2