roberutsu

Reputation: 363

Using awk to merge two files associating numbers

I have two files. Note that I'd like to use the last column as the reference:

1) First file:

[robert@10-2Fontes]$ head rxid 
0.086297     id 0    udp    767     0
0.091866     id 1    udp    760     1
0.097236     id 2    udp    733     2
0.103616     id 3    udp    869     3
0.110956     id 4    udp    1000    4
0.459247     id 9    udp    754     54

Note: this file has 64196 lines.

2) Second File (reference):

[robert@10-2Fontes]$ head pumba.txt 
0.086297 0
0.091866 1
0.097236 2
0.103616 3
0.110956 4
0.118285 5
0.125615 6
0.130077 7
0.459247 54

This file is an index and has 64677 lines.

3) I would like a third file: for each line of the first file, look up its last column in file 2 and append the corresponding number from file 2 to the end of that line. Something like this:

0.086297     id 0    udp    767     0 0.086297
0.091866     id 1    udp    760     1 0.091866
0.097236     id 2    udp    733     2 0.097236
0.103616     id 3    udp    869     3 0.103616
0.110956     id 4    udp    1000    4 0.110956
...

Upvotes: 1

Views: 1099

Answers (2)

Mirage

Reputation: 31568

Just in case: you can also use the first column as the reference, in which case you don't need the last column for the lookup at all. For example, suppose you have

file1:

0.086297     id 0    udp    767     
0.091866     id 1    udp    760     
0.097236     id 2    udp    733     
0.103616     id 3    udp    869     
0.110956     id 4    udp    1000    
0.459247     id 9    udp    754     

and file2:

0.086297 0
0.091866 1
0.097236 2
0.103616 3
0.110956 4
0.118285 5
0.125615 6
0.130077 7
0.459247 54

You can still combine them based on the first column like this:

awk 'NR==FNR{a[$1]=$0; next;}$1 in a {print a[$1]" "$2}' file1.txt file2.txt

  1. NR==FNR{a[$1]=$0; next;}: while reading the first file, the first column is used as the index under which the whole line is stored, and next skips the rest of the script for that line.

  2. For the second file, if the first column $1 exists in array a, the previously saved line is printed followed by the second column $2.

Final Output

0.086297     id 0    udp    767 0
0.091866     id 1    udp    760 1
0.097236     id 2    udp    733 2
0.103616     id 3    udp    869 3
0.110956     id 4    udp    1000 4
0.459247     id 9    udp    754 54
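As a runnable sketch of the two steps above, the snippet below rebuilds a few of the sample lines in a temporary directory and runs the same join. The `sub()` call that strips trailing blanks before storing each line is an addition, not part of the command above; it assumes file1 has trailing spaces as shown, and it keeps the joined lines in the compact form of the final output. The `tmp` and `result` names are placeholders.

```shell
# Work in a scratch directory so no existing files are overwritten.
tmp=$(mktemp -d)

# A few sample lines from file1 (note the trailing blanks) and file2.
printf '%s\n' \
  '0.086297     id 0    udp    767     ' \
  '0.091866     id 1    udp    760     ' \
  '0.459247     id 9    udp    754     ' > "$tmp/file1.txt"
printf '%s\n' \
  '0.086297 0' \
  '0.091866 1' \
  '0.118285 5' \
  '0.459247 54' > "$tmp/file2.txt"

# Pass 1 (file1.txt): strip trailing blanks, store the line keyed by column 1.
# Pass 2 (file2.txt): if column 1 was seen in pass 1, print the stored line
# followed by column 2 of the current line.
result=$(awk 'NR==FNR { sub(/[ \t]+$/, ""); a[$1] = $0; next }
              $1 in a { print a[$1], $2 }' "$tmp/file1.txt" "$tmp/file2.txt")
printf '%s\n' "$result"
```

The line `0.118285 5` has no partner in file1.txt, so it is skipped. Output follows the order of file2.txt, since that is the file being printed from.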

Upvotes: 0

Kent

Reputation: 195179

How about:

awk 'NR==FNR{a[$2]=$1;next}$6 in a{print $0,a[$6]}' file2 file1 > file3
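To see how this one-liner behaves, here is a small sketch that runs it on a few lines taken from the question. It reads file2 first and maps each index number (column 2) to its timestamp (column 1), then appends the matching timestamp to every line of file1 whose sixth column is a known index. The temporary directory and the `tmp` variable are assumptions for the demo only.

```shell
# Work in a scratch directory so no existing files are overwritten.
tmp=$(mktemp -d)

# file1: a few lines in the rxid layout; the last column ($6) is the reference.
printf '%s\n' \
  '0.086297     id 0    udp    767     0' \
  '0.091866     id 1    udp    760     1' \
  '0.459247     id 9    udp    754     54' > "$tmp/file1"

# file2: the index, timestamp in column 1 and reference number in column 2.
printf '%s\n' \
  '0.086297 0' \
  '0.091866 1' \
  '0.118285 5' \
  '0.459247 54' > "$tmp/file2"

# Pass 1 (file2): a[number] = timestamp.
# Pass 2 (file1): if $6 is a known number, print the line plus its timestamp.
awk 'NR==FNR { a[$2] = $1; next }
     $6 in a { print $0, a[$6] }' "$tmp/file2" "$tmp/file1" > "$tmp/file3"
cat "$tmp/file3"
```

Lines of file1 whose last column has no entry in file2 are dropped by the `$6 in a` guard. If the number of columns might vary, `$NF` (the last field) could be used in place of `$6`.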

Upvotes: 2
