Reputation: 11
I'm a bit of a noob with this so forgive my ignorance if I am not clear enough. I have two files. File1 looks like below.
file1
AA--:Aa-to-Bb:VGG sometext 1223344 10000 sometext
AB--:Aa-to-Bb:VGG sometext 1223344 12000 sometext
CC--:Dd-to-Ee:VGG sometext 1223344 44000 sometext
AA--:Zz-to-Yy:VGG sometext 1223344 51000 sometext
DD--:Aa-to-Dd:VGG sometext 1223344 36000 sometext
File2 looks similar but there are differences in columns 1 and 4.
file2
CC--:Dd-to-Ee:VGG6645 sometext 1223344 3000 sometext
AA--:Zz-to-Yy:VGG2244 sometext 1223344 1000 sometext
AA--:Aa-to-Bb:VGG12345 sometext 1223344 12200 sometext
AB--:Aa-to-Bb:VGG76523 sometext 1223344 33300 sometext
EE--:Dd-to-Ee:VGG45123 sometext 1223344 76500 sometext
DD--:Aa-to-Dd:VGG3486 sometext 1223344 400 sometext
AA--:Dd-to-Ee:VGG4512 sometext 1223344 22300 sometext
AA--:Zz-to-Dd:VGG98765 sometext 1223344 7000 sometext
CC--:Dd-to-Zz:VGG43576 sometext 1223344 900 sometext
FF--:Zz-to-Yy:VGG5645 sometext 1223344 91200 sometext
CC--:Zz-to-Ee:VGG23456 sometext 1223344 3400 sometext
AA--:Ff-to-Yy:VGG1111 sometext 1223344 51000 sometext
I am trying to do some auto-processing via a bash script. One of the tasks is to compare the two files. If an entry in file1 column 1 matches an entry in file2 column 1 up to and including the "VGG", then the value on column 4 in file1 replaces the value in column 4 in file2.
new file2
CC--:Dd-to-Ee:VGG6645 sometext 1223344 44000 sometext
AA--:Zz-to-Yy:VGG2244 sometext 1223344 51000 sometext
AA--:Aa-to-Bb:VGG12345 sometext 1223344 10000 sometext
AB--:Aa-to-Bb:VGG76523 sometext 1223344 12000 sometext
EE--:Dd-to-Ee:VGG45123 sometext 1223344 76500 sometext
DD--:Aa-to-Dd:VGG3486 sometext 1223344 36000 sometext
AA--:Dd-to-Ee:VGG4512 sometext 1223344 22300 sometext
AA--:Zz-to-Dd:VGG98765 sometext 1223344 7000 sometext
CC--:Dd-to-Zz:VGG43576 sometext 1223344 900 sometext
FF--:Zz-to-Yy:VGG5645 sometext 1223344 91200 sometext
CC--:Zz-to-Ee:VGG23456 sometext 1223344 3400 sometext
AA--:Ff-to-Yy:VGG1111 sometext 1223344 51000 sometext
I have tried using various awk and sed commands, as well as grep, sort, and others and I cannot get the results I am looking for. Any help or suggestions you have would be greatly appreciated.
Upvotes: 0
Views: 462
Reputation: 58371
This might work for you (GNU sed):
sed -r 's|^(\S+)(\s(\S+)){3}\s.*|/^\1/s/\\S+/\3/4|' file2 | sed -rf - -i file1
This makes a sed script from file2 and then runs it against file1.
The sed script generated from file2, matches a line in the object file using the first column value and then replaces the fourth column in the object file with the value in the its fourth column.
Upvotes: 0
Reputation: 21955
Below awk
may help
$ awk 'NR==FNR{rec[$1]=$4;next}{temp=$1;sub(/VGG[0-9]+$/,"VGG",temp)} temp in rec{$4=rec[temp]}1' file1 file2
Output
CC--:Dd-to-Ee:VGG6645 sometext 1223344 44000 sometext
AA--:Zz-to-Yy:VGG2244 sometext 1223344 51000 sometext
AA--:Aa-to-Bb:VGG12345 sometext 1223344 10000 sometext
AB--:Aa-to-Bb:VGG76523 sometext 1223344 12000 sometext
EE--:Dd-to-Ee:VGG45123 sometext 1223344 76500 sometext
DD--:Aa-to-Dd:VGG3486 sometext 1223344 36000 sometext
AA--:Dd-to-Ee:VGG4512 sometext 1223344 22300 sometext
AA--:Zz-to-Dd:VGG98765 sometext 1223344 7000 sometext
CC--:Dd-to-Zz:VGG43576 sometext 1223344 900 sometext
FF--:Zz-to-Yy:VGG5645 sometext 1223344 91200 sometext
CC--:Zz-to-Ee:VGG23456 sometext 1223344 3400 sometext
AA--:Ff-to-Yy:VGG1111 sometext 1223344 51000 sometext
If you need to rewrite file2
$ awk 'NR==FNR{rec[$1]=$4;next}
{temp=$1;sub(/VGG[0-9]+$/,"VGG",temp)}
temp in rec{$4=rec[temp]}1' file1 file2 >filetmp && mv filetmp file2
Upvotes: 1