Reputation: 33
I want to replace the second column of my first file
file 1:
2 rs58086319 0 983550 T C
2 rs56809628 0 983571 T C
2 rs7608441 0 983572 A G
2 rs114910509 0 983579 A G
2 var_chr2_983614 0 983614 T C
2 var_chr2_983624 0 983624 A G
2 rs115188027 0 983632 A C
2 var_chr2_983636 0 983636 T C
2 var_chr2_983650 0 983650 A G
2 var_chr2_983660 0 983660 T C
with the first column of my second file
file 2:
2_983550_T_C
2_983571_T_C
2_983572_A_G
2_983579_A_G
2_983614_T_C
2_983624_A_G
2_983632_A_C
2_983636_T_C
2_983650_A_G
2_983660_T_C
I've tried join and awk, but somehow neither seems to work. I suspect the problem is the '_' characters in my second file.
Thank you
Upvotes: 1
Views: 87
Reputation: 47189
I would go with paste and awk, e.g.:
paste file1 file2 | awk '{ $2 = $NF } NF--' OFS='\t'
Output:
2 2_983550_T_C 0 983550 T C
2 2_983571_T_C 0 983571 T C
2 2_983572_A_G 0 983572 A G
2 2_983579_A_G 0 983579 A G
2 2_983614_T_C 0 983614 T C
2 2_983624_A_G 0 983624 A G
2 2_983632_A_C 0 983632 A C
2 2_983636_T_C 0 983636 T C
2 2_983650_A_G 0 983650 A G
2 2_983660_T_C 0 983660 T C
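Decrementing NF to drop the last field works in gawk and mawk, but as far as I know not every awk rebuilds the record that way. A minimal sketch of an alternative that prints the fields explicitly, assuming file1 always has exactly six columns (the scratch directory and the two-line sample files are only for illustration):

```shell
# Scratch directory so nothing real is overwritten.
tmp=$(mktemp -d)

# Two-line stand-ins for the files in the question.
printf '%s\n' '2 rs58086319 0 983550 T C' '2 rs56809628 0 983571 T C' > "$tmp/file1"
printf '%s\n' '2_983550_T_C' '2_983571_T_C' > "$tmp/file2"

# paste appends the file2 ID as a 7th field; print it in place of field 2.
# Assumption: file1 has exactly six whitespace-separated columns.
result=$(paste "$tmp/file1" "$tmp/file2" |
    awk -v OFS='\t' '{ print $1, $7, $3, $4, $5, $6 }')
printf '%s\n' "$result"
```

This gives the same tab-separated output as the one-liner above without touching NF, at the cost of hard-coding the column count.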
Upvotes: 0
Reputation: 26531
I'm a bit puzzled why you need a second file. All information of file2 seems to be encoded in file1. You could just do something like this:
awk '{$2=$1"_"$4"_"$5"_"$6}1' file1
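Whether file2 really is redundant can be checked before relying on this. A rough sketch, assuming both files are in the same line order and using two-line stand-ins for the real data: build the ID from columns 1, 4, 5 and 6 of file1 and diff it against file2.

```shell
# Scratch directory with small stand-ins for the question's files.
tmp=$(mktemp -d)
printf '%s\n' '2 rs58086319 0 983550 T C' '2 var_chr2_983614 0 983614 T C' > "$tmp/file1"
printf '%s\n' '2_983550_T_C' '2_983614_T_C' > "$tmp/file2"

# Join columns 1, 4, 5 and 6 with "_" and compare line by line with file2.
# diff exits 0 only when the derived IDs and file2 are identical.
if awk '{ print $1 "_" $4 "_" $5 "_" $6 }' "$tmp/file1" | diff - "$tmp/file2" >/dev/null
then
    status="identical"
else
    status="different"
fi
echo "$status"
```

If this reports a difference on the real files, file2 carries information that file1 does not, and one of the two-file approaches is the safer choice.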
Upvotes: 2
Reputation: 2491
Your file2 has only one column, so with awk:
awk -v f='file2' '{getline $2 <f}1' file1
If the separator of file2 is "_" and you only want its first field:
awk -v f='file2' '{getline a <f;split(a,b,"_");$2=b[1]}1' file1
Upvotes: 1
Reputation: 133700
EDIT: In case you want to use _ as the field separator for file2, the following may help you.
awk 'FNR==NR{a[FNR]=$1;next} (FNR in a){$2=a[FNR]} 1' FS="_" file2 FS=" " file1 | column -t
The following awk command may help you here.
awk 'FNR==NR{a[FNR]=$0;next} (FNR in a){$2=a[FNR]} 1' file2 file1 | column -t
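For readers new to the FNR==NR idiom, here is the same command spread out with comments. It is only a sketch on two-line sample files (the scratch directory and the array name id are illustrative), and column -t is left out so the raw awk output is visible:

```shell
# Scratch directory with small stand-ins for the question's files.
tmp=$(mktemp -d)
printf '%s\n' '2 rs58086319 0 983550 T C' '2 rs56809628 0 983571 T C' > "$tmp/file1"
printf '%s\n' '2_983550_T_C' '2_983571_T_C' > "$tmp/file2"

result=$(awk '
    # While reading the first file named below (file2), NR equals FNR:
    # store each whole line under its line number, then skip the other rules.
    FNR == NR { id[FNR] = $0; next }
    # While reading file1: replace column 2 if file2 had a line with this number.
    (FNR in id) { $2 = id[FNR] }
    # Print every file1 line; modified lines are rebuilt with single spaces.
    { print }
' "$tmp/file2" "$tmp/file1")
printf '%s\n' "$result"
```

Because of the (FNR in id) guard, lines of file1 beyond the end of file2 are printed unchanged rather than given an empty second column.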
Upvotes: 1