Reputation: 446
I have a large tab-delimited file which looks like:
rhaB: IENJKMAH_01395 MACAJNEK_00455 OLCKBDOH_04002 PMOGBMCF_03363 ANGDFNGL_03589
exuT_1: OLCKBDOH_00247 EHNKCCHC_00463 MACAJNEK_00987 PMOGBMCF_00492 LPCGNNBB_01394
recA: OLCKBDOH_01231 MOEFEGAP_03152 JFGDENGL_01411 DNGGHEME_03701 KFALDAGO_00482
lldP: OLCKBDOH_02876 EHNKCCHC_01431 HHOCJGFI_02180 MACAJNEK_01950 KDLNNIOI_00263
I want to append the text from the first column (without the trailing colon) to every other field in the same row, so that the output looks like:
rhaB: IENJKMAH_01395_rhaB MACAJNEK_00455_rhaB OLCKBDOH_04002_rhaB PMOGBMCF_03363_rhaB ANGDFNGL_03589_rhaB
The reason is that I ultimately have to delete the first column, and I want to be able to trace each ID back to the row it came from.
Upvotes: 1
Views: 446
Reputation: 92884
awk approach:
awk '{suffix=substr($1,1,length($1)-1); for(i=2;i<=NF;i++) $i=$i"_"suffix}1' file
The output:
rhaB: IENJKMAH_01395_rhaB MACAJNEK_00455_rhaB OLCKBDOH_04002_rhaB PMOGBMCF_03363_rhaB ANGDFNGL_03589_rhaB
exuT_1: OLCKBDOH_00247_exuT_1 EHNKCCHC_00463_exuT_1 MACAJNEK_00987_exuT_1 PMOGBMCF_00492_exuT_1 LPCGNNBB_01394_exuT_1
recA: OLCKBDOH_01231_recA MOEFEGAP_03152_recA JFGDENGL_01411_recA DNGGHEME_03701_recA KFALDAGO_00482_recA
lldP: OLCKBDOH_02876_lldP EHNKCCHC_01431_lldP HHOCJGFI_02180_lldP MACAJNEK_01950_lldP KDLNNIOI_00263_lldP
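suffix=substr($1,1,length($1)-1) - takes the 1st column's value without the trailing ":"
for(i=2;i<=NF;i++) $i=$i"_"suffix - appends "_" and the suffix value to each of the following columns
1 - an always-true pattern with no action, so awk prints each (modified) line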
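One caveat: assigning to a field makes awk rebuild the line using the output field separator, which is a single space by default, so tabs in the input come out as spaces. If the file really is tab-delimited and should stay that way, a minimal sketch could set both separators to a tab. The function name append_label is only a hypothetical wrapper for illustration, and sub(/:$/, "", label) is swapped in for the substr() call so that the last character is removed only when it is a colon.

```shell
# Hypothetical helper: append the row label (field 1 minus its trailing colon)
# to every other field, keeping tabs as the separator on input and output.
append_label() {
    awk 'BEGIN { FS = OFS = "\t" }
    {
        label = $1
        sub(/:$/, "", label)          # strip one trailing colon, if present
        for (i = 2; i <= NF; i++)
            $i = $i "_" label         # field assignment rebuilds $0 with OFS
        print
    }' "$@"
}

# Small tab-separated sample piped through the helper
printf 'rhaB:\tIENJKMAH_01395\tMACAJNEK_00455\n' | append_label
```

Called with a file name (append_label file) it reads that file instead of standard input.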
To get aligned ("beautified") column output, you can pipe the result through column -tx:
awk '{suffix=substr($1,1,length($1)-1); for(i=2;i<=NF;i++) $i=$i"_"suffix}1' file | column -tx
The output:
rhaB:    IENJKMAH_01395_rhaB    MACAJNEK_00455_rhaB    OLCKBDOH_04002_rhaB    PMOGBMCF_03363_rhaB    ANGDFNGL_03589_rhaB
exuT_1:  OLCKBDOH_00247_exuT_1  EHNKCCHC_00463_exuT_1  MACAJNEK_00987_exuT_1  PMOGBMCF_00492_exuT_1  LPCGNNBB_01394_exuT_1
recA:    OLCKBDOH_01231_recA    MOEFEGAP_03152_recA    JFGDENGL_01411_recA    DNGGHEME_03701_recA    KFALDAGO_00482_recA
lldP:    OLCKBDOH_02876_lldP    EHNKCCHC_01431_lldP    HHOCJGFI_02180_lldP    MACAJNEK_01950_lldP    KDLNNIOI_00263_lldP
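Since the question mentions that the first column will eventually be deleted, both steps could probably be done in a single pass. The following is only a sketch under the assumption that the input is tab-separated; suffix_and_drop_label is a hypothetical name, not something from the question.

```shell
# Hypothetical helper: suffix every ID with its row label, then print the
# row without the label column. Assumes tab-separated input.
suffix_and_drop_label() {
    awk 'BEGIN { FS = OFS = "\t" }
    {
        label = $1
        sub(/:$/, "", label)          # strip one trailing colon, if present
        line = ""
        for (i = 2; i <= NF; i++)     # rebuild the row from field 2 onwards
            line = line (i > 2 ? OFS : "") $i "_" label
        print line
    }' "$@"
}

# Small tab-separated sample piped through the helper
printf 'recA:\tOLCKBDOH_01231\tMOEFEGAP_03152\n' | suffix_and_drop_label
```

In the sample data every ID has the form NAME_NUMBER, so the original label can later be read back as everything after the second underscore, which also covers labels such as exuT_1 that contain an underscore themselves.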
Upvotes: 1