Reputation: 3022
I am trying to use awk to parse multiple conditions and am having some trouble with the first. I think the code below is close, but it does not return the desired output. The parse rules are: Thank you :).
awk -F"[_.>]" 'FNR>1 {X=$4+0; sub(X, "", $4); print $2+0, X, X, $4, $5}' OFS="\t" ${id}_position.txt > ${id}_parse.txt
id_position.txt
Input Variant Errors Chromosomal Variant Coding Variant(s)
NM_004004.5:c.79G>A NC_000013.10:g.20763642C>T NM_004004.5:c.79G>A XM_005266354.1:c.79G>A XM_005266355.1:c.79G>A XM_005266356.1:c.79G>A
Desired output:
13 20763642 20763642 C T
Upvotes: 0
Views: 85
Reputation: 41460
This should do:
awk 'NR==2 {split($2,a,"[_.>]");b=substr(a[4],1,length(a[4])-1);print a[2]+0,b,b,substr(a[4],length(a[4])),a[5]}' OFS="\t" file
13 20763642 20763642 C T
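A variant of the same idea, as a rough sketch: it runs on every data row (FNR>1, as in the question) instead of only line 2, and uses match() to find where the digits of the position end rather than assuming the reference allele is a single character. The function name parse_positions is hypothetical, and the sketch assumes the chromosomal variant is always the second whitespace-separated field, in the NC_...:g. form shown in the question.

```shell
# parse_positions FILE (hypothetical helper name):
# print chromosome, start, end, ref, alt (tab-separated) for each data row.
parse_positions() {
    awk -v OFS='\t' 'FNR > 1 {
        # $2 is assumed to look like NC_000013.10:g.20763642C>T
        split($2, a, "[_.>]")          # a[2]=000013  a[4]=20763642C  a[5]=T
        if (match(a[4], /^[0-9]+/)) {  # leading digits are the position
            pos = substr(a[4], 1, RLENGTH)
            ref = substr(a[4], RLENGTH + 1)
            print a[2] + 0, pos, pos, ref, a[5]
        }
    }' "$1"
}
```

It could then be called the way the question does, for example parse_positions "${id}_position.txt" > "${id}_parse.txt". Rows whose second field does not contain a numeric position are skipped silently.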
Upvotes: 0