Reputation: 45
I would like to replace every comma that occurs between two specific strings (DP and MQ) with a semicolon.
input
0,0,0,DP=1,1,1,1,MQ=2,2,2
expected output
0,0,0,DP=1;1;1;1;MQ=2,2,2
I have a variable number of fields before and after DP and MQ, so I thought sed would be the best tool. I don't want to substitute the commas before DP or after MQ. Could anyone please help? I know it should look something like this:
sed 's/DP=.,.,.,.,MQ/DP=somethingMQ/g'
Thanks in advance
Upvotes: 0
Views: 143
Reputation: 67211
The code below will do it:
awk -F"=" '{OFS="=";gsub(",",";",$2)}1'
tested:
> echo "0,0,0,DP=1,1,1,1,MQ=2,2,2" | awk -F"=" '{OFS="=";gsub(",",";",$2)}1'
0,0,0,DP=1;1;1;1;MQ=2,2,2
or you can use:
perl -plne '$_=~/DP=(.*)MQ/;$a=$1;$a=~s/,/;/g;$_=~s/(.*DP=).*(MQ.*$)/$1$a$2/g'
Tested:
> echo "0,0,0,DP=1,1,1,1,MQ=2,2,2" | perl -plne '$_=~/DP=(.*)MQ/;$a=$1;$a=~s/,/;/g;$_=~s/(.*DP=).*(MQ.*$)/$1$a$2/g'
0,0,0,DP=1;1;1;1;MQ=2,2,2
or
perl -F"=" -ane '$F[1]=~s/,/;/g;print join "=",@F'
tested:
> echo "0,0,0,DP=1,1,1,1,MQ=2,2,2" | perl -F"=" -ane '$F[1]=~s/,/;/g;print join "=",@F'
0,0,0,DP=1;1;1;1;MQ=2,2,2
Upvotes: 0
Reputation: 31548
With awk, you can do it like this (provided the line contains no other = signs):
awk -F"=" '{gsub(",",";",$2); print $1"="$2"="$3}' temp.txt
output
0,0,0,DP=1;1;1;1;MQ=2,2,2
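If the line may contain other = signs, splitting on = no longer isolates the right field. A minimal sketch that avoids field splitting, assuming a POSIX awk and that no = appears between DP= and MQ=, is to locate the span with match() and rewrite only that substring. The function name dp_mq_semicolons is hypothetical, chosen for illustration:

```shell
# Hypothetical helper: reads lines on stdin, writes the rewritten lines to stdout.
dp_mq_semicolons() {
  awk '{
    # match() sets RSTART and RLENGTH to the position and length of the DP=...MQ= span
    if (match($0, /DP=[^=]*MQ=/)) {
      mid = substr($0, RSTART, RLENGTH)
      gsub(/,/, ";", mid)   # change commas only inside the span
      $0 = substr($0, 1, RSTART - 1) mid substr($0, RSTART + RLENGTH)
    }
    print                   # lines without the span pass through unchanged
  }'
}

printf '%s\n' 'A=1,2,DP=1,1,MQ=2,2,X=3,4' | dp_mq_semicolons
# prints: A=1,2,DP=1;1;MQ=2,2,X=3,4
```

Only the first DP=...MQ= span on each line is rewritten in this sketch.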
Upvotes: 0
Reputation: 58351
This might work for you (GNU sed):
sed -r 's/DP.*MQ/\n&\n/;h;y/,/;/;G;s/.*\n(.*)\n.*\n(.*)\n.*\n/\2\1/' file
This sed idiom marks the string in question (using newlines), copies the marked line to the hold space, changes every comma in the pattern space, and then combines the unaltered parts of the copy with the altered string.
The marking of the string may need to be more specific, e.g.:
sed -r 's/DP=[^=]*MQ=/\n&\n/;h;y/,/;/;G;s/.*\n(.*)\n.*\n(.*)\n.*\n/\2\1/' file
If only some lines of the file contain the string in question, use:
sed -r '/DP=[^=]*MQ=/{s//\n&\n/;h;y/,/;/;G;s/.*\n(.*)\n.*\n(.*)\n.*\n/\2\1/}' file
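To see what the mark, copy, alter, and combine steps do, the command can be run one stage at a time. This is only an illustrative trace on the sample line from the question, assuming GNU sed (it relies on \n in the replacement text); the variable names are made up for the example:

```shell
line='0,0,0,DP=1,1,1,1,MQ=2,2,2'

# Stage 1: wrap the DP=...MQ= span in newline markers.
stage1=$(printf '%s\n' "$line" | sed -r 's/DP=[^=]*MQ=/\n&\n/')
printf '%s\n' "$stage1"
# 0,0,0,
# DP=1,1,1,1,MQ=
# 2,2,2

# Stage 2: save a copy (h), turn every comma into a semicolon (y),
# then append the saved copy (G). Three altered pieces, then three original ones.
stage2=$(printf '%s\n' "$line" | sed -r 's/DP=[^=]*MQ=/\n&\n/;h;y/,/;/;G')
printf '%s\n' "$stage2"

# Stage 3: keep the original prefix (\2), the altered middle (\1),
# and the original suffix, which the regex leaves unmatched at the end.
stage3=$(printf '%s\n' "$line" |
  sed -r 's/DP=[^=]*MQ=/\n&\n/;h;y/,/;/;G;s/.*\n(.*)\n.*\n(.*)\n.*\n/\2\1/')
printf '%s\n' "$stage3"
# 0,0,0,DP=1;1;1;1;MQ=2,2,2
```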
Upvotes: 2
Reputation: 195029
If you have GNU sed, this should work with your example:
sed -r 's/(.*DP=)(.*)(MQ=.*)/echo -n \1;echo -n \2 \|tr "," ";"; echo -n \3/ge' input
Test with your example:
kent$ sed -r 's/(.*DP=)(.*)(MQ=.*)/echo -n \1;echo -n \2 \|tr "," ";"; echo -n \3/ge' <<<"0,0,0,DP=1,1,1,1,MQ=2,2,2"
0,0,0,DP=1;1;1;1;MQ=2,2,2
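Note that the e flag makes GNU sed run the substituted text as a shell command, so the data itself ends up on a command line. A different technique that stays inside sed is a substitution loop: replace one comma between DP= and MQ= per pass and branch back until nothing changes. This is a sketch rather than the command above, assuming a sed that accepts -E and that no = appears between DP= and MQ=; the function name dp_mq_loop is hypothetical:

```shell
# Hypothetical helper: reads lines on stdin, writes the rewritten lines to stdout.
dp_mq_loop() {
  # :a defines a label; each pass of s/// converts the last remaining comma
  # inside the DP=...MQ= span; ta jumps back to :a while a substitution succeeded.
  sed -E -e ':a' -e 's/(DP=[^=]*),([^=]*MQ=)/\1;\2/' -e 'ta'
}

printf '%s\n' '0,0,0,DP=1,1,1,1,MQ=2,2,2' | dp_mq_loop
# prints: 0,0,0,DP=1;1;1;1;MQ=2,2,2
```

Because the pattern is anchored at DP= and must reach MQ= without crossing another =, commas before DP and after MQ are left alone.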
Upvotes: 0