Reputation: 1645
I have a text file that looks like this:
0 chr23:54039 0 54039
0 chr23:103278 0 103278
0 chr22:174609 0 174609
0 chr22:54039 0 54039
0 chr25:103278 0 103278
0 chr25:174609 0 174609
26 chr26:174609 0 174609
If the first column is 0, I need to replace that 0 with the number after chr in the second column. The output should look like this:
23 chr23:54039 0 54039
23 chr23:103278 0 103278
22 chr22:174609 0 174609
22 chr22:54039 0 54039
25 chr25:103278 0 103278
25 chr25:174609 0 174609
26 chr26:174609 0 174609
Can anyone provide a simple sed, awk, or other Linux solution?
Upvotes: 1
Views: 439
Reputation: 41456
If the number in column 1 is always either 0 or the same as the chr number, you can overwrite it on every line with awk:
awk '{split($2,a,":|chr");$1=a[2]}1' file
23 chr23:54039 0 54039
23 chr23:103278 0 103278
22 chr22:174609 0 174609
22 chr22:54039 0 54039
25 chr25:103278 0 103278
25 chr25:174609 0 174609
26 chr26:174609 0 174609
Upvotes: 6
Reputation: 289545
With sed:
$ sed -r '/^0/s/0(\s*chr)([^:]*)/\2\1\2/g' file
23 chr23:54039 0 54039
23 chr23:103278 0 103278
22 chr22:174609 0 174609
22 chr22:54039 0 54039
25 chr25:103278 0 103278
25 chr25:174609 0 174609
26 chr26:174609 0 174609
Without -r:
$ sed '/^0/s/0\(\s*chr\)\([^:]*\)/\2\1\2/g' file
23 chr23:54039 0 54039
23 chr23:103278 0 103278
22 chr22:174609 0 174609
22 chr22:54039 0 54039
25 chr25:103278 0 103278
25 chr25:174609 0 174609
26 chr26:174609 0 174609
The idea is to act only on lines starting with 0. In those, the 0...chrNUM:... part is captured and printed back in the desired format.
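A stricter variant of the same idea, as a sketch: /^0/ also matches a first column such as 05, so the pattern below requires whitespace right after the leading 0. It assumes a sed that accepts -E (GNU and BSD sed both do); fix_chr_sed is a hypothetical helper name, not part of the original answer.

```shell
# fix_chr_sed is a hypothetical helper name; it reads lines on stdin.
fix_chr_sed() {
  # ^0 anchors to a leading 0, and [[:space:]]+ requires whitespace right
  # after it, so first columns such as 05 are left untouched.
  # \1 is the whitespace plus "chr", \2 is the number up to the colon.
  sed -E 's/^0([[:space:]]+chr)([^:]+)/\2\1\2/'
}

# Usage with two lines from the question's sample data:
printf '%s\n' '0 chr23:54039 0 54039' '26 chr26:174609 0 174609' | fix_chr_sed
```

The g flag is dropped here because the pattern is anchored with ^ and can match at most once per line.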
With awk:
$ awk '/^0/ {split($2,a,":"); gsub("chr", "", a[1]); $1=a[1]}1' file
23 chr23:54039 0 54039
23 chr23:103278 0 103278
22 chr22:174609 0 174609
22 chr22:54039 0 54039
25 chr25:103278 0 103278
25 chr25:174609 0 174609
26 chr26:174609 0 174609
For lines starting with 0, the 2nd field is split on the : delimiter and the chr text is then removed from the first piece, which is stored as the new first field. The trailing 1 is an always-true condition, so every line, modified or not, is printed.
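If the first column might hold values that merely begin with 0, an exact comparison on the field is a little safer than /^0/. The following is a minimal sketch under that assumption; fix_chr_awk and the array name parts are made up for illustration.

```shell
# fix_chr_awk is a hypothetical helper name; it reads lines on stdin.
fix_chr_awk() {
  awk '$1 == "0" {
         split($2, parts, ":")      # parts[1] is e.g. "chr23"
         sub(/^chr/, "", parts[1])  # strip the leading "chr"
         $1 = parts[1]              # assigning $1 rebuilds the line with single spaces
       }
       1'                           # always-true pattern: print every line
}

# Usage with two lines from the question's sample data:
printf '%s\n' '0 chr22:174609 0 174609' '26 chr26:174609 0 174609' | fix_chr_awk
```

Comparing against the string "0" forces a string comparison, so a first field of 05 or 00 is left as it is.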
Upvotes: 3