chas

Reputation: 1645

Replace zero with text using sed or awk

I have a text file that looks like this:

 0  chr23:54039     0   54039
 0  chr23:103278    0   103278
 0  chr22:174609    0   174609
 0  chr22:54039     0   54039
 0  chr25:103278    0   103278
 0  chr25:174609    0   174609
 26 chr26:174609    0   174609

If the first column is 0, I need to replace that 0 with the number that follows chr in the second column. The output should look like this:

23  chr23:54039     0   54039
23  chr23:103278    0   103278
22  chr22:174609    0   174609
22  chr22:54039     0   54039
25  chr25:103278    0   103278
25  chr25:174609    0   174609
26  chr26:174609    0   174609

Can anyone provide a simple solution using sed, awk, or any other Linux tool?

Upvotes: 1

Views: 439

Answers (3)

NeronLeVelu

Reputation: 10039

sed "s/^0[[:blank:]]\{1,\}chr\([0-9]\{1,\}\):/\1 chr\1:/"

Upvotes: 0

Jotne

Reputation: 41456

If the number in column 1 is always the same as the chr number, you can do this with awk. Note that this overwrites column 1 on every line, not only on lines where it is 0:

awk '{split($2,a,":|chr");$1=a[2]}1' file
23 chr23:54039 0 54039
23 chr23:103278 0 103278
22 chr22:174609 0 174609
22 chr22:54039 0 54039
25 chr25:103278 0 103278
25 chr25:174609 0 174609
26 chr26:174609 0 174609

Upvotes: 6

fedorqui

Reputation: 289545

With sed:

$ sed -r '/^0/s/0(\s*chr)([^:]*)/\2\1\2/g' file
23  chr23:54039     0   54039
23  chr23:103278    0   103278
22  chr22:174609    0   174609
22  chr22:54039     0   54039
25  chr25:103278    0   103278
25  chr25:174609    0   174609
26 chr26:174609    0   174609

Without -r:

$ sed '/^0/s/0\(\s*chr\)\([^:]*\)/\2\1\2/g' file
23  chr23:54039     0   54039
23  chr23:103278    0   103278
22  chr22:174609    0   174609
22  chr22:54039     0   54039
25  chr25:103278    0   103278
25  chr25:174609    0   174609
26 chr26:174609    0   174609

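The idea is to act only on lines starting with 0. On those lines, the 0 is matched, the whitespace plus chr is captured as group 1, and the number before the colon is captured as group 2. The replacement then prints group 2, group 1, and group 2 again, which puts the number in place of the 0 and keeps the original spacing.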
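Since \s is a GNU sed extension and -r is not part of POSIX, the same capture-and-reorder idea can be written with POSIX character classes. This is a sketch under assumptions rather than a definitive version: fill_chr_sed is a hypothetical function name, it reads standard input, and it additionally tolerates leading blanks in case the leading space shown in the question's sample is really in the file. It also requires a blank right after the 0, so values such as 01 are left alone.

```shell
# Hypothetical helper: POSIX basic-regex variant of the sed approach.
fill_chr_sed() {
  # \1 = optional leading blanks
  # \2 = blanks after the 0, plus the literal "chr"
  # \3 = the digits that follow "chr"
  sed 's/^\([[:blank:]]*\)0\([[:blank:]]\{1,\}chr\)\([0-9]\{1,\}\)/\1\3\2\3/'
}
```

Run it as fill_chr_sed < file. Lines whose first column is not exactly 0 pass through unchanged.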

With awk:

$ awk '/^0/ {split($2,a,":"); gsub("chr", "", a[1]); $1=a[1]}1' file
23 chr23:54039 0 54039
23 chr23:103278 0 103278
22 chr22:174609 0 174609
22 chr22:54039 0 54039
25 chr25:103278 0 103278
25 chr25:174609 0 174609
26 chr26:174609    0   174609

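On lines starting with 0, the second field is split on the : delimiter and the chr text is removed from the first piece, which is then stored as the first field. Assigning to a field makes awk rebuild the line with single spaces, which is why the spacing changes on modified lines only. The trailing 1 is an always-true condition, so every line is printed, modified or not.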
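The pattern /^0/ also matches first fields such as 01 or 0.5, and it misses lines with leading whitespace. A slightly stricter sketch tests the first field itself. It is an illustration under assumptions, not the answer's own code: fill_chr_awk is a hypothetical name and it reads standard input.

```shell
# Hypothetical helper: replace column 1 only when it is exactly the string 0.
fill_chr_awk() {
  awk '$1 == "0" {
         n = $2
         sub(/^chr/, "", n)   # drop the leading "chr"
         sub(/:.*$/, "", n)   # drop ":" and everything after it
         $1 = n               # assigning a field rebuilds the line with single spaces
       }
       1'
}
```

Run it as fill_chr_awk < file. Because awk splits fields on runs of blanks by default, leading spaces before the 0 do not prevent the match.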

Upvotes: 3
