kernash

Reputation: 25

Why does sed not process all occurrences?

Or: How do I get sed to replace all occurrences?

$ echo ":::::::" |sed -e 's/::/:0:/g'
:0::0::0::

I would expect/want the output to be :0:0:0:0:0:0:

Upvotes: 1

Views: 80

Answers (4)

F. Hauri  - Give Up GitHub

Reputation: 70947

Because once a :: has been matched, the search for the next :: starts after it, so the next match cannot reuse the last : of the previous pair!

Your string is parsed as ::, ::, :: and a final :, that is, three :: pairs followed by a single :. Replacing each pair with :0: gives :0::0::0::, so sed's output is correct!

So the operation has to be done twice to ensure that every :: pair is modified.
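To see how the pairs are consumed, one can wrap each match in brackets instead of replacing it. This is only an illustrative sketch; the variable names input and marked are made up for the example.

```shell
# Mark every non-overlapping "::" match; in the replacement, & stands for the matched text.
input=':::::::'
marked=$(printf '%s\n' "$input" | sed 's/::/[&]/g')
printf '%s\n' "$marked"    # prints [::][::][::]:
```

The three bracketed pairs and the leftover colon are exactly the pieces described above.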

sed first

As requested, I answer with sed first, because sed is the most efficient tool for this operation on big streamed files. Further on, there is a more efficient approach for variables and small files.

hardcoding two steps

Simply:

echo ":::::::" | sed -e 's/::/:0:/g;s/::/:0:/g'
:0:0:0:0:0:0:

Like many programmers (publishing here), this was not the first solution I thought of. Still, for this problem, hardcoding two steps is the most efficient solution!
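Why two passes are enough: after the first global pass, every remaining :: sits right after an inserted 0 (as in :0::0:), so the remaining pairs no longer touch each other and a second global pass can replace all of them. The following sketch checks this for runs of 1 to 12 colons. The range and the variable names (run, out, ok) are arbitrary choices for the example, not part of the original answer.

```shell
# Build runs of 1..12 colons and check that two global passes leave no "::".
ok=1
run=''
for n in 1 2 3 4 5 6 7 8 9 10 11 12; do
    run="$run:"
    out=$(printf '%s\n' "$run" | sed -e 's/::/:0:/g;s/::/:0:/g')
    case $out in
        *::*) ok=0; printf 'still has "::" for n=%s: %s\n' "$n" "$out" ;;
    esac
done
printf 'two passes sufficient: %s\n' "$ok"
```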

loop now

Because I hate hardcoding, the first solution I found was to loop until every :: pair has been changed:

So we have to add a conditional branch:

echo ":::::::" |sed -e ':a;s/::/:0:/;ta'
:0:0:0:0:0:0:

The t statement branches to the :a label if the last s/// operation made a substitution.

So sed will keep looping over the line as long as a :: is found.

Note that I've dropped the g flag here, which makes sed loop over the line more times. Compare (I use the lowercase l command to dump the current line; see info sed for more information):
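The same loop can be written as a multi-line sed script with one command per line, which leaves room for comments. This is a sketch of the one-liner above, not a different method; the variable name result is only there for the example.

```shell
result=$(printf '%s\n' ':::::::' | sed -e ':a
# replace only the first "::" on the line
s/::/:0:/
# t: jump back to label a if the s command above changed something
ta')
printf '%s\n' "$result"    # prints :0:0:0:0:0:0:
```

Putting the label on its own line also avoids relying on how a given sed treats a semicolon after a label.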

echo ":::::::" |sed -e ':a;s/::/:0:/;l;ta'
:0::::::$
:0:0:::::$
:0:0:0::::$
:0:0:0:0:::$
:0:0:0:0:0::$
:0:0:0:0:0:0:$
:0:0:0:0:0:0:$
:0:0:0:0:0:0:

with 6 substituting passes (one per 0 added) plus a final pass that finds nothing, and

echo ":::::::" |sed -e ':a;s/::/:0:/g;l;ta'
:0::0::0::$
:0:0:0:0:0:0:$
:0:0:0:0:0:0:$
:0:0:0:0:0:0:

where 2 passes do the whole job.

bash now.

Of course, this could be done with many other tools.

But it can also be done very efficiently in pure bash:

var=':::::::'
echo "${var//::/:0:}"
:0::0::0::

With the same problem!

hardcoded

var=':::::::'
var=${var//::/:0:}
var=${var//::/:0:}
echo "$var"
:0:0:0:0:0:0:

In a read loop:

echo >testfile ':::::::'
while read -r var ;do
    var=${var//::/:0:}
    echo "${var//::/:0:}"
done <testfile
:0:0:0:0:0:0:

loop

Written as a loop, this could look like:

var=':::::::'
while [ "$var" != "${var//::/:0:}" ] ;do
    var="${var//::/:0:}"
done
echo "$var"
:0:0:0:0:0:0:

As this doesn't have to fork another binary, it could be quicker.

And if performing the substitution twice per iteration seems too much, you could write this:

var=':::::::'
while tmpvar="${var//::/:0:}" ; [ "$var" != "$tmpvar" ] ;do
    var="$tmpvar"
done
echo "$var"
:0:0:0:0:0:0:

Show steps:

var=':::::::'
while
  tmpvar="${var//::/:0:}"
  [ "$var" != "$tmpvar" ]
  do
    var="$tmpvar"
    printf 'Step: \047%s\047\n' "$tmpvar"
done
Step: ':0::0::0::'
Step: ':0:0:0:0:0:0:'

echo "$var"
:0:0:0:0:0:0:

Or, if you drop the first / to make only one substitution per step:

var=':::::::'
while
  tmpvar="${var/::/:0:}"
  [ "$var" != "$tmpvar" ]
  do
    var="$tmpvar"
    printf 'Step: \047%s\047\n' "$tmpvar"
done
Step: ':0::::::'
Step: ':0:0:::::'
Step: ':0:0:0::::'
Step: ':0:0:0:0:::'
Step: ':0:0:0:0:0::'
Step: ':0:0:0:0:0:0:'

But this becomes slower!
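For shells without the ${var//...} expansion, the one-substitution-per-step loop can be approximated with POSIX parameter expansion only. This is a rough sketch under that assumption, not something from the original answer: the function name fill_empty and the variable s are hypothetical, and s is global because POSIX sh has no local.

```shell
# Replace "::" with ":0:" repeatedly, using only POSIX sh features (no fork).
fill_empty() {
    s=$1
    while :; do
        case $s in
            # ${s%%::*} is the text before the first "::", ${s#*::} the text after it
            *::*) s="${s%%::*}:0:${s#*::}" ;;
            *) break ;;
        esac
    done
    printf '%s\n' "$s"
}

fill_empty ':::::::'    # prints :0:0:0:0:0:0:
```

Like the single-/ variant above, it makes one substitution per iteration, so it trades speed for portability.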

Upvotes: 1

anubhava

Reputation: 785651

You may use a label and jump instruction in gnu-sed:

echo ":::::::" | sed ':a; s/::/:0:/g; ta'

:0:0:0:0:0:0:

Or using POSIX sed:

echo ":::::::" | sed -e ':a' -e 's/::/:0:/g; ta'

Or using awk:

awk 'BEGIN {FS=OFS=":"} {for (i=2; i<NF; ++i) $i="0"} 1' <<< ':::::::'

:0:0:0:0:0:0:
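Note that the awk command above sets every inner field to 0, which is fine for the sample because all fields are empty. If the real input may contain non-empty fields, a guarded variant could look like the sketch below; the sample input a::b:::c and the variable name filled are made up for illustration.

```shell
# Only fill inner fields that are empty; leave existing values alone.
filled=$(printf '%s\n' 'a::b:::c' |
    awk 'BEGIN { FS = OFS = ":" } { for (i = 2; i < NF; ++i) if ($i == "") $i = "0" } 1')
printf '%s\n' "$filled"    # prints a:0:b:0:0:c
```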

Upvotes: 2

RavinderSingh13

Reputation: 133670

With awk, you could try the following, written and tested with the shown samples. It globally substitutes every : with :0 and, once that is done, removes the trailing 0.

echo ":::::::" | awk '{gsub(/:/,"&0");sub(/0$/,"")} 1'

Upvotes: 1

choroba

Reputation: 242038

When sed replaces :: with :0:, the next match doesn't start with the final : of the previous match, but with the following colon.

You can use Perl's look-ahead assertion:

echo ":::::::" | perl -pe 's/:(?=:)/:0/g'

(?=:) means "the following character is a colon, but it's not part of the match". Therefore, the next match can start with the following colon.

Upvotes: 2
