Mike Gates

Reputation: 1904

Replacing a sequence of characters with regular expressions

If I have the string: ababa

and I want to replace every "aba" sequence, including overlapping ones, with "c". How do I do this? Replacing the regular expression "aba" with "c" doesn't work: the result is "cba", but I want "cc". I'm guessing this is because the second "a" in the input string is consumed by the first match. Any idea how to do this?

Upvotes: 2

Views: 2957

Answers (6)

Antony Hatchkins

Reputation: 33994

One pass!

s/ab(?=aba)|aba/c/g;

This in fact is the solution!

aba -> c
ababa -> cc
abababa -> ccc
abazaba -> czc
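
For anyone who wants to try this outside Perl, here is a minimal sketch of the same pattern using Python's re module. The choice of Python and the helper name replace_aba are assumptions for illustration, not part of the answer above. The first alternative consumes only "ab" when another "aba" starts right after it, so the shared "a" stays available for the next match.

```python
import re

# Same pattern as the Perl substitution above: take just "ab" when another
# "aba" begins at the next character, otherwise take the whole "aba".
OVERLAPPING_ABA = re.compile(r"ab(?=aba)|aba")


def replace_aba(text: str) -> str:
    """Replace every (possibly overlapping) "aba" with "c" in one pass."""
    return OVERLAPPING_ABA.sub("c", text)


for sample in ("aba", "ababa", "abababa", "abazaba"):
    print(sample, "->", replace_aba(sample))
```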

Upvotes: 3

ennuikiller

Reputation: 46965

It's got to be multi-pass, something like:

s/ab(?=a)/c/g followed by s/a//g

In Perl you can also play around with the pos function, which lets you reset the position of the last match (that is, you would do something like pos = pos - 1). Mastering Perl is a good reference if you want to go down this path.
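
A rough sketch of the two passes, written in Python purely for illustration (the function name two_pass is made up here). Note that the second pass as written deletes every remaining "a", so this only fits inputs in which every "a" belongs to an "aba" match.

```python
import re


def two_pass(text: str) -> str:
    # Pass 1: replace "ab" only when an "a" follows. The lookahead leaves
    # that "a" in place so it can start the next match.
    step1 = re.sub(r"ab(?=a)", "c", text)
    # Pass 2: drop the leftover "a" characters. Caution: this removes every
    # "a" in the string, not only the ones that closed a match.
    return re.sub(r"a", "", step1)


print(two_pass("ababa"))
print(two_pass("abaxa"))
```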

Upvotes: 0

UncleO

Reputation: 8449

Does "c" appear in the original string?

If not, use a loop to repeatedly replace strings. Replace "aba" by "c", and also replace "cba" by "cc".

edit: If c does appear, is there some character that doesn't appear in the original string? Say, z?

Use a loop to replace "aba" by "z", and also replace "zba" by "zz". When the loop finishes, replace all the "z" with "c".
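
One way to sketch this loop in Python (the language, the function name replace_by_loop, and the default placeholder are assumptions for illustration). Replacing one occurrence at a time, and handling placeholder + "ba" before the next "aba", matters here: a global replace of "aba" on "abababa" would give "cbc" and leave nothing for the second rule to fix.

```python
def replace_by_loop(text: str, placeholder: str = "z") -> str:
    """Loop-based replacement of overlapping "aba" with "c".

    Assumes `placeholder` is a single character that never occurs in `text`.
    """
    if placeholder in text:
        raise ValueError("placeholder must not occur in the input")
    follow_up = placeholder + "ba"
    while True:
        if follow_up in text:
            # A placeholder followed by "ba" marks an overlapping "aba".
            text = text.replace(follow_up, placeholder * 2, 1)
        elif "aba" in text:
            text = text.replace("aba", placeholder, 1)
        else:
            break
    # Swap the temporary placeholder for the real replacement at the end.
    return text.replace(placeholder, "c")


print(replace_by_loop("ababababa"))
```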

Upvotes: 0

Paul Creasey

Reputation: 28844

I don't think this is possible in a single step, since the first match will invalidate the second. You could achieve it using two steps and lookaround. What tool are you doing this in?

First, match the b's that are surrounded by a's:

s/(?<=a)b(?=a)/~c~/g

This matches the b's and replaces them; you can then use another step to remove the surrounding a's:

s/(~a~|a~|~a)//g

This is an example using Perl-like syntax; the ~ characters are inserted just to mark the a's that should be removed in the second step.
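
A sketch of the two-step idea in Python, under two stated assumptions: the first step rewrites each "b" that sits between two "a"s to "~c~", and "~" never occurs in the input. The function name two_step is hypothetical.

```python
import re


def two_step(text: str) -> str:
    # Assumes "~" never occurs in the input.
    # Step 1: turn each "b" that sits between two "a"s into "~c~".
    marked = re.sub(r"(?<=a)b(?=a)", "~c~", text)
    # Step 2: remove every "a" that touches a marker, plus the marker itself.
    # "~a~" is tried first so an "a" shared by two matches is removed once.
    return re.sub(r"~a~|a~|~a", "", marked)


print(two_step("ababa"))
print(two_step("abazaba"))
```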

Upvotes: 0

Amber

Reputation: 526803

I'm not sure you can do this in a single-pass regex replacement - most regex engines treat replacements as having to deal with non-overlapping matches of a pattern.

Instead, you might want to write some simple code that scans through the string and looks for overlapping occurrences, then replaces runs of occurrences with the appropriate number of repetitions of the replacement, before moving on to the next run.
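
A minimal sketch of such a scanner in Python, hard-coded for the "aba" to "c" case from the question (the language and the function name replace_runs are assumptions for illustration). It counts the overlapping occurrences in each run and emits that many replacements before moving on.

```python
def replace_runs(text: str) -> str:
    """Replace each overlapping occurrence of "aba" with one "c"."""
    out = []
    i = 0
    while i < len(text):
        if text.startswith("aba", i):
            count = 0
            # Each occurrence in a run shares its last "a" with the next
            # one, so step forward by two characters per occurrence.
            while text.startswith("aba", i):
                count += 1
                i += 2
            i += 1  # consume the final "a" that closes the run
            out.append("c" * count)
        else:
            out.append(text[i])
            i += 1
    return "".join(out)


print(replace_runs("ababa"))
print(replace_runs("abazaba"))
```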

Upvotes: 0

Antony Hatchkins

Reputation: 33994

ab(?=a) uses (?=a), a zero-width positive lookahead assertion: it requires an "a" to follow the "ab" but does not consume it.
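
To see what "zero-width" means in practice, this small Python sketch (Python is used only for illustration) compares the spans matched with and without the lookahead on the string from the question.

```python
import re

# Without a lookahead the trailing "a" is consumed, so the second,
# overlapping occurrence is never found.
plain = [m.span() for m in re.finditer(r"aba", "ababa")]

# With ab(?=a) the "a" is checked but not consumed, so it can begin
# the next match.
lookahead = [m.span() for m in re.finditer(r"ab(?=a)", "ababa")]

print(plain)
print(lookahead)
```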

Upvotes: 0
