Reputation: 2245
There are many posts like this, and I have found a few solutions, but they are not perfect. One of them:
"aabbhahahahahahahahahahahasetsetset".replace(/[^\w\s]|(.+)\1+/gi, '$1')
The result is:
abhahahahahahaset
I want to get result:
abhaset
How can I do this?
Upvotes: 2
Views: 118
Reputation: 44259
.+ is greedy: it takes as much as it can. Here that is half of the run of ha's, so that \1 can match the second half. Making the repetition ungreedy should do the trick:
/[^\w\s]|(.+?)\1+/gi
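By the way, the i flag doesn't change anything for this input, which is all lowercase. It would only matter for mixed-case repeats such as aA, because the backreference honors the flag too.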
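As a quick check of the two points above, here is a minimal sketch that runs the lazy pattern on the string from the question and then shows when the i flag would make a difference. The variable names are only illustrative.

```javascript
// Lazy pattern from the answer, applied to the question's input.
const lazy = /[^\w\s]|(.+?)\1+/g;
const collapsed = "aabbhahahahahahahahahahahasetsetset".replace(lazy, "$1");
console.log(collapsed); // "abhaset"

// The i flag also applies to the backreference, so it only matters
// when a repeat differs in case from the captured unit.
const mixedWithFlag = "aA".replace(/(.+?)\1+/gi, "$1");
const mixedWithoutFlag = "aA".replace(/(.+?)\1+/g, "$1");
console.log(mixedWithFlag);    // "a"
console.log(mixedWithoutFlag); // "aA"
```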
To get rid of nested repetitions (e.g. transform aaBBaaBB into aB, via aaBB or aBaB), simply run the replacement multiple times until the result does not change any more.
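// Lazy (.+?) captures the shortest repeating unit; \1+ consumes its repeats.
var pattern = /[^\w\s]|(.+?)\1+/g;
var output = "aaBBaaBB";
var input;
// Repeat until a pass changes nothing.
do {
    input = output;
    output = input.replace(pattern, "$1");
} while (input !== output);
// output is now "aB"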
I admit the naming of output is a bit awkward for the first iteration, but you know... the two most difficult problems in computer science are cache invalidation, naming things and off-by-one errors.
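If the loop is needed in more than one place, it could be wrapped in a small function. This is only a sketch: collapseRepeats is a hypothetical name, not something from the question, and it keeps the [^\w\s] branch, so it also strips punctuation.

```javascript
// Hypothetical helper: reapplies the replacement until the string stops changing.
function collapseRepeats(text) {
  const pattern = /[^\w\s]|(.+?)\1+/g;
  let previous;
  let current = text;
  do {
    previous = current;
    current = previous.replace(pattern, "$1");
  } while (current !== previous);
  return current;
}

console.log(collapseRepeats("aaBBaaBB")); // "aB"
```

Every match removes at least one character, so the string gets strictly shorter on each pass that changes it and the loop always terminates.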
Upvotes: 4
Reputation: 191729
.+ will match the maximum amount possible, so hahahaha satisfies (.+)\1 with haha and haha. You want to match the minimum amount possible, so use a reluctant quantifier.
"aabbhahahahahahahahahahahasetsetset".replace(/[^\w\s]|(.+?)\1+/gi, '$1')
Upvotes: 2