Reputation: 1068
Is there a logical way to find duplicate substrings in a string that would work regardless of how many times the substring is duplicated; then return the amount of times it was duplicated with the final word.
For example, abc-abc-abc-
Can be broken into | abc- | abc- | abc-
= abc- x3
Example 2, abc-abc-abc-abc-
Can be broken into | abc- | abc- | abc- | abc-
= abc- x4
For even repetitions, it is not a problem to split the string in half and then compare both substrings. Then you can keep looping until the halves do not match.
For odd string length, you can break it into 3rds and compare all three parts and do the same thing.
The problem happens when a word repeats 7 or 11 times. Dividing the length by 4 or 5 could work.
So for example, a string with yesnoyesnoyesnoyesnoyesnoyesnoyesno
has the substring yesno
repeating 7 times.
Is there some sort of a formula, regex, or linq that can transform yesnoyesnoyesnoyesnoyesnoyesnoyesno
to yesno (x7)
?
Upvotes: 0
Views: 1619
Reputation: 7948
this worked for me (..*?)(?=\1(?=\1*$)|$)
demo
or even shorter (..*?)(?=\1*$)
demo
(..*?)
capture at least one character and add as many as needed(?=\1*$)
lookahead for the previously captured results as many times as possible to the end.Upvotes: 4
Reputation: 3188
var list=new string[]{"abc-abc-abc-",
"abc-abc-abc-abc-",
"yesnoyesnoyesnoyesnoyesnoyesnoyesno"};
var reg=new Regex("(.+?)(?=\\1|$)");
foreach(var str in list)
{
string result=string.Format("{0} (x{1})",reg.Match(str).Value, reg.Matches(str).Count);
Console.WriteLine(result);
}
OUTPUT:
abc- (x3)
abc- (x4)
yesno (x7)
Upvotes: 1