Vinothkumar K

Reputation: 11

Regex pattern for finding repeated character patterns

Given the string

zf3kabxcde224lkzf3mabxc51+crsdtzf3nab=

and a specified pattern length of 3, the method should return the pattern abx with an occurrence count of two and zf3 with an occurrence count of three.

Upvotes: 0

Views: 150

Answers (4)

Vinothkumar K

Reputation: 11

        var content = "zf3kabxcde224lkzf3mabxc51+crsdtzf3nab=";
        var patternLength = 3;
        var patterns = new HashSet<string>();   // substrings already reported

        for (int i = 0; i < content.Length - patternLength + 1; i++)
        {
            var pattern = content.Substring(i, patternLength);
            // Regex.Escape handles every metacharacter (+, *, ?, ., ...), not just "+".
            var occurrence = Regex.Matches(content, Regex.Escape(pattern)).Count;
            if (occurrence > 1 && !patterns.Contains(pattern))
            {
                Console.WriteLine(pattern + " : " + occurrence);
                patterns.Add(pattern);
            }
        }
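
One caveat with counting via Regex.Matches: it returns non-overlapping matches, so a substring that overlaps itself can be undercounted compared with a sliding window. Below is a minimal sketch of the difference. The class and method names (OverlapDemo, CountNonOverlapping, CountOverlapping) are hypothetical, and the lookahead variant is only one possible way to count overlapping occurrences.

```csharp
using System.Text.RegularExpressions;

// Hypothetical helper class, for illustration only.
public static class OverlapDemo
{
    // Counts matches the way Regex.Matches does by default: after each match,
    // the search resumes at the end of that match, so matches never overlap.
    public static int CountNonOverlapping(string content, string pattern)
    {
        return Regex.Matches(content, Regex.Escape(pattern)).Count;
    }

    // Wraps the escaped pattern in a zero-width lookahead so that every start
    // position is tested, which also counts overlapping occurrences.
    public static int CountOverlapping(string content, string pattern)
    {
        return Regex.Matches(content, "(?=" + Regex.Escape(pattern) + ")").Count;
    }
}
```

For example, CountNonOverlapping("aaaa", "aaa") gives 1, while CountOverlapping("aaaa", "aaa") gives 2. For the repeated patterns in the question (zf3, abx, bxc) both methods agree, since none of them overlaps itself.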

Upvotes: 0

Badwi Finianos

Reputation: 107

Simplest solution:

        var myString = "zf3kabxcde224lkzf3mabxc51+crsdtzf3nab=";
        var length = 3;

        for (int i = 0; i < myString.Length - length + 1; i++)
        {
            var substring = myString.Substring(i, length);
            // Escape metacharacters so that "+", "*", "?" etc. are matched literally.
            var occurrence = Regex.Matches(myString, Regex.Escape(substring)).Count;

            Console.WriteLine(substring + " : " + occurrence);
        }

Upvotes: 0

Dmitrii Bychenko

Reputation: 186668

I suggest using LINQ instead of regular expressions, e.g.:

  string source = @"zf3kabxcde224lkzf3mabxc51+crsdtzf3nab=";

  int size = 3;

  var result = Enumerable
    .Range(0, source.Length - size + 1)
    .GroupBy(i => source.Substring(i, size))
    .Where(chunk => chunk.Count() > 1)
    .Select(chunk => $"'{chunk.Key}' appears {chunk.Count()} times");

 Console.Write(string.Join(Environment.NewLine, result));

Outcome:

'zf3' appears 3 times
'abx' appears 2 times
'bxc' appears 2 times

Please note that there are in fact two different chunks (abx and bxc) that each appear twice.

LINQ is flexible, so you can easily write the query differently, e.g. grouping the patterns by how often they appear:

 var result = Enumerable
    .Range(0, source.Length - size + 1)
    .GroupBy(i => source.Substring(i, size))
    .Where(chunk => chunk.Count() > 1)
    .GroupBy(chunk => chunk.Count(), chunk => chunk.Key)
    .OrderBy(chunk => chunk.Key)
    .Select(chunk => $"Appears: {chunk.Key}; patterns: {string.Join(", ", chunk)}");

 Console.Write(string.Join(Environment.NewLine, result));

Outcome:

 Appears: 2; patterns: abx, bxc
 Appears: 3; patterns: zf3

Upvotes: 3

Michał Turczyn

Reputation: 37367

I don't think this is a good task for regex. I would use a dictionary and split the input string into substrings of the specified length:

var length = 3;
var str = "zf3kabxcde224lkzf3mabxc51+crsdtzf3nab=";
var occurences = new Dictionary<string, int>();
for (int i = 0; i < str.Length - length + 1; i++)
{
    var s = str.Substring(i, length);
    if (occurences.ContainsKey(s))
      occurences[s] += 1;
    else
      occurences.Add(s, 1);
}

Now you can check how many occurrences any string of length 3 has, e.g. occurences["zf3"] equals 3.
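
To get only the repeated patterns the question asks for, the dictionary can be filtered afterwards. A minimal sketch, assuming the same sliding-window counting as above; the class and method names (RepeatedPatterns, Find) are hypothetical, and TryGetValue is used here merely as a shorter alternative to the ContainsKey check:

```csharp
using System.Collections.Generic;
using System.Linq;

// Hypothetical helper class, for illustration only.
public static class RepeatedPatterns
{
    // Counts every substring of the given length with a sliding window,
    // then keeps only the substrings that occur more than once.
    public static Dictionary<string, int> Find(string str, int length)
    {
        var occurences = new Dictionary<string, int>();
        for (int i = 0; i < str.Length - length + 1; i++)
        {
            var s = str.Substring(i, length);
            occurences.TryGetValue(s, out int count); // count is 0 if s is new
            occurences[s] = count + 1;
        }

        return occurences
            .Where(pair => pair.Value > 1)
            .ToDictionary(pair => pair.Key, pair => pair.Value);
    }
}
```

Calling RepeatedPatterns.Find("zf3kabxcde224lkzf3mabxc51+crsdtzf3nab=", 3) should leave three entries: zf3 with 3, and abx and bxc with 2 each.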

Upvotes: 2
