Doc

Reputation: 5266

Regex recursive substitutions

I have three kinds of input:

{{test_data}}
{{!test_data}}
{{test_data1&&!test_data2}} // test_data2 might not have the !

and I need to translate those strings into:

mystring.test_data
!mystring.test_data
mystring.test_data1 && !mystring.test_data2

I'm fiddling around with the super-useful regex101.com, and I managed to cover almost all three cases with Regex.Replace(str, "{{2}(?:(!?)(\w*)(\|{2}|&{2})?)}{2}", "$1mystring.$2 $3");

I can't figure out how to use regex recursion to re-apply the (?: ) part until the closing }} and then join all the matches together using the specified substitution pattern.

Is that even possible?


Edit: here's the regex101 page: https://regex101.com/r/vIBVkQ/2

Upvotes: 3

Views: 2989

Answers (3)

Wiktor Stribiżew

Reputation: 626738

I would advise using a more generic solution built from smaller regexps that are easier to read and maintain. The longest one finds the substrings you need, a simple \w+ pattern adds the mystring. prefix, and a third one adds spaces around the logical operators. The two smaller regexps are used inside a match evaluator to manipulate the values found by the longest regex:

Regex.Replace(input, @"{{(!?\w+(?:\s*(?:&&|\|\|)\s*!?\w+)*)}}", m =>
    Regex.Replace(
        // Group 1 is the text between the braces, so {{ and }} are dropped from the result
        Regex.Replace(m.Groups[1].Value, @"\s*(&&|\|\|)\s*", " $1 "),
        @"\w+",
        "mystring.$&"
    )
)

The main regex matches:

  • {{ - a {{ substring
  • !? - an optional ! sign
  • \w+ - 1 or more word chars
  • (?:\s*(?:&&|\|\|)\s*!?\w+)* - 0+ sequences of:
    • \s* - 0+ whitespace chars
    • (?:&&|\|\|) - a && or || substring
    • \s* - 0+ whitespaces
    • !? - an optional !
    • \w+ - 1 or more word chars
  • }} - a }} substring.
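
As a rough end-to-end sketch of this approach, the program below applies it to the three inputs from the question. The class name TemplateTranslator and the method name Translate are made up for illustration. It also assumes that the surrounding {{ and }} should be dropped, as in the question's expected output, so the outer pattern captures the inner text in group 1:

```csharp
using System;
using System.Text.RegularExpressions;

// Hypothetical helper; the names are illustrative, not from the answer.
public static class TemplateTranslator
{
    // Outer pattern: one whole {{...}} expression.
    // Group 1 holds the text between the braces (assumption: braces are dropped).
    private static readonly Regex Expression =
        new Regex(@"{{(!?\w+(?:\s*(?:&&|\|\|)\s*!?\w+)*)}}");

    public static string Translate(string input)
    {
        return Expression.Replace(input, m =>
        {
            // Step 1: normalize spacing around && and ||.
            string spaced = Regex.Replace(m.Groups[1].Value, @"\s*(&&|\|\|)\s*", " $1 ");
            // Step 2: prefix every identifier; $& stands for the whole match.
            return Regex.Replace(spaced, @"\w+", "mystring.$&");
        });
    }

    public static void Main()
    {
        string[] inputs = { "{{test_data}}", "{{!test_data}}", "{{test_data1&&!test_data2}}" };
        foreach (string s in inputs)
        {
            Console.WriteLine(Translate(s));
        }
        // Prints:
        // mystring.test_data
        // !mystring.test_data
        // mystring.test_data1 && !mystring.test_data2
    }
}
```

Text that does not match the outer pattern is left untouched, so the method can also be applied to a larger string containing several {{...}} expressions.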

Upvotes: 1

grek40

Reputation: 13438

I don't think you can use recursion here, but if you restructure the pattern you can use a repeated group and read its individual captures. Note that I used named groups to limit the confusion in this example:

var test = @"{{test_data}}
{{!test_data}}
{{test_data1&&!test_data2&&test_data3}}
{{test_data1&&!test_data2 fail test_data3}}
{{test_data1&&test_data2||!test_data3}}";

// (1:!)(2:word)(3:||&&)(4:repeat)
var matches = Regex.Matches(test, @"\{{2}(?:(?<exc>!?)(?<word>\w+))(?:(?<op>\|{2}|&{2})(?<exc2>!?)(?<word2>\w+))*}{2}");

foreach (Match match in matches)
{
    Console.WriteLine("Match: {0}", match.Value);
    Console.WriteLine("  exc: {0}", match.Groups["exc"].Value);
    Console.WriteLine(" word: {0}", match.Groups["word"].Value);
    for (int i = 0; i < match.Groups["op"].Captures.Count; i++)
    {
        Console.WriteLine("   op: {0}", match.Groups["op"].Captures[i].Value);
        Console.WriteLine(" exc2: {0}", match.Groups["exc2"].Captures[i].Value);
        Console.WriteLine("word2: {0}", match.Groups["word2"].Captures[i].Value);
    }
}

The idea is to read the first word of each {{...}} expression unconditionally and then read zero or more combinations of (|| or &&)(optional !)(word). Each repetition is stored as a separate capture of the op, exc2 and word2 groups.

Example output:

Match: {{test_data}}
  exc:
 word: test_data
Match: {{!test_data}}
  exc: !
 word: test_data
Match: {{test_data1&&!test_data2&&test_data3}}
  exc:
 word: test_data1
   op: &&
 exc2: !
word2: test_data2
   op: &&
 exc2:
word2: test_data3
Match: {{test_data1&&test_data2||!test_data3}}
  exc:
 word: test_data1
   op: &&
 exc2:
word2: test_data2
   op: ||
 exc2: !
word2: test_data3

Note that the line {{test_data1&&!test_data2 fail test_data3}} is not among the matches because it doesn't comply with the syntax rules.

So you can build your desired result the same way from the matches structure:

foreach (Match match in matches)
{
    var sb = new StringBuilder();
    sb.Append(match.Groups["exc"].Value).Append("mystring.").Append(match.Groups["word"].Value);

    for (int i = 0; i < match.Groups["op"].Captures.Count; i++)
    {
        sb.Append(' ').Append(match.Groups["op"].Captures[i].Value).Append(' ');
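        // Use the i-th capture of each repeated group; Group.Value would only give the last one.
        sb.Append(match.Groups["exc2"].Captures[i].Value).Append("mystring.").Append(match.Groups["word2"].Captures[i].Value);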
    }
    Console.WriteLine("Result: {0}", sb.ToString());
}
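
If the translated text should replace the original expressions in place, the same capture-based logic can be moved into a match evaluator. The following is only a sketch under that assumption; the class name CaptureTranslator and the method name Translate are hypothetical, and the pattern is the one from above with the braces escaped:

```csharp
using System;
using System.Text;
using System.Text.RegularExpressions;

// Hypothetical helper; the names are illustrative only.
public static class CaptureTranslator
{
    private static readonly Regex Expression = new Regex(
        @"\{{2}(?<exc>!?)(?<word>\w+)(?:(?<op>\|{2}|&{2})(?<exc2>!?)(?<word2>\w+))*\}{2}");

    public static string Translate(string input)
    {
        return Expression.Replace(input, match =>
        {
            var sb = new StringBuilder();
            // The first operand is always present.
            sb.Append(match.Groups["exc"].Value)
              .Append("mystring.")
              .Append(match.Groups["word"].Value);

            // Each repetition of the optional part adds one capture to op, exc2 and word2.
            CaptureCollection ops = match.Groups["op"].Captures;
            for (int i = 0; i < ops.Count; i++)
            {
                sb.Append(' ').Append(ops[i].Value).Append(' ');
                sb.Append(match.Groups["exc2"].Captures[i].Value)
                  .Append("mystring.")
                  .Append(match.Groups["word2"].Captures[i].Value);
            }
            return sb.ToString();
        });
    }

    public static void Main()
    {
        Console.WriteLine(Translate("{{test_data1&&test_data2||!test_data3}}"));
        // An expression that breaks the syntax rules is left unchanged.
        Console.WriteLine(Translate("{{test_data1&&!test_data2 fail test_data3}}"));
    }
}
```

Because exc2 matches an empty string when there is no !, it still records one capture per repetition, which keeps the three capture lists aligned by index.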

Upvotes: 0

Srdjan M.

Reputation: 3405

Regex: (?:{{2}|[^|]{2}|[^&]{2})\!?(\w+)(?:}{2})?

C# code:

List<string> list = new List<string>() { "{{test_data}}", "{{!test_data}}", "{{test_data1&&!test_data2}}" };

foreach(string s in list)
{
    string t = Regex.Replace(s, @"(?:{{2}|[^|]{2}|[^&]{2})\!?(\w+)(?:}{2})?",
           o => o.Value.Contains("!") ? "!mystring." + o.Groups[1].Value : "mystring." + o.Groups[1].Value);

    Console.WriteLine(t);
}
Console.ReadLine();

Output:

mystring.test_data
!mystring.test_data
mystring.test_data1&&!mystring.test_data2

Upvotes: 0
