Sting1

Reputation: 41

Regex replacement in a custom tag

I have a string that may contain one or more of the following tags:

<CHOICE [some words] [other words]>

In C#, I need to replace every occurrence of this tag as follows:

Example: I like <CHOICE [cars and bikes] [apple and oranges]>
Result: I like cars and bikes
Example: I like <CHOICE [cars and bikes] [apple and oranges]>, I also like <CHOICE [pizza] [pasta]>
Result: I like cars and bikes, I also like pizza

Basically, replace the entire tag with only the string appearing in the first set of brackets.

It looks like capture groups are the way to go, but I couldn't work out how to use them.

Any help is appreciated!

EDIT: Regex is not a requirement. I thought it would be the best approach, but some comments say it isn't needed, so any other suggestion is just as welcome. Thanks!

Upvotes: 1

Views: 337

Answers (4)

Sushant Yelpale

Reputation: 889

First find every <CHOICE ...> tag, then replace each tag with the first string it contains between [ and ].

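// Requires: using System.Text.RegularExpressions;
// Find every <CHOICE ...> tag in the input.
MatchCollection matches = Regex.Matches(InputStr, @"<CHOICE(.*?)>");

foreach (Match item in matches)
{
    // Collect the bracketed choices in this tag; group 1 is the text between [ and ].
    MatchCollection choices = Regex.Matches(item.Value, @"\[(.+?)\]");
    string firstChoice = choices[0].Groups[1].Value;
    // Replace the whole tag with its first choice.
    InputStr = InputStr.Replace(item.Value, firstChoice);
}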
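The two steps above can probably be folded into a single Regex.Replace call that captures the first bracketed string and substitutes it with "$1". This is only a minimal sketch: the class and method names (ChoiceTags, KeepFirst) are made up for illustration, and it assumes a choice never contains "]" and a tag never contains ">" before its end.

```csharp
using System;
using System.Text.RegularExpressions;

public static class ChoiceTags
{
    // "<CHOICE", optional whitespace, the first [...] captured as group 1,
    // then everything up to the closing '>'.
    // Assumption: no ']' inside a choice and no '>' inside the tag.
    private static readonly Regex Tag =
        new Regex(@"<CHOICE\s*\[([^\]]*)\][^>]*>");

    public static string KeepFirst(string input)
    {
        // "$1" stands for the text captured by group 1,
        // so each tag is replaced by its own first choice.
        return Tag.Replace(input, "$1");
    }
}
```

Calling ChoiceTags.KeepFirst on the second example from the question should give "I like cars and bikes, I also like pizza".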


Upvotes: 1

Arthur Grigoryan

Reputation: 467

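One way to do this is a single Regex.Replace with a match evaluator. The pattern tolerates extra whitespace inside the tag. Note that the evaluator below also joins the words of the first choice with " and ", which goes beyond what the question asks for.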

string text = "This is some dummy text with the choice <    CHOICE     [ white   black green     cyan ] [yellow green]>." +
            " The second choice <CHOICE [pink brown red] [blue cyan]>.";
string pattern = @"<\s*?CHOICE\s*\[\s*?(.+?)\s*?\].*?>";
var result = Regex.Replace(text, pattern, r => String.Join(" and ", r.Groups[1].Value.Split(' ', StringSplitOptions.RemoveEmptyEntries)));
Console.WriteLine(result);

Output

This is some dummy text with the choice white and black and green and cyan. The second choice pink and brown and red.
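If the " and " joining is not wanted, a variation could keep the first choice as written and only tidy its whitespace. The sketch below is an assumption about what that might look like, not part of the answer above; ChoiceNormalizer and KeepFirst are illustrative names.

```csharp
using System;
using System.Text.RegularExpressions;

public static class ChoiceNormalizer
{
    // Same idea as the pattern above: optional whitespace around CHOICE,
    // the first [...] captured lazily as group 1, then the rest of the tag.
    private static readonly Regex Tag =
        new Regex(@"<\s*CHOICE\s*\[(.+?)\].*?>");

    public static string KeepFirst(string text)
    {
        // For each tag: trim the first choice and collapse runs of
        // whitespace into a single space, without inserting " and ".
        return Tag.Replace(text, m =>
            Regex.Replace(m.Groups[1].Value.Trim(), @"\s+", " "));
    }
}
```

With the dummy text above, the first tag would then become "white black green cyan" rather than "white and black and green and cyan".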

Upvotes: 0

Simon

Reputation: 667

string pattern = @"\< *CHOICE *((\[(?<choice>[a-zA-Z0-9 ]+)\]) *)+ *>";

Regex regex = new Regex(pattern);

string source = "I like <CHOICE [cars and bikes] [apple and oranges]>";

var match = regex.Match(source);
if (match.Success)
{
    for (int i = 0; i < match.Groups["choice"].Captures.Count; i++)
    {

        Debug.WriteLine(match.Groups["choice"].Captures[i]);
    }
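    // Use a match evaluator so each tag is replaced by its own first choice,
    // not by the first choice of the first tag found.
    string replaced = regex.Replace(source, m => m.Groups["choice"].Captures[0].Value);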

    Debug.WriteLine(replaced);
}

The output is:
cars and bikes
apple and oranges
I like cars and bikes

\< *CHOICE *

matches "<" "zero or more spaces" "CHOICE" "zero or more spaces"

([a-zA-Z0-9 ]+)

matches one or more letters, digits, or spaces

?<choice>

gives the group above the name "choice"

\[(?<choice>[a-zA-Z0-9 ]+)\]

matches one choice in []

((\[(?<choice>[a-zA-Z0-9 ]+)\]) *)

matches one choice followed by zero or more spaces

+

means there must be at least one choice

 *>

allows zero or more spaces before the closing ">"
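To see how the repeated named group behaves when a string holds more than one tag, here is a small sketch built on the same pattern. The class and method names (ChoiceCaptures, ListChoices, KeepFirst) are invented for this example; it relies on .NET keeping every repetition of a group in its Captures collection.

```csharp
using System;
using System.Collections.Generic;
using System.Text.RegularExpressions;

public static class ChoiceCaptures
{
    // Same pattern as in the answer above.
    private static readonly Regex Tag =
        new Regex(@"\< *CHOICE *((\[(?<choice>[a-zA-Z0-9 ]+)\]) *)+ *>");

    // Returns one list per tag, holding that tag's choices in order.
    public static List<List<string>> ListChoices(string source)
    {
        var all = new List<List<string>>();
        foreach (Match m in Tag.Matches(source))
        {
            var choices = new List<string>();
            // Each repetition of the "choice" group is stored as a Capture.
            foreach (Capture c in m.Groups["choice"].Captures)
                choices.Add(c.Value);
            all.Add(choices);
        }
        return all;
    }

    // Replaces every tag with its own first choice.
    public static string KeepFirst(string source)
    {
        return Tag.Replace(source, m => m.Groups["choice"].Captures[0].Value);
    }
}
```

For the two-tag example from the question, ListChoices should return two lists (cars and bikes / apple and oranges, then pizza / pasta), and KeepFirst should return "I like cars and bikes, I also like pizza".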

Upvotes: 0

TheGeneral

Reputation: 81493

Just for fun. Here is a school-yard foreach state-machine, with a linear O(n) time complexity.

var line = "I like <CHOICE [cars and bikes] [apple and oranges]>";

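var result = new StringBuilder(); // requires: using System.Text;
var state = 0;

// States: 0 = outside a tag, 1 = inside a tag before the first '[',
// 2 = inside the first [...] (the text to keep), 3 = after the first ']' until '>'.
foreach (char c in line)
{
   if (state == 0 && c == '<') state = 1;
   else if (state == 1 && c == '[') state = 2;
   else if (state == 2 && c == ']') state = 3;
   else if (state == 3 && c == '>') state = 0;
   else if (state == 0 || state == 2) result.Append(c); // copy only the text we keep
}

Console.WriteLine(result);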

Output

I like cars and bikes
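One caveat with the loop above is that any "<" in ordinary text is treated as the start of a tag. A regex-free alternative that only reacts to the literal "<CHOICE" could look roughly like the sketch below. ChoiceScanner and KeepFirst are hypothetical names, and the handling of malformed tags (copy the rest unchanged) is just one possible choice.

```csharp
using System;
using System.Text;

public static class ChoiceScanner
{
    private const string Open = "<CHOICE";

    public static string KeepFirst(string input)
    {
        var result = new StringBuilder();
        int pos = 0; // start of the text not yet copied

        while (true)
        {
            int start = input.IndexOf(Open, pos, StringComparison.Ordinal);
            if (start < 0) break;                     // no more tags

            int lb = input.IndexOf('[', start);       // first '[' of the tag
            int rb = lb < 0 ? -1 : input.IndexOf(']', lb);
            int end = rb < 0 ? -1 : input.IndexOf('>', rb);
            if (end < 0) break;                       // malformed tag: stop scanning

            result.Append(input, pos, start - pos);   // text before the tag
            result.Append(input, lb + 1, rb - lb - 1); // first choice only
            pos = end + 1;                            // continue after '>'
        }

        // Copy whatever is left, including any malformed trailing tag.
        result.Append(input, pos, input.Length - pos);
        return result.ToString();
    }
}
```

Like the loop, this runs in a single left-to-right pass, but text such as "a < b" outside a tag is copied through untouched.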


Upvotes: 3
