Reputation: 41
I have a string that may contain one or more of the following tags:
<CHOICE [some words] [other words]>
I need to replace (C#) all occurrences of this tag as follows:
Example: I like <CHOICE [cars and bikes] [apple and oranges]>
Result: I like cars and bikes
Example: I like <CHOICE [cars and bikes] [apple and oranges]>, I also like <CHOICE [pizza] [pasta]>
Result: I like cars and bikes, I also like pizza
Basically, replace the entire tag with only the string appearing in the first set of brackets.
It looks like capture groups are the way to go, but I couldn't figure out how to make them work.
Any help is appreciated!
EDIT: Regex is not a requirement, I thought it would be the best approach, but I see some comments telling me that it's not needed so any other suggestion will be just as fine. Thanks!
Upvotes: 1
Views: 337
Reputation: 889
First find every <CHOICE ...> tag, then replace each matched tag with the text between its first [ and ].
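// Find every <CHOICE ...> tag in the input.
MatchCollection matches = Regex.Matches(InputStr, @"<CHOICE(.*?)>");
foreach (Match item in matches)
{
    // Collect the bracketed choices of this tag; Groups[1] is the text without the brackets.
    MatchCollection choices = Regex.Matches(item.Value, @"\[(.+?)\]");
    string firstOccurrence = choices[0].Groups[1].Value;
    // Replace the whole tag with its first choice.
    InputStr = InputStr.Replace(item.Value, firstOccurrence);
}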
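The same idea can probably be collapsed into a single Regex.Replace call that uses a capture group directly. The following is only a sketch: the class and method names (ChoiceTags, KeepFirstChoice) are made up for illustration, and it assumes the choice text never contains "]" and the tag never contains ">" before its end.

```csharp
using System;
using System.Text.RegularExpressions;

public static class ChoiceTags
{
    // Group 1 captures the text inside the first [...] pair of each tag.
    // [^\]]* and [^>]* keep every match inside a single tag.
    private static readonly Regex Tag = new Regex(@"<CHOICE\s*\[([^\]]*)\][^>]*>");

    // Hypothetical helper: replaces each tag with its first choice.
    public static string KeepFirstChoice(string input)
    {
        // "$1" in the replacement string stands for capture group 1.
        return Tag.Replace(input, "$1");
    }
}
```

Calling ChoiceTags.KeepFirstChoice("I like <CHOICE [cars and bikes] [apple and oranges]>, I also like <CHOICE [pizza] [pasta]>") should return "I like cars and bikes, I also like pizza".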
Upvotes: 1
Reputation: 467
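I think Regex.Replace with a match evaluator is the simplest way to do this. Note that this version also tolerates extra spaces inside the tag and joins the words of the first choice with " and ".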
string text = "This is some dummy text with the choice < CHOICE [ white black green cyan ] [yellow green]>." +
" The second choice <CHOICE [pink brown red] [blue cyan]>.";
string pattern = @"<\s*?CHOICE\s*\[\s*?(.+?)\s*?\].*?>";
var result = Regex.Replace(text, pattern, r => String.Join(" and ", r.Groups[1].Value.Split(' ', StringSplitOptions.RemoveEmptyEntries)));
Console.WriteLine(result);
Output
This is some dummy text with the choice white and black and green and cyan. The second choice pink and brown and red.
Upvotes: 0
Reputation: 667
string pattern = @"\< *CHOICE *((\[(?<choice>[a-zA-Z0-9 ]+)\]) *)+ *>";
Regex regex = new Regex(pattern);
string source = "I like <CHOICE [cars and bikes] [apple and oranges]>";
var match = regex.Match(source);
if (match.Success)
{
for (int i = 0; i < match.Groups["choice"].Captures.Count; i++)
{
Debug.WriteLine(match.Groups["choice"].Captures[i]);
}
// Use an evaluator so that each tag is replaced by its own first choice, not by the first tag's.
string replaced = regex.Replace(source, m => m.Groups["choice"].Captures[0].Value);
Debug.WriteLine(replaced);
}
The output is:
cars and bikes
apple and oranges
I like cars and bikes
\< *CHOICE *
matches "<", zero or more spaces, "CHOICE", then zero or more spaces
([a-zA-Z0-9 ]+)
matches one or more letters, digits, or spaces
?<choice>
gives the group above the name "choice"
\[(?<choice>[a-zA-Z0-9 ]+)\]
matches one choice enclosed in [ ]
((\[(?<choice>[a-zA-Z0-9 ]+)\]) *)
matches one choice followed by zero or more spaces
+
means the group above repeats, so there must be at least one choice
 *>
allows zero or more spaces at the end before ">"
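As a rough illustration, the breakdown above can be written straight into the pattern with RegexOptions.IgnorePatternWhitespace, which lets a pattern span several lines and carry # comments. This is a sketch only: the names ChoicePattern, Verbose, and FirstChoice are invented here. Because this option ignores unescaped spaces in the pattern, each " *" is written as "[ ]*"; spaces inside a character class are still taken literally.

```csharp
using System.Text.RegularExpressions;

public static class ChoicePattern
{
    // Same pattern as above, laid out piece by piece.
    public static readonly Regex Verbose = new Regex(@"
        \< [ ]* CHOICE [ ]*                  # '<', optional spaces, CHOICE, optional spaces
        (                                    # one bracketed choice plus trailing spaces
          \[ (?<choice>[a-zA-Z0-9 ]+) \]     # the choice text, captured under the name 'choice'
          [ ]*
        )+                                   # at least one choice
        [ ]* >                               # optional spaces, then '>'
        ", RegexOptions.IgnorePatternWhitespace);

    // Hypothetical evaluator: returns the first captured choice of one tag.
    public static string FirstChoice(Match m)
    {
        return m.Groups["choice"].Captures[0].Value;
    }
}
```

ChoicePattern.Verbose.Replace(source, ChoicePattern.FirstChoice) then replaces every tag in source with that tag's own first choice.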
Upvotes: 0
Reputation: 81493
Just for fun, here is a school-yard foreach state machine with linear O(n) time complexity.
var line = "I like <CHOICE [cars and bikes] [apple and oranges]>";
var result = new StringBuilder();
var state = 0;
foreach (char c in line)
{
if (state == 0 && c == '<') state = 1;
else if (state == 1 && c == '[') state = 2;
else if (state == 2 && c == ']') state = 3;
else if (state == 3 && c == '>') state = 0;
else if (state == 0 || state == 2) result.Append(c);
}
Console.WriteLine(result);
Output
I like cars and bikes
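Since the state returns to 0 after each ">", the same loop should also handle several tags in one line. Here is a sketch that wraps it in a method; the names ChoiceStateMachine and KeepFirstChoice are invented for this example, and it assumes "<" and ">" never appear in the text outside a tag.

```csharp
using System.Text;

public static class ChoiceStateMachine
{
    // States: 0 = outside a tag, 1 = inside a tag before the first '[',
    //         2 = inside the first [...], 3 = rest of the tag up to '>'.
    public static string KeepFirstChoice(string line)
    {
        var result = new StringBuilder();
        var state = 0;
        foreach (char c in line)
        {
            if (state == 0 && c == '<') state = 1;
            else if (state == 1 && c == '[') state = 2;
            else if (state == 2 && c == ']') state = 3;
            else if (state == 3 && c == '>') state = 0;
            // Keep characters outside tags and inside the first choice only.
            else if (state == 0 || state == 2) result.Append(c);
        }
        return result.ToString();
    }
}
```

For the second example in the question, ChoiceStateMachine.KeepFirstChoice("I like <CHOICE [cars and bikes] [apple and oranges]>, I also like <CHOICE [pizza] [pasta]>") returns "I like cars and bikes, I also like pizza".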
Upvotes: 3