Reputation: 14276
If I want to find all the text inside brackets in a string using a regex, I would have something like this:
string text = "[the] [quick] brown [fox] jumps over [the] lazy dog";
Regex regex = new Regex(@"\[([^]]+)\]");
MatchCollection matches = regex.Matches(text);
foreach (Match match in matches)
{
    ... // Here is my problem!
}
I am not sure how to continue from here. If I just iterate through all the matches, I get "the", "quick", "fox" and "the" as four separate matches. I was expecting the two "the" occurrences to be grouped in the same Match.Groups collection, just at different indexes.
What I really want is to have the two "the" occurrences grouped so that I can find every occurrence of the same word along with its index.
I was hoping the API would give me something like this:
foreach (Match match in matches)
{
    for (int i = 1; i < match.Groups.Count; i++)
    {
        StartIndexesList.Add(match.Groups[i].Index);
    }
}
Here I expected each entry in match.Groups to refer to every occurrence of a given token in the text, so that this code would add the indexes of both "the" occurrences to the list at once. It doesn't: each Match covers a single occurrence, so the indexes are added one occurrence at a time.
How can I achieve this without post-processing all the tokens to see whether any are repeated?
Upvotes: 0
Views: 38
Reputation: 6374
Is this what you are looking for?
string text = "[the] [quick] brown [fox] jumps over [the] lazy dog";
Regex regex = new Regex(@"\[([^]]+)\]");
MatchCollection matches = regex.Matches(text);
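// Requires: using System.Linq; (for Cast and GroupBy)
foreach (IGrouping<string, Match> group in matches.Cast<Match>().GroupBy(m => m.Value))
{
    Console.WriteLine(group.Key); // Prints the matched text, brackets included, e.g. '[the]'
    foreach (Match match in group) // Iterates through every match with that text, e.g. both '[the]'
    {
        // do your stuff, e.g. read match.Index
    }
}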
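If what you need in the end is each word mapped to the positions where it occurs, the same grouping idea can be pushed one step further. The following is only a minimal sketch under a few assumptions: the class name BracketIndex and the helper IndexByWord are made up for illustration, the key is taken from capture group 1 (so it is "the" rather than "[the]"), and the stored index is the start of the word inside the brackets rather than the position of the bracket itself.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;
using System.Text.RegularExpressions;

public static class BracketIndex
{
    // Hypothetical helper (not part of the Regex API): maps each bracketed
    // word to the start indexes of all its occurrences in the input.
    public static Dictionary<string, List<int>> IndexByWord(string text)
    {
        var regex = new Regex(@"\[([^]]+)\]");
        return regex.Matches(text)
            .Cast<Match>()
            // Groups[1] is the capture inside the brackets, so the key has no brackets.
            .GroupBy(m => m.Groups[1].Value)
            .ToDictionary(
                g => g.Key,
                g => g.Select(m => m.Groups[1].Index).ToList());
    }

    public static void Main()
    {
        string text = "[the] [quick] brown [fox] jumps over [the] lazy dog";
        foreach (var entry in IndexByWord(text))
        {
            // One line per distinct word, followed by all of its start indexes.
            Console.WriteLine($"{entry.Key}: {string.Join(", ", entry.Value)}");
        }
    }
}
```

This is still a grouping pass over the matches; as far as I know the Regex API has no built-in way to collect repeated tokens into a single Match. If you want the position of the opening bracket instead, use m.Index in place of m.Groups[1].Index.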
Upvotes: 1