Omkar

Reputation: 392

Taking selected groups in regex

I am trying to write a regex that captures the string 3oranges as a single group and produces no more than one match for it. Given the string 3oranges2apples4bananas, it should produce 3 separate groups, one per item.

My current regex matches the input the way I want, but it returns two groups per item instead of one: for 3oranges it gives both 3oranges and oranges.

Here is the regex I wrote. I wrote it this way because I want to enforce the order in which the items appear, and I want to limit the number of digits I have to process later on.

^(\d{1,4}(orange)){0,1}(\d{1,4}(apple)){0,1}(\d{1,4}(banana)){0,1}$

As I said above, I want only a single group for each matched item.

INPUT: 
3oranges2apples4bananas

OUTPUT: Matches found
3oranges
oranges
2apples
apples
4bananas
bananas

DESIRED OUTPUT:
3oranges
2apples
4bananas

Is what I am asking for possible, and if yes, how can I achieve this?

EDIT 1: I have a follow-up requirement that I should have included from the start. I also want the text to count as oranges whether it is o, orange or oranges (and likewise for apples and bananas).
Something like this:

^(\d{1,4}(oranges|orange|o)){0,1}(\d{1,4}(apples|apple|a)){0,1}(\d{1,4}(bananas|banana|b)){0,1}$

Upvotes: 1

Views: 61

Answers (1)

Wiktor Stribiżew

Reputation: 626738

You may use

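using System;
using System.Linq;
using System.Text.RegularExpressions;

var s = "3oranges2apples4bananas";
var ms = Regex.Match(s, @"^(\d{1,4}o(?:ranges?)?)?(\d{1,4}a(?:pples?)?)?(\d{1,4}b(?:ananas?)?)?$");
// Group 0 is the whole match, so Skip(1) leaves only the three capturing groups
var results = ms.Groups.Cast<Group>().Select(y => y.Value).Skip(1);
Console.WriteLine(string.Join(", ", results));
// => 3oranges, 2apples, 4bananas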

See the C# demo and the regex demo.

Pattern details

  • ^ - start of string
  • (\d{1,4}o(?:ranges?)?)? - Group 1 (optional): 1 to 4 digits, then o optionally followed by range or ranges, so it matches o, orange or oranges
  • (\d{1,4}a(?:pples?)?)? - Group 2 (optional): 1 to 4 digits, then a optionally followed by pple or pples, so it matches a, apple or apples
  • (\d{1,4}b(?:ananas?)?)? - Group 3 (optional): 1 to 4 digits, then b optionally followed by anana or ananas, so it matches b, banana or bananas
  • $ - end of string.
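To make the pattern details above concrete, here is a minimal sketch that runs the same pattern against a few inputs. IsValid is a hypothetical helper name used only for illustration. It shows that the short forms are accepted, that the order is enforced, and that more than 4 digits are rejected.

```csharp
using System;
using System.Text.RegularExpressions;

// Same pattern as in the answer above.
const string Pattern = @"^(\d{1,4}o(?:ranges?)?)?(\d{1,4}a(?:pples?)?)?(\d{1,4}b(?:ananas?)?)?$";

// Hypothetical helper: true when the whole input fits the pattern.
bool IsValid(string input) => Regex.IsMatch(input, Pattern);

Console.WriteLine(IsValid("3o2a4b"));          // True: single-letter forms
Console.WriteLine(IsValid("3orange"));         // True: the other groups are optional
Console.WriteLine(IsValid("2apples3oranges")); // False: wrong order
Console.WriteLine(IsValid("12345o"));          // False: more than 4 digits
Console.WriteLine(IsValid(""));                // True: every group is optional
```

Note the last case: since all three groups are optional, the empty string also matches. If that is not acceptable, check for an empty input separately.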

With ms.Groups.Cast<Group>().Select(y => y.Value).Skip(1), we get rid of the whole match in the results, and only get the captured substrings.
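One thing to watch for: because every group is optional, a group that did not take part in the match still appears in Groups with an empty Value. If you only want the items that were actually present, you could filter on Group.Success. A minimal sketch, where CapturedItems is a hypothetical helper name:

```csharp
using System;
using System.Linq;
using System.Text.RegularExpressions;

// Same pattern as in the answer above.
const string FruitPattern = @"^(\d{1,4}o(?:ranges?)?)?(\d{1,4}a(?:pples?)?)?(\d{1,4}b(?:ananas?)?)?$";

// Hypothetical helper: returns only the groups that took part in the match.
string[] CapturedItems(string input)
{
    var m = Regex.Match(input, FruitPattern);
    return m.Groups.Cast<Group>()
        .Skip(1)                // drop group 0, the whole match
        .Where(g => g.Success)  // drop optional groups that matched nothing
        .Select(g => g.Value)
        .ToArray();
}

Console.WriteLine(string.Join(", ", CapturedItems("3oranges4bananas")));
// => 3oranges, 4bananas
```

Without the Where(g => g.Success) step, the same input would print 3oranges, , 4bananas, with an empty entry for the missing apples group.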

NOTE If each slot can hold unrelated words (not just variants of one word such as o, orange, oranges), you may use alternation inside the non-capturing group instead:

@"^(\d{1,4}(?:oranges?|tangerines?))?(\d{1,4}(?:apples?|pears?))?(\d{1,4}(?:bananas?|peach(?:es)?))?$"
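A quick sketch of how the alternation variant behaves, assuming the same group-extraction approach as above (Describe is a hypothetical helper name). Note that this variant lists full words only, so the single-letter forms no longer match unless you add them as alternatives.

```csharp
using System;
using System.Linq;
using System.Text.RegularExpressions;

// The alternation pattern from the note above.
const string AltPattern = @"^(\d{1,4}(?:oranges?|tangerines?))?(\d{1,4}(?:apples?|pears?))?(\d{1,4}(?:bananas?|peach(?:es)?))?$";

// Hypothetical helper: joins the captured items, or reports a failed match.
string Describe(string input)
{
    var m = Regex.Match(input, AltPattern);
    if (!m.Success) return "no match";
    var items = m.Groups.Cast<Group>().Skip(1).Where(g => g.Success).Select(g => g.Value);
    return string.Join(", ", items);
}

Console.WriteLine(Describe("3tangerines2pears4peaches")); // 3tangerines, 2pears, 4peaches
Console.WriteLine(Describe("3oranges4peach"));            // 3oranges, 4peach
Console.WriteLine(Describe("3o2a4b"));                    // no match
```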

Upvotes: 3
