Chris

Reputation: 12282

regex for optional group

I'd like to parse the following sample string

foo :6

into two groups: Text and Number. The number group should be populated only if the character ":" precedes the number itself.

so:

foo 6 -> Text = "foo 6"
foo :6 -> Text = "foo", Number = "6"

The best I could come up with so far is

(?<Text>.+)(?=:(?<Number>\d+)h?)?

but that doesn't work because the first group greedily expands to the whole string.

Any suggestions?

Upvotes: 1

Views: 112

Answers (5)

BlackBear

Reputation: 22979

If you really want to use a regex you can write quite a simple one, without lookarounds:

(?<Text>[^:]+):?(?<Number>\d*)

In my opinion, regexes should be as simple as possible. If you do not want whitespace around the Text group, I suggest calling match.Groups["Text"].Value.Trim().

Note that this pattern will not work on a multiline string because, as @OscarHermosilla mentioned below, [^:]+ also matches newlines. The fix is simple, though: change it to [^:\n]+.
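
As a rough sketch of how this could be wired up (the class and the Parse helper are hypothetical names, not part of any library), using the newline-safe variant of the pattern and Trim() for the surrounding whitespace:

```csharp
using System;
using System.Text.RegularExpressions;

public static class SimplePatternDemo
{
    // Pattern from this answer, with \n excluded so Text cannot run across lines.
    private static readonly Regex Pattern =
        new Regex(@"(?<Text>[^:\n]+):?(?<Number>\d*)");

    // Hypothetical helper: returns the trimmed text and the number ("" when absent).
    public static (string Text, string Number) Parse(string input)
    {
        Match m = Pattern.Match(input);
        return (m.Groups["Text"].Value.Trim(), m.Groups["Number"].Value);
    }

    public static void Main()
    {
        foreach (string s in new[] { "foo 6", "foo :6" })
        {
            var (text, number) = Parse(s);
            Console.WriteLine($"Text = \"{text}\", Number = \"{number}\"");
        }
    }
}
```

For "foo 6" the Number group matches the empty string, so checking for an empty Number is how you would tell the two cases apart.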

Upvotes: 5

Avinash Raj

Reputation: 174696

You don't need a separate function to strip the trailing whitespace.

The regex below captures everything into the named group Text up to, but not including, :\d+ (i.e., a colon followed by one or more digits). If the string ends with whitespace, a colon, and digits, the digits are captured into the named group Number.

^(?<Text>(?:(?!:\d+).)+(?=$|\s+:(?<Number>\d+)$))


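using System;
using System.Text.RegularExpressions;

string input = "foo 6";
string input1 = "foo :6";
Regex rgx = new Regex(@"^(?<Text>(?:(?!:\d+).)+(?=$|\s+:(?<Number>\d+)$))");

// No ":<digits>" suffix: the whole string ends up in Text.
foreach (Match m in rgx.Matches(input))
{
    Console.WriteLine(m.Groups["Text"].Value);
}

// ":<digits>" suffix present: Text stops before the whitespace, Number gets the digits.
foreach (Match m in rgx.Matches(input1))
{
    Console.WriteLine(m.Groups["Text"].Value);
    Console.WriteLine(m.Groups["Number"].Value);
}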

Output:

foo 6
foo
6


Upvotes: 2

Oscar Hermosilla

Reputation: 530

You can repeat the group name Text on both sides of an alternation, like this:

(?<Text>.+)\s+:(?<Number>\d+)|(?<Text>.+)


Based on the idea behind this post: Regex Pattern to Match, Excluding when... / Except between
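
A minimal sketch of the alternation idea in C#, assuming .NET's behaviour of letting two groups share a name (the class and the Parse helper are hypothetical). It uses \d+ for the number so that multi-digit values are captured whole:

```csharp
using System;
using System.Text.RegularExpressions;

public static class AlternationDemo
{
    // Both branches capture into the same named group Text.
    // The left branch is tried first and is the only one that fills Number.
    private static readonly Regex Pattern =
        new Regex(@"(?<Text>.+)\s+:(?<Number>\d+)|(?<Text>.+)");

    // Hypothetical helper: Number is "" when only the right branch matched.
    public static (string Text, string Number) Parse(string input)
    {
        Match m = Pattern.Match(input);
        return (m.Groups["Text"].Value, m.Groups["Number"].Value);
    }

    public static void Main()
    {
        foreach (string s in new[] { "foo 6", "foo :6", "foo :66" })
        {
            var (text, number) = Parse(s);
            Console.WriteLine($"Text = \"{text}\", Number = \"{number}\"");
        }
    }
}
```

The order of the branches matters here: if the catch-all branch came first, it would always win and Number would never be populated.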

Upvotes: 1

NeverHopeless

Reputation: 11233

You can try something like:

(\D+)(?:\:(\d+))

or do a Regex.Split using this pattern:

(\s*\:\s*)
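
One thing to be aware of with the Split approach: because the pattern is wrapped in a capturing group, Regex.Split also returns the matched delimiter. A small sketch of the difference (SplitOnColon is a hypothetical helper that drops the capturing group):

```csharp
using System;
using System.Text.RegularExpressions;

public static class SplitDemo
{
    // Hypothetical helper: splits on a colon and any whitespace around it.
    // The delimiter is not wrapped in a capturing group, so it is not returned.
    public static string[] SplitOnColon(string input)
    {
        return Regex.Split(input, @"\s*:\s*");
    }

    public static void Main()
    {
        // With the capturing group, the matched delimiter is part of the result.
        string[] withDelimiter = Regex.Split("foo :6", @"(\s*\:\s*)");
        Console.WriteLine(string.Join("|", withDelimiter));          // foo| :|6

        Console.WriteLine(string.Join("|", SplitOnColon("foo :6"))); // foo|6
        Console.WriteLine(string.Join("|", SplitOnColon("foo 6")));  // foo 6
    }
}
```

Neither variant checks that the part after the colon is actually a number, so that would still need validating separately.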

Upvotes: 0

Rahul Tripathi

Reputation: 172398

You can simply use split instead of regex:

"foo :6".Split(':');

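To turn the split result into the two values from the question, something along these lines might do. This is only a sketch: Parse is a hypothetical helper, and it treats whatever follows the first colon as the number without checking that it consists of digits.

```csharp
using System;

public static class StringSplitDemo
{
    // Hypothetical helper: everything before the first ':' is the text,
    // everything after it is treated as the number, with no digit check.
    public static (string Text, string Number) Parse(string input)
    {
        string[] parts = input.Split(new[] { ':' }, 2);
        string text = parts[0].Trim();
        string number = parts.Length > 1 ? parts[1].Trim() : "";
        return (text, number);
    }

    public static void Main()
    {
        foreach (string s in new[] { "foo 6", "foo :6" })
        {
            var (text, number) = Parse(s);
            Console.WriteLine($"Text = \"{text}\", Number = \"{number}\"");
        }
    }
}
```
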
Upvotes: 0
