davis

Reputation: 381

Regex - Match a certain string and get the integer value of the string

I have a regex:

var example = Regex.Match(result, @"\b(Today is:)[\s:]*(.*)", RegexOptions.IgnoreCase);

and then I convert the captured value to an int:

int value = int.Parse(example.Groups[2].Value, System.Globalization.NumberStyles.AllowThousands);

This works fine most of the time. However, I noticed that it breaks when there is extra text after "Today is", for example:

Today is (extra):

My regex above fails in this case: it also grabs "(extra)", and then int.Parse throws. I want the regex to match whenever there is "Today is", no matter what extra text comes before the colon, and then capture only the number so I can convert it to an int value.

For example, Today is: 100,000,000 -> convert and get int 100000000

Today is (abc123): 88,888 -> convert and get int 88888

Today is (Extra Text blah blah): 100,000 -> convert and get int 100000

Upvotes: 0

Views: 654

Answers (2)

Wiktor Stribiżew

Reputation: 626861

You can use

\bToday\s+is\b(?:.*?\([^()]*\))?.*?\b(\d+(?:,\d{3})*(?:\.\d+)?)

See the regex demo. Then parse match.Groups[1].Value.

Details:

  • \bToday\s+is\b - "Today is" as whole words, with one or more whitespace characters in between
  • (?:.*?\([^()]*\))? - an optional sequence: any zero or more chars other than a newline, as few as possible, followed by a (, then zero or more chars other than ( and ), then a )
  • .*? - any zero or more chars other than a newline, as few as possible
  • \b - a word boundary
  • (\d+(?:,\d{3})*(?:\.\d+)?) - Group 1, the number: one or more digits, then zero or more groups of a comma and three digits, then an optional fractional part (a dot and one or more digits).
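For illustration, here is a minimal C# sketch of how this pattern could be wired up to the int conversion from the question. The class name TodayIsParser and the method TryParseValue are hypothetical names chosen for this example, and the use of int.TryParse with the invariant culture is an assumption, not something the answer prescribes. Since Group 1 also accepts a fractional part, which NumberStyles.AllowThousands does not allow, the sketch reports failure instead of throwing in that case.

```csharp
using System.Globalization;
using System.Text.RegularExpressions;

public static class TodayIsParser
{
    // Pattern from the answer above; Group 1 holds the number.
    private static readonly Regex Pattern = new Regex(
        @"\bToday\s+is\b(?:.*?\([^()]*\))?.*?\b(\d+(?:,\d{3})*(?:\.\d+)?)",
        RegexOptions.IgnoreCase);

    // Hypothetical helper: returns false when the pattern does not match
    // or when the captured text is not a valid int (for example "1.5").
    public static bool TryParseValue(string input, out int value)
    {
        value = 0;
        Match match = Pattern.Match(input);
        return match.Success
            && int.TryParse(match.Groups[1].Value,
                            NumberStyles.AllowThousands,
                            CultureInfo.InvariantCulture,
                            out value);
    }
}
```

With the three sample lines from the question, TryParseValue should return true and yield 100000000, 88888 and 100000 respectively.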

Upvotes: 0

Patrick Janser

Reputation: 4244

I would change your regex a bit, like this:

\bToday is\b.*?\s*:\s*([\d,\.]+)

Test it here: https://regex101.com/r/jPb6Pa/1

Explanation:

  • \bToday is\b matches "Today is" as whole words, and not "Blablatoday isn't" or something like that.

  • .*? matches anything after "Today is", but in a non-greedy (lazy) way.

  • \s*:\s* matches the ":" char, with or without whitespace around it.

  • Capturing group 1, ([\d,\.]+), matches one or more digits, dots and commas. It could be improved, since a lone comma or dot would also match and would be wrong. But it does the job for the moment.
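As a rough sketch of the improvement mentioned in the last point, the capturing group could be tightened so that it has to start with a digit and only accepts commas followed by three digits. The stricter group \d+(?:,\d{3})* is my assumption about what "improved" could look like, not part of the original regex, and the names TodayIsColonParser and TryParseValue are hypothetical.

```csharp
using System.Globalization;
using System.Text.RegularExpressions;

public static class TodayIsColonParser
{
    // Same structure as the regex above, but with a stricter Group 1
    // (assumption): digits, then optional ",ddd" groups, no dots.
    private static readonly Regex Pattern = new Regex(
        @"\bToday is\b.*?\s*:\s*(\d+(?:,\d{3})*)",
        RegexOptions.IgnoreCase);

    // Hypothetical helper: returns false when there is no match
    // or when the number does not fit into an int.
    public static bool TryParseValue(string input, out int value)
    {
        value = 0;
        Match match = Pattern.Match(input);
        return match.Success
            && int.TryParse(match.Groups[1].Value,
                            NumberStyles.AllowThousands,
                            CultureInfo.InvariantCulture,
                            out value);
    }
}
```

With this version, a line such as "Today is: ," no longer matches at all, so the lone comma never reaches the int conversion.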

Upvotes: 2
