Reputation: 11100
Say I have an initial string that contains either an integer or a double, followed by a timescale, e.g. 5.5hours or 30 mins. The data I will be receiving in this format is notoriously non-uniform, so I could, for example, receive 5.5 hours. with a trailing full stop.
I want a way to extract the integer or double from such strings, but I am struggling with the possible inclusion of additional full stops/periods. I can easily isolate the digits and full stops by replacing the letters with an empty string.
Can anybody please advise?
Thanks.
Upvotes: 1
Views: 320
Reputation: 33272
Use (named) groups to extract the info you need:
(?'val'\d+\.?\d*).*?
or: (?'val'\d+\.?\d*)\s*\w+\.? should do the work, and you'll find the result in the named group 'val'.
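As a rough sketch of how the named group could be read back in C#, using the first pattern above: the class name NamedGroupDemo and the helper ExtractValue are made up for illustration, and parsing with CultureInfo.InvariantCulture is an assumption about the input always using '.' as the decimal separator.

```csharp
using System;
using System.Globalization;
using System.Text.RegularExpressions;

public static class NamedGroupDemo
{
    // First pattern from the answer: digits, an optional dot, optional further digits.
    private static readonly Regex NumberPattern = new Regex(@"(?'val'\d+\.?\d*)");

    // Hypothetical helper: returns the first number found in the input, or null if there is none.
    public static double? ExtractValue(string input)
    {
        Match m = NumberPattern.Match(input);
        if (!m.Success) return null;

        // Assumption: '.' is always the decimal separator, so parse culture-independently.
        return double.Parse(m.Groups["val"].Value, CultureInfo.InvariantCulture);
    }

    public static void Main()
    {
        Console.WriteLine(ExtractValue("5.5hours"));
        Console.WriteLine(ExtractValue("30 mins"));
        Console.WriteLine(ExtractValue("5.5 hours."));
    }
}
```

The trailing full stop in the last input is never reached, because the match stops once the digits after the first dot run out.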
Upvotes: 1
Reputation: 336498
\d+(?:\.\d+)?
should match your criteria:
\d+ # Match one or more digits
(?: # Try to match the following group:
\. # a dot
\d+ # one or more digits
)? # End of optional group
So, to iterate over all matches in your string:
Regex regexObj = new Regex(@"\d+(?:\.\d+)?");
Match matchResults = regexObj.Match(subjectString);
while (matchResults.Success) {
// matched number: matchResults.Value
matchResults = matchResults.NextMatch();
}
This regex will not match numbers in exponential notation like 1.05E-6, obviously.
If you also want to catch the following timescale, then you can use
(\d+(?:\.\d+)?)\s*(\w+)
Now, after a match, matchResults.Groups[1] will contain the number. matchResults.Groups[2] will contain the word following the number, which you can then check against your list of allowed words. This word is mandatory, i.e. if it's missing, the entire regex will fail - if you don't want that, add a ? at the end.
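Putting the two groups to use might look roughly like this. The class DurationParser and its TryParse method are hypothetical names for this sketch, and parsing with CultureInfo.InvariantCulture assumes the incoming data always uses '.' as the decimal separator.

```csharp
using System;
using System.Globalization;
using System.Text.RegularExpressions;

public static class DurationParser
{
    // Group 1: the number. Group 2: the word that follows it.
    private static readonly Regex Pattern = new Regex(@"(\d+(?:\.\d+)?)\s*(\w+)");

    // Hypothetical helper: returns false when no number followed by a word is found.
    public static bool TryParse(string input, out double value, out string unit)
    {
        value = 0;
        unit = null;

        Match m = Pattern.Match(input);
        if (!m.Success) return false;

        // Assumption: '.' is always the decimal separator in the incoming data.
        value = double.Parse(m.Groups[1].Value, CultureInfo.InvariantCulture);
        unit = m.Groups[2].Value;
        return true;
    }

    public static void Main()
    {
        foreach (string s in new[] { "5.5hours", "30 mins", "5.5 hours." })
        {
            if (TryParse(s, out double value, out string unit))
                Console.WriteLine($"{s} -> value={value}, unit={unit}");
        }
    }
}
```

One thing to watch: \w also matches digits, so a bare "30" with no unit still matches, as the number 3 followed by the "word" 0. Checking the captured word against the list of allowed units catches that case.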
Upvotes: 3
Reputation: 1588
Maybe something like this:
@"\b(\d+(?:\.\d+)?)\s+(?:hours|mins|seconds)\b"
Upvotes: 1