
regex

Sergey Metlov

Reputation: 26291

Parse time string using regex

My time string may be in one of the following formats (x and y are integers, h and m are literal symbols):

xh ym
xh
ym
y

Examples:

1h 20m
45m
2h
120

What regular expression should I write to extract the x and y numbers from such a string?

Upvotes: 1

Answers (5)

porges

Reputation: 30580

I'm going to assume you're using .NET due to your username. :)

I think it's easier to use TimeSpan.ParseExact (or TimeSpan.TryParseExact) for this task.

You can specify a list of permitted formats, written as custom TimeSpan format strings, and ParseExact will read in the TimeSpan according to them.

Here is an example:

using System;
using System.Globalization;

// I have assumed that a single number means minutes
var formats = new[]{"h'h'", "h'h 'm'm'", "m'm'", "%m"};

foreach (var item in new[]{"23","1h 45m","1h","45m"})
{
    TimeSpan timespan;
    if (TimeSpan.TryParseExact(item, formats, CultureInfo.InvariantCulture, out timespan))
    {
        // valid
        Console.WriteLine(timespan);
    }
}

Output:

00:23:00
01:45:00
01:00:00
00:45:00

The only problem with this is that it is rather inflexible: input with additional whitespace in the middle fails to validate. A more robust solution using Regex is:

using System;
using System.Text.RegularExpressions;

var items = new[]{"23","1h 45m", "45m", "1h", "1h 45", "1h   45", "1h45m"};
foreach (var item in items)
{
    var match = Regex.Match(item, @"^(?=\d)((?<hours>\d+)h)?\s*((?<minutes>\d+)m?)?$", RegexOptions.ExplicitCapture);
    if (match.Success)
    {
        int hours;
        int.TryParse(match.Groups["hours"].Value, out hours); // hours == 0 on failure

        int minutes;
        int.TryParse(match.Groups["minutes"].Value, out minutes); // minutes == 0 on failure

        Console.WriteLine(new TimeSpan(0, hours, minutes, 0));
    }
}

Breakdown of the regex:

^ - start of string
(?=\d) - must start with a digit (do this because both parts are marked optional, but we want to make sure at least one is present)
((?<hours>\d+)h)? - hours (optional, capture into named group)
\s* - whitespace (optional)
((?<minutes>\d+)m?)? - minutes (optional, capture into named group, the 'm' is optional too)
$ - end of string
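
For readers outside .NET, here is a rough Python sketch of the same pattern, mainly to show what the lookahead buys you. It is a port under assumptions, not part of the answer above: Python writes named groups as (?P<name>...), and since Python has no ExplicitCapture option the outer groups are made non-capturing instead. The parse helper is a name made up for this illustration.

```python
import re

# Python spells named groups (?P<name>...) rather than .NET's (?<name>...).
# The outer groups are non-capturing because Python has no ExplicitCapture option.
PATTERN = re.compile(r"^(?=\d)(?:(?P<hours>\d+)h)?\s*(?:(?P<minutes>\d+)m?)?$")


def parse(text):
    """Return (hours, minutes), or None when the text is rejected."""
    match = PATTERN.match(text)
    if match is None:
        return None
    # An optional group that did not take part in the match is None.
    return int(match["hours"] or 0), int(match["minutes"] or 0)


for sample in ["1h45m", "1h   45", "23", "", "   "]:
    print(repr(sample), "->", parse(sample))
```

Without the (?=\d) lookahead, the empty string and a whitespace-only string would both match, because every other part of the pattern is optional. With it, both are rejected.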

Upvotes: 3

Petar Ivanov

Reputation: 93020

Try ^(?:(\d+)h\s*)?(?:(\d+)m?)?$, for example:

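using System;
using System.Text.RegularExpressions;

var s = new[] { "1h 20m", "45m", "2h", "120", "1m 20m" };

foreach (var ss in s)
{
    var m = Regex.Match(ss, @"^(?:(\d+)h\s*)?(?:(\d+)m?)?$");

    // Both parts are optional, so the pattern also matches an empty string;
    // require a non-empty match rather than testing for zero values,
    // so that inputs such as "0m" are still reported as matches.
    if (m.Success && m.Length > 0)
    {
        // A group that did not participate in the match counts as 0.
        int hour = m.Groups[1].Success ? int.Parse(m.Groups[1].Value) : 0;
        int min = m.Groups[2].Success ? int.Parse(m.Groups[2].Value) : 0;
        Console.WriteLine("Hours: " + hour + ", Mins: " + min);
    }
    else
        Console.WriteLine("No match!");
}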

Upvotes: 1

PhiLho

Reputation: 41132

I would say that mhyfritz's solution is simple, efficient and good if your input is only what you have shown. If you ever need to handle corner cases, you can use a more discriminative expression:

^(\d+)(?:(h)(?:\s+(\d+)(m))?|(m?))$

But it can be overkill...

(Get rid of ^ and $ if you need to detect such a pattern in a larger body of text, of course.)
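
To make the group layout of that expression concrete, here is a small Python sketch that turns a match into hours and minutes. The to_hours_minutes helper is a hypothetical name, and treating a bare number as minutes is an assumption (the question does not say which unit it is).

```python
import re

# Group 1 is the first number, group 2 is "h" when an hour part is present,
# group 3 is the minutes that follow an hour part, and groups 4 and 5 hold
# the "m" suffix of the respective branch.
PATTERN = re.compile(r"^(\d+)(?:(h)(?:\s+(\d+)(m))?|(m?))$")


def to_hours_minutes(text):
    """Return (hours, minutes), or None if the text is not in an accepted form."""
    match = PATTERN.match(text)
    if match is None:
        return None
    first, hour_mark, minutes, _, _ = match.groups()
    if hour_mark:
        # "xh" or "xh ym"
        return int(first), int(minutes or 0)
    # "ym" or a bare "y"; assumption: a bare number means minutes
    return 0, int(first)


for sample in ["1h 20m", "45m", "2h", "120", "1h20m", "1h 45"]:
    print(repr(sample), "->", to_hours_minutes(sample))
```

The last two samples are rejected: this pattern insists on whitespace between the two parts and on the trailing m after an hour part, which is the stricter behaviour it trades for the extra complexity.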

Upvotes: 1

Manlio

Reputation: 10865

In bash:

echo "$string" | awk '{for(i=1;i<=NF;i++) print $i}' | sed 's/[hm]//g'

Upvotes: 0

mhyfritz

Reputation: 8522

(\d+)([mh]?)(?:\s+(\d+)m)?

You can then inspect groups 1-3. For your examples those would be:

('1', 'h', '20')
('45', 'm', '')
('2', 'h', '')
('120', '', '')

As always, you might want to add some anchors (^, $, \b, ...).
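
As a minimal Python sketch of the anchored variant: re.fullmatch requires the whole string to match, which has the same effect as wrapping the pattern in ^ and $. The parse_groups helper is a name invented for this example, and default="" is used so that unmatched optional groups come back as empty strings, like in the tuples above.

```python
import re

# The pattern from this answer; fullmatch anchors it at both ends.
PATTERN = re.compile(r"(\d+)([mh]?)(?:\s+(\d+)m)?")


def parse_groups(text):
    """Return (number, unit, minutes) as strings, or None if text does not match."""
    match = PATTERN.fullmatch(text)
    if match is None:
        return None
    # default="" turns an unmatched optional group into "" instead of None.
    return match.groups(default="")


for sample in ["1h 20m", "45m", "2h", "120", "abc"]:
    print(repr(sample), "->", parse_groups(sample))
```

Note that the pattern is permissive: an input like "45m 20m" also matches, with groups ('45', 'm', '20'), so check the unit in group 2 before trusting group 3 as minutes after hours.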

Upvotes: 4
