Sergey Metlov

Reputation: 26291

Parse time string using regex

My time string may be in one of the following formats (x and y are integers; h and m are literal characters):

Examples: 1h 20m, 45m, 2h, 120

What regular expression should I write to extract the numbers x and y from such a string?

Upvotes: 1

Views: 1914

Answers (5)

porges

Reputation: 30580

I'm going to assume you're using .NET due to your username. :)

In this case, I think it's easier to use TimeSpan.ParseExact.

You can specify a list of permitted formats (see the documentation on custom TimeSpan format strings for the syntax), and ParseExact will parse the TimeSpan according to them.

Here is an example:

using System;
using System.Globalization;

// Assumption: a bare number means minutes.
var formats = new[] { "h'h'", "h'h 'm'm'", "m'm'", "%m" };

foreach (var item in new[] { "23", "1h 45m", "1h", "45m" })
{
    TimeSpan timespan;
    if (TimeSpan.TryParseExact(item, formats, CultureInfo.InvariantCulture, out timespan))
    {
        // valid
        Console.WriteLine(timespan);
    }
}

Output:

00:23:00
01:45:00
01:00:00
00:45:00

The only problem is that this approach is rather inflexible: for example, additional whitespace in the middle will fail to validate. A more robust solution using Regex is:

using System;
using System.Text.RegularExpressions;

var items = new[] { "23", "1h 45m", "45m", "1h", "1h 45", "1h   45", "1h45m" };
foreach (var item in items)
{
    var match = Regex.Match(item, @"^(?=\d)((?<hours>\d+)h)?\s*((?<minutes>\d+)m?)?$", RegexOptions.ExplicitCapture);
    if (match.Success)
    {
        int hours;
        int.TryParse(match.Groups["hours"].Value, out hours); // hours == 0 on failure

        int minutes;
        int.TryParse(match.Groups["minutes"].Value, out minutes);

        Console.WriteLine(new TimeSpan(0, hours, minutes, 0));
    }
}

Breakdown of the regex:

  • ^ - start of string
  • (?=\d) - must start with a digit (needed because both parts are optional, but at least one of them must be present)
  • ((?<hours>\d+)h)? - hours (optional, capture into named group)
  • \s* - whitespace (optional)
  • ((?<minutes>\d+)m?)? - minutes (optional, capture into named group, the 'm' is optional too)
  • $ - end of string
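To make the role of the lookahead concrete, here is a small sketch that compares the pattern above with a copy that has the (?=\d) removed. The class and member names (LookaheadDemo, Matches) are made up for illustration, and the second pattern exists only for the comparison.

```csharp
using System;
using System.Text.RegularExpressions;

public static class LookaheadDemo
{
    // Pattern from the answer above.
    public const string WithLookahead =
        @"^(?=\d)((?<hours>\d+)h)?\s*((?<minutes>\d+)m?)?$";

    // Same pattern with the lookahead removed, for comparison only.
    public const string WithoutLookahead =
        @"^((?<hours>\d+)h)?\s*((?<minutes>\d+)m?)?$";

    // Hypothetical helper: true when the whole input matches the pattern.
    public static bool Matches(string pattern, string input) =>
        Regex.IsMatch(input, pattern, RegexOptions.ExplicitCapture);

    public static void Main()
    {
        // Empty string, whitespace only, and one valid input.
        foreach (var input in new[] { "", "   ", "1h 45m" })
        {
            Console.WriteLine("[{0}] with lookahead: {1}, without: {2}",
                input,
                Matches(WithLookahead, input),
                Matches(WithoutLookahead, input));
        }
    }
}
```

Because every other part of the pattern is optional, the version without the lookahead accepts the empty and whitespace-only inputs, while the original rejects them.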

Upvotes: 3

Petar Ivanov

Reputation: 93020

Try this pattern: ^(?:(\d+)h\s*)?(?:(\d+)m?)?$. For example:

var s = new[] { "1h 20m", "45m", "2h", "120", "1m 20m" };

foreach (var ss in s)
{
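    var m = Regex.Match(ss, @"^(?:(\d+)h\s*)?(?:(\d+)m?)?$");

    // A group that did not take part in the match counts as 0.
    int hour = m.Groups[1].Success ? int.Parse(m.Groups[1].Value) : 0;
    int min = m.Groups[2].Success ? int.Parse(m.Groups[2].Value) : 0;

    // Both parts are optional, so the pattern also matches an empty string.
    // Checking Success (rather than non-zero values) keeps inputs like "0" valid.
    if (m.Success && ss.Length > 0)
        Console.WriteLine("Hours: " + hour + ", Mins: " + min);
    else
        Console.WriteLine("No match!");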
}

Upvotes: 1

PhiLho

Reputation: 41132

I would say that mhyfritz's solution is simple, efficient and good if your input is only what you have shown. If you ever need to handle corner cases, you can use a stricter expression:

^(\d+)(?:(h)(?:\s+(\d+)(m))?|(m?))$

But it can be overkill...

(Get rid of ^ and $ if you need to detect such a pattern in a larger body of text, of course.)
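As a rough sketch of how the groups of this stricter expression could be read in C#: group 1 is the first number, group 2 is set only when that number is followed by h, and group 3 holds the minutes that follow the hours. The names StrictTimeParser and TryParse are invented for this example, and treating a bare number as minutes is an assumption.

```csharp
using System;
using System.Text.RegularExpressions;

public static class StrictTimeParser
{
    // Pattern quoted from the answer above.
    private static readonly Regex Pattern =
        new Regex(@"^(\d+)(?:(h)(?:\s+(\d+)(m))?|(m?))$");

    // Hypothetical helper: returns false when the input has none of the accepted shapes.
    public static bool TryParse(string input, out int hours, out int minutes)
    {
        hours = 0;
        minutes = 0;

        var match = Pattern.Match(input);
        if (!match.Success)
            return false;

        // int.TryParse also guards against numbers too large for an int.
        int first;
        if (!int.TryParse(match.Groups[1].Value, out first))
            return false;

        if (match.Groups[2].Success)
        {
            // "xh" or "xh ym": the first number is hours.
            hours = first;
            if (match.Groups[3].Success &&
                !int.TryParse(match.Groups[3].Value, out minutes))
                return false;
        }
        else
        {
            // "x" or "xm": the first number is minutes (assumption for the bare number).
            minutes = first;
        }
        return true;
    }

    public static void Main()
    {
        foreach (var input in new[] { "1h 20m", "45m", "2h", "120", "1h 20", "1m 20m" })
        {
            int h, m;
            Console.WriteLine(TryParse(input, out h, out m)
                ? input + " -> hours " + h + ", minutes " + m
                : input + " -> no match");
        }
    }
}
```

Unlike the looser patterns, this one rejects inputs such as "1h 20" (minutes without the m) and "1m 20m" (two minute parts).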

Upvotes: 1

Manlio

Reputation: 10865

In bash:

echo "$string" | awk '{for(i=1;i<=NF;i++) print $i}' | sed 's/[hm]//g'

Upvotes: 0

mhyfritz

Reputation: 8522

(\d+)([mh]?)(?:\s+(\d+)m)?

You can then inspect groups 1-3. For your examples those would be:

('1', 'h', '20')
('45', 'm', '')
('2', 'h', '')
('120', '', '')

As always, you might want to add anchors such as ^, $ or \b.
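A minimal C# sketch of the anchored variant, assuming ^ and $ are the anchors you want. The names GroupInspector and Groups are hypothetical; the helper returns groups 1-3 as strings so they can be compared with the tuples listed above.

```csharp
using System;
using System.Text.RegularExpressions;

public static class GroupInspector
{
    // The pattern from this answer, with ^ and $ added (an assumption about the intended anchoring).
    private static readonly Regex Anchored =
        new Regex(@"^(\d+)([mh]?)(?:\s+(\d+)m)?$");

    // Hypothetical helper: groups 1-3 as strings, or null when the input does not match.
    // A group that did not take part in the match yields an empty string.
    public static string[] Groups(string input)
    {
        var match = Anchored.Match(input);
        if (!match.Success)
            return null;
        return new[]
        {
            match.Groups[1].Value,
            match.Groups[2].Value,
            match.Groups[3].Value
        };
    }

    public static void Main()
    {
        foreach (var input in new[] { "1h 20m", "45m", "2h", "120" })
        {
            var g = Groups(input);
            Console.WriteLine(g == null
                ? input + ": no match"
                : string.Format("('{0}', '{1}', '{2}')", g[0], g[1], g[2]));
        }
    }
}
```

Note that this pattern stays permissive even when anchored: an input like "45m 20m" still matches, which is the kind of corner case a stricter expression would reject.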

Upvotes: 4
