user3617297

Reputation: 11

Capture times from a string

I want to capture times from a text file. The file contains an article, and I'd like to extract times such as:

11:51
00:32

but it must not pick up invalid values such as 13:51 or 11:61. My current code doesn't work:

while ($word = <$fh>) {
    if ($word =~ /\d\d:\d\d/) {
        print $word . "\n";
    }
}

Upvotes: 1

Views: 77

Answers (4)

G. Cito

Reputation: 6378

Of course the Perl documentation (perlretut) is the standard reference, but for reading along and trying things out, you might look at Regexp::Debugger, which installs a great command-line regular expression editor and analysis tool called rxrx. It's very simple but can be quite helpful in getting a feel for how the regular expression engine works.

If you're open to installing CPAN modules, you'll get lots of help from the Regexp::Common namespace (in your case, Regexp::Common::time might be useful). The Regexp::Common::... modules simplify and "standardize" regular expressions for common categories. The best part is that you can read the source code to figure out how to do it yourself if you are in a situation where CPAN modules are unavailable.

Here's @Miller's example using Regexp::Common::time:

#!/usr/bin/env perl5
use strict;
use warnings;
use Regexp::Common qw(time);

while (my $line = <DATA>) {
  if ($line =~ $RE{time}{hms}{-keep}) {
    print "Time = $2:$3 \n";
  }
}

__DATA__
11:51
00:32
13:51
11:61

Note that, as written, this prints three values, because 13:51 is a valid time on a 24-hour clock (here in Canada 13:51 is almost coffee time). See the POD for how to limit time patterns using strftime-compatible formats. It's also possible to use the module and filter the output in the usual Perl manner (e.g. print "Time = $2:$3 \n" unless $2 > 12;).

@Miller's approach is the simplest (+1 from me), but Regexp::Common is a very useful tool.

Cheers,

Upvotes: 0

Miller

Reputation: 35198

Don't treat regexes like a crutch. If you do that, you fall into XY Problem land.

If you can validate your captures using a simple if statement, then do so. Don't get hung up on one solution method:

use strict;
use warnings;

while (my $line = <DATA>) {
    while ($line =~ /\b(\d\d:\d\d)\b/g) {
        my $time = $1;
        my ($hour, $min) = split ':', $time;
        if ($hour < 13 && $min < 60) {
            print "Time = $time\n";
        }
    }
}

__DATA__
11:51
00:32
13:51
11:61

Outputs:

Time = 11:51
Time = 00:32
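
Since the question mentions an article rather than one time per line, the same check can be wrapped in a small sub and run over running text. This is only a sketch: the helper name extract_times and the sample sentence are made up for illustration, and the hour and minute limits are the ones used in the code above.

```perl
use strict;
use warnings;

# Hypothetical helper (not part of the answer above): returns every hh:mm
# token in $text whose hour is below 13 and whose minute is below 60.
sub extract_times {
    my ($text) = @_;
    my @times;
    while ($text =~ /\b(\d\d):(\d\d)\b/g) {
        my ($hour, $min) = ($1, $2);
        push @times, "$hour:$min" if $hour < 13 && $min < 60;
    }
    return @times;
}

# Made-up sample sentence containing valid and invalid times.
my $article = "Doors open at 11:51, not 13:51 or 11:61; we close at 00:32.";
print "Time = $_\n" for extract_times($article);
```

Capturing the hour and minute directly in the regex avoids the extra split, but the validation itself is still a plain numeric comparison rather than part of the pattern.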

Upvotes: 2

Jerry

Reputation: 71538

If the times are in 12-hour (AM/PM) format and the hour can't exceed 12, then you need to use numeric ranges:

/(?:0[1-9]|1[0-2]):[0-5]\d/

(?:0[1-9]|1[0-2]) will match either 0[1-9] (01-09) or 1[0-2] (10-12).

[0-5]\d will match 00-59.
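
To see how this pattern behaves on the sample values from the question, here is a small sketch. The ^ and $ anchors are added for the test, and the second pattern is an assumed variant (not from this answer) whose hour part 0\d also admits 00, since the pattern above starts its hour range at 01 and therefore rejects 00:32.

```perl
use strict;
use warnings;

# Pattern from the answer, anchored so the whole string must be a time.
my $twelve_hour = qr/^(?:0[1-9]|1[0-2]):[0-5]\d$/;

# Assumed variant (not from the answer): 0\d also admits hour 00.
my $with_midnight = qr/^(?:0\d|1[0-2]):[0-5]\d$/;

for my $time (qw(11:51 00:32 13:51 11:61)) {
    printf "%s  12-hour: %-3s  with 00: %s\n",
        $time,
        ($time =~ $twelve_hour   ? 'yes' : 'no'),
        ($time =~ $with_midnight ? 'yes' : 'no');
}
```

Only 11:51 passes the first pattern; the variant accepts both 11:51 and 00:32, and both patterns reject 13:51 and 11:61.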

Upvotes: 1

Bohemian

Reputation: 424993

If you want to match only the hour, and only on lines that look like your sample, use a lookahead:

^\d\d?(?=:\d\d?$)

Remove the question marks if hour and minute are always 2 digits.
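
A minimal sketch of how the lookahead could be used from Perl. The helper name hour_of and the sample lines are hypothetical, and a capture group is added around the hour so it can be returned. Note that this pattern only checks the shape of the line, so a line such as 11:61 would still yield an hour.

```perl
use strict;
use warnings;

# Hypothetical helper: returns the hour when the whole line looks like
# h:mm or hh:mm (with optional second minute digit), otherwise undef.
sub hour_of {
    my ($line) = @_;
    chomp $line;
    return $line =~ /^(\d\d?)(?=:\d\d?$)/ ? $1 : undef;
}

for my $line ("11:51\n", "00:32\n", "see you at 11:51\n") {
    my $hour = hour_of($line);
    print defined $hour ? "hour = $hour\n" : "no match\n";
}
```

The lookahead (?=...) checks that a colon and the minutes follow without consuming them, so the match itself contains only the hour.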

Upvotes: 0
