Reputation: 11
I want to capture times from a text file. For example, the file contains some article text, and I'd like to capture times such as 11:51 and 00:32, but not invalid ones such as 13:51 or 11:61. My current code doesn't work:
while ($word = <$fh>) {
    if ($word =~ /\d\d:\d\d/) {
        print $word . "\n";
    }
}
Upvotes: 1
Views: 77
Reputation: 6378
Of course, the Perl documentation (perlretut) is the standard reference, but for reading along and trying things out, you might look at Regexp::Debugger, which installs a great command-line regular expression editor and analysis tool called rxrx. It's very simple but can be quite helpful in getting a feel for how the regular expression engine works.
If you're open to installing CPAN modules, you'll get lots of help from the Regexp::Common namespace (in your case, Regexp::Common::time might be useful). The Regexp::Common::... modules simplify and "standardize" regular expressions for common categories. The best part is that you can read the source code to figure out how to do it yourself if you are in a situation where CPAN modules are unavailable.
Here's @Miller's example using Regexp::Common::time:
#!/usr/bin/env perl5
use strict;
use warnings;
use Regexp::Common qw(time);
while (my $line = <DATA>) {
    if ($line =~ $RE{time}{hms}{-keep}) {
        print "Time = $2:$3 \n";
    }
}
__DATA__
11:51
00:32
13:51
11:61
Note that this will print three of the values as is (here in Canada, 13:51 is almost coffee time). See the POD for how to limit time patterns using strftime-compatible formats. It's also possible to use the module and fiddle with the output in the normal Perl manner (e.g. print "Time = $2:$3 \n" unless $2 > 12;).
@Miller's approach is the simplest (+1 from me), but Regexp::Common is a very useful tool.
Cheers,
Upvotes: 0
Reputation: 35198
Don't treat regexes like a crutch. If you do that, you fall into XY Problem land.
If you can validate your captures using a simple if statement, then do so. Don't get hung up on one solution method:
use strict;
use warnings;
while (my $line = <DATA>) {
    # Find every HH:MM-shaped token on the line, then validate it numerically.
    while ($line =~ /\b(\d\d:\d\d)\b/g) {
        my $time = $1;
        my ($hour, $min) = split ':', $time;
        if ($hour < 13 && $min < 60) {
            print "Time = $time\n";
        }
    }
}
__DATA__
11:51
00:32
13:51
11:61
Outputs:
Time = 11:51
Time = 00:32
Upvotes: 2
Reputation: 71538
If you have the time in AM/PM format, so the hour can't be above 12, then you need to use numeric ranges:
/(?:0[1-9]|1[0-2]):[0-5]\d/
(?:0[1-9]|1[0-2]) will match either 0[1-9] (01-09) or 1[0-2] (10-12).
[0-5]\d will match 00-59.
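As a rough illustration of how this pattern could be applied to a line of text, here is a minimal sketch. The helper name find_12h_times and the sample string are hypothetical, and the digit lookarounds are an addition that is not part of the pattern above; they only stop a match from starting or ending in the middle of a longer number. Note that 00:32 from the question is not matched, because 0[1-9] excludes 00; changing it to 0\d would admit it.

```perl
use strict;
use warnings;

# Hypothetical helper: returns every 12-hour time (hours 01-12, minutes 00-59)
# found in the given string.
sub find_12h_times {
    my ($text) = @_;
    # (?<!\d) and (?!\d) reject matches embedded in a longer run of digits.
    # In list context with /g, the match returns the list of captures.
    return $text =~ /(?<!\d)((?:0[1-9]|1[0-2]):[0-5]\d)(?!\d)/g;
}

# Sample input is made up for illustration.
print "Time = $_\n" for find_12h_times("at 11:51, 00:32, 13:51 and 11:61");
```

With this sample string, only 11:51 is printed; the other three values fail either the hour range or the minute range.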
Upvotes: 1
Reputation: 424993
If you want to match only the hour, and only on lines that look like your sample, use a lookahead:
^\d\d?(?=:\d\d?$)
Remove the question marks if the hour and minute are always two digits.
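A small sketch of how that lookahead pattern might be used from Perl follows. The helper name hour_of_line and the sample lines are hypothetical, and a capture group is added so the hour can be returned. This pattern only checks the shape of the line; it does not range-check the hour or minute, so a value like 13:51 would still match.

```perl
use strict;
use warnings;

# Hypothetical helper: returns the hour if the whole line is a time, else undef.
sub hour_of_line {
    my ($line) = @_;
    chomp $line;
    # The lookahead requires ":MM" to follow and end the line, but only the
    # hour itself is consumed and captured.
    return $line =~ /^(\d\d?)(?=:\d\d?$)/ ? $1 : undef;
}

# Sample lines are made up for illustration.
for my $line ("11:51\n", "00:32\n", "see 11:51 here\n") {
    my $hour = hour_of_line($line);
    print defined $hour ? "hour = $hour\n" : "no match\n";
}
```

The third sample line produces "no match" because the time is not the only thing on the line.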
Upvotes: 0