Reputation: 27
I'm a regex newbie, and I am trying to use a regex to return a list of dates from a text file. The dates are in mm/dd/yy format, so for years it would be '55' for '1955', for example. I am trying to return all entries from years'50' to '99'.
I believe the problem I am having is that once my regex finds a match on a line, it stops right there and jumps to the next line without checking the rest of the line. For example, I have the dates 12/12/12, 10/10/57, 10/09/66 all on one line in the text file, and it only returns 10/10/57.
Here is my code thus far. Any hints or tips? Thank you
open INPUT, "< dates.txt" or die "Can't open input file: $!";
while (my $line = <INPUT>){
if ($line =~ /(\d\d)\/(\d\d)\/([5-9][0-9])/g){
print "$&\n" ;
}
}
Upvotes: 0
Views: 75
Reputation: 126762
A few points about your code
You must always use strict
and use warnings 'all'
at the top of all your Perl programs
You should prefer lexical file handles and the three-parameter form of open
If your regex pattern contains literal slashes then it is clearest to use a non-standard delimiter so that they don't need to be escaped
Although recent releases of Perl have fixed the issue, there used to be a significant performance hit when using $&
, so it is best to avoid it, at least for now. Put capturing parentheses around the whole pattern and use $1
instead
This program will do as you ask
use strict;
use warnings 'all';
open my $fh, '<', 'dates.txt' or die "Can't open input file: $!";
while ( <$fh> ) {
print $1, "\n" while m{(\d\d/\d\d/[5-9][0-9])}g
}
10/10/57
10/09/66
Upvotes: 1
Reputation: 11
Another way around it is removing the dates you don't want.
$line =~ s/\d\d\/\d\d\/[0-4]\d//g;
print $line;
Upvotes: 0
Reputation: 886
You can use map also to get year range 50 to 99 and store in array
open INPUT, "< dates.txt" or die "Can't open input file: $!";
@as = map{$_ =~ m/\d\d\/\d\d\/[5-9][0-9]/g} <INPUT>;
$, = "\n";
print @as;
Upvotes: 0
Reputation: 2808
You just need to change the 'if' to a 'while' and the regex will take up where it left off;
open INPUT, "< a.dat" or die "Can't open input file: $!";
while (my $line = <INPUT>){
while ($line =~ /(\d\d)\/(\d\d)\/([5-9][0-9])/g){
print "$&\n" ;
}
}
# Output given line above
# 10/10/57
# 10/09/66
You could also capture the whole of the date into one capture variable and use a different regex delimiter to save escaping the slashes:
while ($line =~ m|(\d\d/\d\d/[5-9]\d)|g) {
print "$1\n" ;
}
...but that's a matter of taste, perhaps.
Upvotes: 0
Reputation: 3535
You are printing $&
which gets updated whenever any new match is encountered.
But in this case you need to store the all the previous matches and the updated one too, so you can use array for storing all the matches.
while(<$fh>) {
@dates = $_ =~ /(\d\d)\/(\d\d)\/([5-9][0-9])/g;
print "@dates\n" if(@dates);
}
Upvotes: 0