Reputation: 13
My Perl skills are pretty rudimentary and I'm trying to convert dates in a data file loaded in a scalar variable to a four digit year using a regular expression substitution (among other things).
I've got the following to work to add a 20 to all years.
$data00 =~ s/^D(\d{2})\/(\d{2})\/(\d{2})\n/D$1\/$2\/20$3\n/gm;
However, the dates include those before 2000.
While searching for a solution I ran across the /e option which said that it evaluates the replacement as Perl code. However I don't find it listed in all the documentation I've run across and I'm not sure what the syntax would be.
Is there a way to evaluate the $3 match and output 20 if $3 is less than 50 to make 2000 and 19 if not, to make 1997? I selected 50 because it seemed to be a safe middle ground.
For illustration purposes though I know it's incorrect:
$data00 =~ s/^D(\d{2})\/(\d{2})\/(\d{2})\n/D$1\/$2\/(if($3<50)20 else 19)$3\n/eg;
Is the /e even appropriate in this case?
Line examples extracted from huge text file.
D04/07/97
D04/14/98
D10/06/99
D10/13/05
D03/04/10
D12/09/10
D01/20/11
D12/22/11
Upvotes: 1
Views: 640
Reputation: 69224
I'd use Time::Piece to do this. Use the strptime() class method to parse the date into an object, and then strftime() to format it.
#!/usr/bin/perl
use strict;
use warnings;
use feature 'say';
use Time::Piece;
while (<DATA>) {
    chomp;
    my $date = Time::Piece->strptime($_, 'D%m/%d/%y');
    say $date->strftime('D%m/%d/%Y');
}
__DATA__
D04/07/97
D04/14/98
D10/06/99
D10/13/05
D03/04/10
D12/09/10
D01/20/11
D12/22/11
Output:
D04/07/1997
D04/14/1998
D10/06/1999
D10/13/2005
D03/04/2010
D12/09/2010
D01/20/2011
D12/22/2011
The regex solution can be simplified by a) choosing a different delimiter and b) using the ternary operator. If you use /e then the replacement text needs to be syntactically valid Perl.
while (<DATA>) {
    chomp;
    s|D(\d{2}/\d{2}/)(\d{2})|"D$1" . ($2 < 50 ? '20' : '19') . $2|e;
    say;
}
Update: There's one (possibly important) difference between the two solutions: the cut-off between the 20th and 21st centuries when converting from two-digit years to four-digit ones. The regex solution uses 50 (as mentioned in the original question). The Time::Piece solution uses 69, and that limit is hard-coded, so there's no way of changing it. For the data in the original question, that makes no difference. But it matters if your data has two-digit years from 50 to 68: the regex turns those into 1950 to 1968, while Time::Piece turns them into 2050 to 2068.
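To make the cut-off difference concrete, here is a minimal sketch that applies the same regex with two different pivots. The helper expand_year() and the sample line D06/15/55 are made up for illustration and are not part of either solution. The pivot of 69 only imitates the cut-off described above for Time::Piece; it does not call Time::Piece itself.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Hypothetical helper: expand a two-digit year using a caller-chosen pivot.
# Years below the pivot become 20xx, all others become 19xx.
sub expand_year {
    my ($yy, $pivot) = @_;
    return ($yy < $pivot ? '20' : '19') . $yy;
}

for my $line ('D04/07/97', 'D06/15/55', 'D10/13/05') {
    # Copy the line, then rewrite the copy with each pivot in turn.
    (my $pivot50 = $line) =~ s|^D(\d{2}/\d{2}/)(\d{2})$|"D$1" . expand_year($2, 50)|e;
    (my $pivot69 = $line) =~ s|^D(\d{2}/\d{2}/)(\d{2})$|"D$1" . expand_year($2, 69)|e;
    print "$line  pivot 50: $pivot50  pivot 69: $pivot69\n";
}
```

Only the middle line differs between the two pivots: with 50 it becomes D06/15/1955, with 69 it becomes D06/15/2055.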
Upvotes: 1
Reputation: 106445
When using /e, the replacement expression must be a valid Perl expression (i.e. what you could put following $x =).
You can use the conditional operator (?:) to evaluate an expression differently based on a condition:
s/^D(\d{2})\/(\d{2})\/(\d{2})\n/ "D$1\/$2\/".( $3 < 50 ? 20 : 19 )."$3\n" /egm
Note that replacing the delimiter can make things far more readable when many / are involved.
s{^D(\d{2})/(\d{2})/(\d{2})\n}{ "D$1/$2/".( $3 < 50 ? 20 : 19 )."$3\n" }egm
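For completeness, here is a small self-contained sketch of that substitution applied to a multi-line scalar, as in the question. The sample contents of $data00 are an assumption (three lines taken from the question's examples). The /m modifier is included because, with the whole file in one scalar, ^ would otherwise only match at the very start of the string.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Sample data held in one scalar, mimicking the question's $data00.
my $data00 = "D04/07/97\nD10/13/05\nD12/22/11\n";

# /e evaluates the replacement as Perl code, /g replaces every match,
# and /m lets ^ match at the start of each line within the scalar.
$data00 =~ s{^D(\d{2})/(\d{2})/(\d{2})\n}{ "D$1/$2/" . ( $3 < 50 ? 20 : 19 ) . "$3\n" }egm;

print $data00;
```

This prints D04/07/1997, D10/13/2005 and D12/22/2011, one per line.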
Upvotes: 3