Arav

Reputation: 5247

Perl regular expressions with date

I'm trying to validate a date in the format YYYY_MM_DD. With the test variable set to 2012_4_123, the script prints "valid format". I expected "invalid format", because the regular expression requires the day part to be at least one digit and at most two. Why does it print "valid format"?

my $test = "2012_4_123";
if ($test !~ m/^(\d{4})_(\d{1,2})_(\d{1,2})/)
{
    print "invalid format\n";
}
else
{
    print "valid format\n";
}

Upvotes: 0

Views: 2997

Answers (3)

edb

Reputation: 502

Simply adding $ solves the initial problem of allowing more than two digits for the day, but introduces a more subtle bug: dates will now validate despite having a newline at the end, because $ also matches just before a trailing newline. This may not matter depending on your application, but it can be avoided by using the regex in the following example:

use strict;
use warnings;

my @tests = (
    '2012_4_123',
    '2012_11_22',
    "2012_11_22\n",
);

use Data::Dumper;
print Dumper \@tests;

foreach my $test (@tests) {
    if ( $test !~ m/\A(\d{4})_(\d{1,2})_(\d{1,2})\z/smx )
    {
        print "invalid format\n";
    }
    else
    {
        print "valid format\n";
    }
}

Note: /smx is recommended by Perl Best Practices and I write my regexes with it unless there's a specific need not to have it, but it may trip you up if you're not used to it.

/s and /m make multiline strings easier to process: /s lets . match newlines, and /m makes ^ and $ match at the start and end of each line. \A and \z always match only at the start and end of the entire string, whatever the modifiers.

/x is simply to allow whitespace and comments within a regex, though you'll need to escape whitespace if you're actually trying to match it.

In this case, it's using \z instead of $ that makes the difference irrespective of the use of /smx.
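To make the difference between $ and \z concrete, here is a small sketch that runs the same pattern with each anchor. The variable names $with_dollar and $with_z are made up for illustration, and /x is used only so the pieces can be commented:

```perl
use strict;
use warnings;

# The same pattern twice, written with /x so each piece can be commented.
my $with_dollar = qr/
    ^ (\d{4})      # four-digit year
    _ (\d{1,2})    # one- or two-digit month
    _ (\d{1,2})    # one- or two-digit day
    $              # end of string, or just before a final newline
/x;

my $with_z = qr/
    \A (\d{4})
    _  (\d{1,2})
    _  (\d{1,2})
    \z             # end of string, nothing may follow
/x;

for my $input ("2012_11_22", "2012_11_22\n", "2012_4_123") {
    # Show the newline visibly in the output.
    (my $shown = $input) =~ s/\n/\\n/g;
    printf "%-14s  \$: %-7s  \\z: %s\n", $shown,
        ($input =~ $with_dollar ? "valid" : "invalid"),
        ($input =~ $with_z      ? "valid" : "invalid");
}
```

Only the middle input should differ between the two columns: the string with a trailing newline passes with $ and fails with \z.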

Also, it mightn't be a bad idea to look at a module to perform date validation rather than just date format validation (again, depending on what you're using this for). See this discussion on PerlMonks.
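As a rough idea of what date validation (as opposed to format validation) involves, here is a hand-rolled sketch that uses no modules. The function name is_valid_date is hypothetical, and it assumes the Gregorian calendar; a dedicated module would likely cover more cases:

```perl
use strict;
use warnings;

# Hypothetical helper, not from any module: checks both the YYYY_MM_DD
# format and that the month and day describe a real calendar date.
sub is_valid_date {
    my ($input) = @_;
    my ($year, $month, $day) = $input =~ m/\A(\d{4})_(\d{1,2})_(\d{1,2})\z/
        or return 0;
    return 0 if $month < 1 || $month > 12;

    my @days_in_month = (31, 28, 31, 30, 31, 30, 31, 31, 30, 31, 30, 31);
    # Gregorian leap-year rule: divisible by 4, except centuries not divisible by 400.
    my $is_leap = ($year % 4 == 0 && $year % 100 != 0) || $year % 400 == 0;
    $days_in_month[1] = 29 if $is_leap;

    return ($day >= 1 && $day <= $days_in_month[$month - 1]) ? 1 : 0;
}

for my $input ("2012_4_12", "2012_2_29", "2013_2_29", "2012_13_1", "2012_4_123") {
    printf "%-12s %s\n", $input, is_valid_date($input) ? "valid date" : "invalid date";
}
```

With this check, 2013_2_29 and 2012_13_1 are rejected even though they have the right shape.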

Upvotes: 1

John Corbett

Reputation: 1615

You're missing a $ at the end. The regex matches the leading "2012_4_12" and ignores the trailing 3, because you didn't tell it to match the end of the string too. Your regex should be this:

$test !~ m/^(\d{4})_(\d{1,2})_(\d{1,2})$/
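To see what the unanchored pattern actually matched, it can help to print the capture groups. This is a minimal sketch; the helper name match_parts is made up for illustration:

```perl
use strict;
use warnings;

# Hypothetical helper: returns the captured (year, month, day),
# or an empty list when the pattern does not match.
sub match_parts {
    my ($input, $anchored) = @_;
    my $re = $anchored
        ? qr/^(\d{4})_(\d{1,2})_(\d{1,2})$/
        : qr/^(\d{4})_(\d{1,2})_(\d{1,2})/;
    my @parts = $input =~ $re;
    return @parts;
}

my $test = "2012_4_123";
for my $anchored (0, 1) {
    my @parts = match_parts($test, $anchored);
    my $label = $anchored ? 'with $' : 'without $';
    print $label, ": ", (@parts ? join("-", @parts) : "no match"), "\n";
}
```

Without the anchor, the day group captures 12 and the final 3 is simply left over, which is why the original test reported a valid format.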

Upvotes: 1

CyberDem0n

Reputation: 15066

-if ($test !~ m/^(\d{4})_(\d{1,2})_(\d{1,2})/)
+if ($test !~ m/^(\d{4})_(\d{1,2})_(\d{1,2})$/)

Upvotes: 1
