Reputation: 25074
I came across a bug today in our legacy code, which was using the Perl5Compiler and Perl5Matcher classes with the following regular expression to validate UK postcodes:
((?i)(([A-Z]{2}[0-9]{1,2})|([A-Z]{1,2}[0-9][A-Z])|([A-Z][0-9]{1,2}))\\s([0-9][A-Z]{2})|(BFPO\\s\\d{1,4})|(GIR\\s0AA))
However, it failed to validate postcodes such as 'G12 4NNT' correctly (the last section may only be a digit followed by two letters, so this one should be rejected). I fixed this by switching to the java.util.regex.Pattern class, which handles the above regular expression correctly and passes all of my unit tests.
However, now I'm curious why it didn't work with the Perl5 classes. Is there a fundamental difference in the regular expression syntax used by the two APIs?
Upvotes: 0
Views: 361
Reputation: 92996
I think the problem is the same as in the question belonging to the answer linked above.
If you use the matches() method in Java:
text.matches("((?i)(([A-Z]{2}[0-9]{1,2})|([A-Z]{1,2}[0-9][A-Z])|([A-Z][0-9]{1,2}))\\s([0-9][A-Z]{2})|(BFPO\\s\\d{1,4})|(GIR\\s0AA))");
it matches against the complete string. To get the same behaviour in Perl, you have to put anchors around your expression:
^((?i)(([A-Z]{2}[0-9]{1,2})|([A-Z]{1,2}[0-9][A-Z])|([A-Z][0-9]{1,2}))\\s([0-9][A-Z]{2})|(BFPO\\s\\d{1,4})|(GIR\\s0AA))$
^ matches the start of the string
$ matches the end of the string
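To illustrate the difference, here is a minimal sketch that uses only java.util.regex. It assumes the legacy code was doing a substring-style search (something like a contains check) rather than a whole-string match; I can't see that code, so find() stands in for it here. The class and method names (PostcodeCheck, matchesWhole, findsAnywhere, findsAnchored) are made up for the example.

```java
import java.util.regex.Pattern;

// Hypothetical helper class, for illustration only.
final class PostcodeCheck {
    // The pattern from the question, unchanged (with Java string escaping).
    static final String POSTCODE =
        "((?i)(([A-Z]{2}[0-9]{1,2})|([A-Z]{1,2}[0-9][A-Z])|([A-Z][0-9]{1,2}))"
        + "\\s([0-9][A-Z]{2})|(BFPO\\s\\d{1,4})|(GIR\\s0AA))";

    // matches() succeeds only if the pattern covers the whole input.
    static boolean matchesWhole(String text) {
        return Pattern.compile(POSTCODE).matcher(text).matches();
    }

    // find() succeeds if the pattern matches any substring of the input.
    // This stands in for the assumed substring-style search in the legacy code.
    static boolean findsAnywhere(String text) {
        return Pattern.compile(POSTCODE).matcher(text).find();
    }

    // With ^ and $ added, even a substring search must cover the whole input.
    static boolean findsAnchored(String text) {
        return Pattern.compile("^" + POSTCODE + "$").matcher(text).find();
    }

    public static void main(String[] args) {
        String invalid = "G12 4NNT";
        System.out.println(matchesWhole(invalid));   // false
        System.out.println(findsAnywhere(invalid));  // true, "G12 4NN" is found inside
        System.out.println(findsAnchored(invalid));  // false
        System.out.println(findsAnchored("G12 4NN")); // true
    }
}
```

So the regular expression syntax is the same in both cases; what differs is whether the match is required to span the entire string.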
Upvotes: 2