Reputation: 1
Please consider the regex pattern: .*[a-zA-Z0-9\\-\\_].*
If I use Java regex pattern matching to match "-", it returns true.
import java.util.regex.Matcher;
import java.util.regex.Pattern;

String regexCostcode1 = ".*[a-zA-Z0-9\\-\\_].*";
Pattern regex_costcode = Pattern.compile(regexCostcode1);
String test = "-";
Matcher m = regex_costcode.matcher(test);
System.out.println(m.matches());
This prints true.
But the same regex fails for "-" in XSD schema validation.
I also checked it at http://regexr.com/ and it fails to match "-" there.
So why does it match with Java pattern matching?
Upvotes: 0
Views: 738
Reputation: 7273
For non-Java regexes you don't need double backslashes. So your regex should be .*[a-zA-Z0-9\\-\\_].* in a Java string literal and .*[a-zA-Z0-9\-\_].* in XSD schema validation.
If you enter .*[a-zA-Z0-9\\-\\_].* on the site you mentioned, it tells you that \\-\\ is interpreted as a "range of characters from \ to \", since \\ is just an escaped backslash.
If you enter .*[a-zA-Z0-9\-\_].* it interprets \- as an escaped hyphen and correctly matches -.
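To see the two escaping levels side by side without leaving Java, here is a minimal sketch. The class and constant names (EscapeLevels, SINGLE_ESCAPED, DOUBLE_ESCAPED) are made up for illustration. The second constant uses four backslashes in the Java literal so that the regex engine receives the doubled backslashes, which is what regexr.com or an XSD processor sees when the Java-escaped text is pasted in unchanged. Java's regex flavor is not identical to the XSD one, so treat this only as a demonstration of the escaping issue.

```java
import java.util.regex.Pattern;

class EscapeLevels {
    // The regex engine receives: .*[a-zA-Z0-9\-\_].*
    static final String SINGLE_ESCAPED = ".*[a-zA-Z0-9\\-\\_].*";
    // The regex engine receives: .*[a-zA-Z0-9\\-\\_].*
    // Here \\-\\ is a range from backslash to backslash, so "-" is not in the class.
    static final String DOUBLE_ESCAPED = ".*[a-zA-Z0-9\\\\-\\\\_].*";

    static boolean matches(String regex, String input) {
        return Pattern.compile(regex).matcher(input).matches();
    }

    public static void main(String[] args) {
        System.out.println(matches(SINGLE_ESCAPED, "-"));  // true
        System.out.println(matches(DOUBLE_ESCAPED, "-"));  // false
        System.out.println(matches(DOUBLE_ESCAPED, "\\")); // true: a single backslash is in the class
    }
}
```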
Upvotes: 1
Reputation: 627341
Mind that in a Java string literal you need 2 backslashes to define a literal backslash. When you use \\ at regexr.com, or in an XML Schema regex, you write 2 literal backslashes, which match a single literal backslash in the input string, so the [\\-\\] construct matches a single \.
In XML Schema, you need to define the regex as
<xs:pattern value=".*[a-zA-Z0-9_-].*"/>
Put the - at the end of the character class so that it is parsed as a literal -. The underscore does not need to be escaped at all, as it is never a special char (it is actually a "word" char).
Actually, I'd advise using ".*[a-zA-Z0-9_-].*" in Java, too, to avoid any ambiguity.
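If you want to check a pattern against a real XSD validator from Java, something along these lines should work with the standard javax.xml.validation API. It is only a sketch: the element name costcode and the class and method names are assumptions made for the example, and the pattern and value are concatenated into the XML without escaping, so it is suitable only for simple test strings. I would expect the hyphen-last pattern to accept "-" and the doubled-backslash pattern to reject it.

```java
import java.io.StringReader;
import javax.xml.XMLConstants;
import javax.xml.transform.stream.StreamSource;
import javax.xml.validation.Schema;
import javax.xml.validation.SchemaFactory;
import org.xml.sax.SAXException;

class XsdPatternCheck {
    // Wraps the pattern in a one-element schema (hypothetical element "costcode")
    // and validates <costcode>value</costcode> against it.
    static boolean isValid(String pattern, String value) throws Exception {
        String xsd =
            "<xs:schema xmlns:xs=\"http://www.w3.org/2001/XMLSchema\">"
          + "<xs:element name=\"costcode\"><xs:simpleType>"
          + "<xs:restriction base=\"xs:string\">"
          + "<xs:pattern value=\"" + pattern + "\"/>"
          + "</xs:restriction></xs:simpleType></xs:element></xs:schema>";
        String xml = "<costcode>" + value + "</costcode>";

        SchemaFactory factory = SchemaFactory.newInstance(XMLConstants.W3C_XML_SCHEMA_NS_URI);
        // An invalid schema (e.g. a malformed pattern) throws here and propagates to the caller.
        Schema schema = factory.newSchema(new StreamSource(new StringReader(xsd)));
        try {
            schema.newValidator().validate(new StreamSource(new StringReader(xml)));
            return true;
        } catch (SAXException e) {
            return false; // the document did not satisfy the schema, e.g. the pattern facet
        }
    }

    public static void main(String[] args) throws Exception {
        // Hyphen placed last in the character class, as recommended above.
        System.out.println(isValid(".*[a-zA-Z0-9_-].*", "-"));
        // Doubled backslashes reach the XSD regex: \\-\\ becomes a backslash-to-backslash range.
        System.out.println(isValid(".*[a-zA-Z0-9\\\\-\\\\_].*", "-"));
    }
}
```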
Upvotes: 2