Reputation: 44842
Whenever I enter the following...
Pattern pmessage = Pattern.compile("\s*\p{Alnum}[\p{Alnum}\s]*");
Matcher mmessage = pmessage.matcher(message);
Matcher msubject = pmessage.matcher(subject);
I get an Invalid Escape Sequence
error. Does anyone have any idea why, and how I can fix this?
Upvotes: 0
Views: 2743
Reputation: 80384
For a version of \p{Alpha}
that works on the Java native character set instead of being stuck unable to process anything other than legacy data from the 1960s, you need to use
alphabetics = "[\\pL\\pM\\p{Nl}]";
For a version of numerics in the same sense, you have to choose which of these you want:
ASCII_digits = "[0-9]";
all_numbers = "\\pN";
decimal_numbers = "\\p{Nd}";
because which one applies varies depending on circumstances. We’ll assume you copied one of those three to a numerics
variable.
Assuming you then want alphanumerics based on the definition above, you could then write:
alphanumerics = "[" + alphabetics + numerics + "]";
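To make the composition above concrete, here is a minimal sketch. It assumes you picked \p{Nd} as the numerics choice, and the class and method names (UnicodeAlnumSketch, isAlphanumeric) are made up for illustration. It relies on the fact that a bracketed class nested inside another bracketed class acts as a union in java.util.regex, and it contrasts the result with \p{Alnum}, which is ASCII-only by default.

```java
import java.util.regex.Pattern;

// Hypothetical names; only the pattern strings come from the answer above.
class UnicodeAlnumSketch {
    // Letters, combining marks, and letter-like numbers.
    static final String ALPHABETICS = "[\\pL\\pM\\p{Nl}]";
    // Assumption: decimal digits of any script are the numerics you want.
    static final String NUMERICS = "\\p{Nd}";
    // A nested bracketed class is a union in java.util.regex.
    static final String ALPHANUMERICS = "[" + ALPHABETICS + NUMERICS + "]";

    static final Pattern ALNUM_ONLY = Pattern.compile(ALPHANUMERICS + "+");

    // True when the whole string consists of one or more alphanumerics.
    static boolean isAlphanumeric(String s) {
        return ALNUM_ONLY.matcher(s).matches();
    }

    public static void main(String[] args) {
        String word = "na\u00EFve42"; // "naive42" with i-diaeresis
        System.out.println(isAlphanumeric(word));                  // Unicode-aware class
        System.out.println(Pattern.matches("\\p{Alnum}+", word));  // ASCII-only class
        System.out.println(isAlphanumeric("a b"));                 // space is not alphanumeric
    }
}
```

If you would rather keep the POSIX-style names, compiling with the Pattern.UNICODE_CHARACTER_CLASS flag is another option, though its definitions are not necessarily identical to the hand-built class here.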
However, if what you mean by alphanumerics is the \w
sense of program identifiers, you have to add some stuff.
identifier_chars = "[\\pL\\pM\\p{Nd}\\p{Nl}\\p{Pc}[\\p{InEnclosedAlphanumerics}&&\\p{So}]]";
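As a rough illustration of how that identifier class differs from the built-in \w, here is a small sketch. The class and method names are hypothetical, and it only exercises a few simple inputs rather than the whole class.

```java
import java.util.regex.Pattern;

// Hypothetical names; the pattern string is the one from the answer above.
class IdentifierCharsSketch {
    static final String IDENTIFIER_CHARS =
        "[\\pL\\pM\\p{Nd}\\p{Nl}\\p{Pc}[\\p{InEnclosedAlphanumerics}&&\\p{So}]]";

    static final Pattern IDENTIFIER = Pattern.compile(IDENTIFIER_CHARS + "+");

    // True when every character of the string is an identifier character.
    static boolean isIdentifierLike(String s) {
        return IDENTIFIER.matcher(s).matches();
    }

    public static void main(String[] args) {
        String accented = "caf\u00E9_1"; // "cafe_1" with e-acute
        System.out.println(isIdentifierLike("snake_case1")); // underscore is \p{Pc}
        System.out.println(isIdentifierLike("kebab-case"));  // hyphen is not included
        System.out.println(isIdentifierLike(accented));      // non-ASCII letter accepted
        System.out.println(Pattern.matches("\\w+", accented)); // default \w is ASCII-only
    }
}
```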
This issue is discussed at length in this answer, where you’ll also find a link to some alpha code of mine that does these transforms for you automatically. I hope to get a chance to rewrite it to take up less space this weekend.
Upvotes: 2
Reputation: 893
You didn't correctly escape your "\" characters: in Java, "\\s" will give you \s, so you should write:
Pattern.compile("\\s*\\p{Alnum}[\\p{Alnum}\\s]*");
Upvotes: 1
Reputation: 38390
Keep in mind that backslashes are special characters in Java strings and need to be escaped with an additional backslash:
Pattern.compile("\\s*\\p{Alnum}[\\p{Alnum}\\s]*");
Upvotes: 1
Reputation: 500227
Double each backslash: Pattern.compile("\\s*\\p{Alnum}[\\p{Alnum}\\s]*")
Backslashes inside string literals have a special meaning, and have to be duplicated in order for the actual backslash character to become part of the string (which is what is required in your regex example.)
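A quick way to convince yourself of this is to look at what actually ends up in the string. The sketch below uses made-up names (EscapeSketch, isValid), and the sample inputs are only illustrative.

```java
import java.util.regex.Pattern;

// Hypothetical names; the pattern is the corrected one from this answer.
class EscapeSketch {
    // Each "\\" in the source becomes a single backslash in the string.
    static final Pattern MESSAGE =
        Pattern.compile("\\s*\\p{Alnum}[\\p{Alnum}\\s]*");

    // True when the text is optional whitespace, then at least one
    // alphanumeric, then any mix of alphanumerics and whitespace.
    static boolean isValid(String text) {
        return MESSAGE.matcher(text).matches();
    }

    public static void main(String[] args) {
        // Two characters in the string: a backslash and the letter s.
        System.out.println("\\s".length());
        // The regex engine sees single backslashes.
        System.out.println(MESSAGE.pattern());

        System.out.println(isValid("  hello world 42")); // leading spaces are fine
        System.out.println(isValid("   "));              // needs one alphanumeric
        System.out.println(isValid("hi!"));              // punctuation is rejected
    }
}
```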
Upvotes: 1