EoinM

Reputation: 185

Java Regex double backslash escaping special characters

I have a regular expression that works as expected in RegexPal but not in Java. I know this is due to the way the characters are being escaped, but I cannot see how to get around it.

My expression is: Pattern.compile("((([a-zA-Z0-9])([a-zA-Z0-9 ]*)\\?)+)"). What I'm trying to do here is decide whether or not something is a valid relative path to a directory (Windows only for now), so it should match things like "Hello", "Hello World", "Hello\World", "Hello World\foo\bar" etc. Instead, it only matches when the directory name contains a question mark, e.g. "Documents?". I think this is because the double backslash in my expression must precede the question mark quantifier, but once the string literal has been unescaped, what compile() sees is \?, which it treats as an escaped question mark.
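The behaviour described above can be reproduced with a small sketch. The class and method names here are hypothetical, and it assumes the check is done with Matcher.matches(), which the question does not state. It prints the regex text that compile() actually receives, then tests two of the inputs mentioned above.

```java
import java.util.regex.Pattern;

public class EscapeLayers {
    // The pattern from the question, unchanged. The Java compiler turns the
    // two backslashes in the literal into one, so the regex engine receives \?
    static final String ORIGINAL = "((([a-zA-Z0-9])([a-zA-Z0-9 ]*)\\?)+)";

    // Assumption: the whole input must match (Matcher.matches()).
    static boolean fullyMatches(String input) {
        return Pattern.compile(ORIGINAL).matcher(input).matches();
    }

    public static void main(String[] args) {
        // Shows the regex text after string-literal unescaping.
        System.out.println(ORIGINAL);
        // A literal question mark is required after each name.
        System.out.println(fullyMatches("Documents?"));
        // "\\" in this literal is one backslash character in the input.
        System.out.println(fullyMatches("Hello\\World"));
    }
}
```

With this pattern, "Documents?" is accepted and "Hello\World" is rejected, which is consistent with \? being read as a literal question mark rather than an optional backslash.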

Is there any way to ensure that the question mark is not escaped? I've tried putting parentheses around the double backslash, but that just escapes the closing parenthesis and causes an "Unclosed group" error.

Upvotes: 2

Views: 3211

Answers (1)

sp00m

Reputation: 48817

Use 4 backslashes:

Pattern.compile("((([a-zA-Z0-9])([a-zA-Z0-9 ]*)\\\\?)+)")
                                               ^^^^
  1. You need to match a backslash char: \.
  2. A backslash is a special char for regexps (used for predefined classes such as \d for example), which needs to be escaped by another backslash: \\.
  3. As Java uses string literals for regexps, and a backslash is also a special char in string literals (used for the line feed char \n for example), each of those two backslashes needs to be escaped by another backslash: \\\\.
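The three steps above can be checked with a minimal sketch. The class and method names are hypothetical, and it assumes the whole input is tested with Matcher.matches(), which the question does not specify.

```java
import java.util.regex.Pattern;

public class RelativePathRegex {
    // Java literal "\\\\?" -> regex text \\? -> an optional literal backslash.
    static final Pattern RELATIVE_PATH =
            Pattern.compile("((([a-zA-Z0-9])([a-zA-Z0-9 ]*)\\\\?)+)");

    // Assumption: the entire candidate string must match the pattern.
    static boolean isRelativePath(String candidate) {
        return RELATIVE_PATH.matcher(candidate).matches();
    }

    public static void main(String[] args) {
        // In these literals, "\\" is a single backslash character.
        String[] samples = {
            "Hello",
            "Hello World",
            "Hello\\World",
            "Hello World\\foo\\bar",
            "Documents?"
        };
        for (String sample : samples) {
            System.out.println(sample + " -> " + isRelativePath(sample));
        }
    }
}
```

Under that assumption, the four path examples from the question are accepted, and "Documents?" is rejected because the question mark is now a quantifier instead of a literal character.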

Upvotes: 9
