Zack

Reputation: 385

Regular expression that matches quotes that are not escaped in Java

I am trying to write a regex in Java that matches a string like: 'test " abcd \" ef" test'. I want to know whether the text between the quotes contains the characters abcdef in this order, with any other characters allowed between them. Since I'm only interested in the substring between the quotes, the characters in between can't be a quote, unless that quote is escaped. Is it possible to do this?

I managed to create this regex

("[^\"]*\"[^\"]*a[^\"]*b[^\"]*c[^\"]*d[^\"]*e[^\"]*f[^\"]*\"[^\"]*")

that works for any case except the ones with escaped quotes embedded in the string.

Upvotes: 0

Views: 158

Answers (1)

psmears

Reputation: 28000

You're almost there... add the case for the escaped quote, which can be matched with

\\\"

so each of your [^\"]* cases (except the first and last, I guess) should become

([^\"]|\\\")*

... but you also need to take care of backslashes (because, for example, in

"foo\\"

the final quote is a "real" (non-escaped) quote, even though there's a backslash before it.) So in fact you need the [^\"]* cases to become:

([^\"\\]|\\.)*

or in other words: match anything that's not \ or ", or a \ followed by any single character, which is skipped over as a pair.
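A minimal Java sketch of that building block, used here as a plain quoted-string matcher. The class and method names are hypothetical. It also assumes the patterns above are read as regex-level text, so in a Java string literal each regex backslash is written twice.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class QuotedStringDemo {
    // Regex-level pattern: "(?:[^"\\]|\\.)*"
    // Opening quote, then any run of "not a quote or backslash" or
    // "backslash plus one character", then the closing quote.
    static final Pattern QUOTED = Pattern.compile("\"(?:[^\"\\\\]|\\\\.)*\"");

    // Returns the first quoted substring (quotes included), or null if none.
    static String firstQuoted(String input) {
        Matcher m = QUOTED.matcher(input);
        return m.find() ? m.group() : null;
    }

    public static void main(String[] args) {
        // Input text: say "foo\\" now
        // The two backslashes are consumed as a pair, so the last quote closes the string.
        System.out.println(firstQuoted("say \"foo\\\\\" now"));       // prints "foo\\"

        // Input text: test " abcd \" ef" test
        // The escaped quote is consumed as a pair and does not end the match.
        System.out.println(firstQuoted("test \" abcd \\\" ef\" test")); // prints " abcd \" ef"
    }
}
```

Note that `.` does not match line terminators by default, so a backslash directly before a newline would need `Pattern.DOTALL`.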

NB: This means that, for example, in the string "xxx\abcdef" the "\a" will not be matched as an "a", but that's probably what you want (since "\a" typically denotes the ASCII "BEL" control character).
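To check the whole pattern, including the "\a" caveat, here is a rough sketch that places the escape-aware unit between the letters a to f. The names (`GAP`, `build`, `containsInQuotes`) are hypothetical. It uses `find()` on just the quoted part instead of the leading and trailing [^\"]* from the question, which is an assumption about how the pattern would be applied.

```java
import java.util.regex.Pattern;

class QuotedSubsequenceDemo {
    // Regex-level unit: (?:[^"\\]|\\.)*
    // Each regex backslash is doubled again for the Java string literal.
    static final String GAP = "(?:[^\"\\\\]|\\\\.)*";

    // Builds: " GAP l1 GAP l2 GAP ... ln GAP "
    static Pattern build(String letters) {
        StringBuilder sb = new StringBuilder("\"").append(GAP);
        for (char c : letters.toCharArray()) {
            // Pattern.quote keeps the letter literal even if it is a regex metacharacter.
            sb.append(Pattern.quote(String.valueOf(c))).append(GAP);
        }
        return Pattern.compile(sb.append('"').toString());
    }

    // True if some quoted substring contains the letters in order.
    static boolean containsInQuotes(String input, String letters) {
        return build(letters).matcher(input).find();
    }

    public static void main(String[] args) {
        // Input text: test " abcd \" ef" test
        System.out.println(containsInQuotes("test \" abcd \\\" ef\" test", "abcdef")); // true

        // Input text: "xxx\abcdef"  (the \a is consumed as a pair, so no "a" is left)
        System.out.println(containsInQuotes("\"xxx\\abcdef\"", "abcdef"));             // false

        // Input text: "xxxabcdef"
        System.out.println(containsInQuotes("\"xxxabcdef\"", "abcdef"));               // true
    }
}
```

Because a backslash can only ever be matched together with the character after it, the two alternatives in the unit never overlap, which keeps the matching unambiguous.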

Upvotes: 1
