Core Xii

Reputation: 6441

What's the regex to match anything except a double quote not preceded by a backslash?

In other words, I have a string like:

"anything, escaped double-quotes: \", yep" anything here NOT to be matched.

How do I match everything inside the quotes?

I'm thinking

^"((?<!\\)[^"]+)"

But my head spins: should that be a positive or a negative lookbehind? Does it work at all?

How do I match any characters except a double-quote NOT preceded by a backslash?

Upvotes: 28

Views: 42195

Answers (3)

tripleee

Reputation: 189377

Here's a variation which permits backslash + anything pairs generally, and then disallows backslashes which are not part of this construct.

^"([^\\"]+|\\.)*"

In many regex dialects, you only need a single backslash inside a character class; but what exactly works depends on the regex dialect and on the host language.
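To make the idea concrete, here is a minimal sketch in Python's `re` dialect, assuming the character class is written with an explicitly escaped backslash (`[^\\"]`). The names `QUOTED` and `quoted_body` are made up for illustration, and the inner `+` is dropped in this variant because the outer `*` already repeats the group.

```python
import re

# Opening quote, then any mix of ordinary characters and escape pairs, then a closing quote.
#   [^\\"]  any character that is neither a backslash nor a double quote
#   \\.     a backslash followed by any one character (an escape pair)
QUOTED = re.compile(r'^"((?:[^\\"]|\\.)*)"')


def quoted_body(text):
    """Return the text between the leading quotes, or None if there is no match."""
    m = QUOTED.match(text)
    return m.group(1) if m else None


sample = r'"anything, escaped double-quotes: \", yep" anything here NOT to be matched.'
print(quoted_body(sample))
```

Because a lone backslash can only be consumed as part of a pair, a quote preceded by two backslashes (`"foo\\"`) is treated as the closing quote in this sketch.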

Upvotes: 0

Konrad Rudolph

Reputation: 545588

No lookbehind necessary:

"(\\"|[^"])*"

That is: match an opening quote, then any number (*) of either an escaped quote (\\") or any character except a quote ([^"]), then the closing quote.
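A quick way to try this out, sketched in Python (the variable names are just for illustration, and a capturing group is added around the repetition to pull out the contents). It also shows that the order of the two alternatives matters in a backtracking engine: if `[^"]` is tried first, it swallows the backslash and the match stops at the escaped quote.

```python
import re

sample = r'"anything, escaped double-quotes: \", yep" anything here NOT to be matched.'

# Escaped quote first, then "any character except a quote".
escaped_first = re.compile(r'"((?:\\"|[^"])*)"')
# Same alternatives in the opposite order, for comparison only.
class_first = re.compile(r'"((?:[^"]|\\")*)"')

good = escaped_first.search(sample).group(1)
bad = class_first.search(sample).group(1)

print(good)  # contents up to the real closing quote
print(bad)   # stops early, at the backslash before the escaped quote
```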

Upvotes: 57

chaos

Reputation: 124297

"Not preceded by" translates directly to "negative lookbehind", so you'd want (?<!\\)".

Though here's a question that may ruin your day: what about the string "foo\\"? That is, a double quote preceded by two backslashes. In most escaping syntaxes the first backslash escapes the second, so the quote is not escaped and really does end the string; the lookbehind, however, only sees a backslash in front of it and skips it.

That sort of thing is kind of why regexes aren't a substitute for parsers.
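Here is a small Python sketch of that problem. Both function names are hypothetical, and the hand-written scanner is only one possible way to do it, assuming a backslash always escapes exactly the next character.

```python
import re

NOT_ESCAPED_QUOTE = re.compile(r'(?<!\\)"')


def closing_quote_lookbehind(text):
    """Index of the first quote after position 0 that has no backslash right before it."""
    m = NOT_ESCAPED_QUOTE.search(text, 1)
    return m.start() if m else None


def closing_quote_scan(text):
    """Index of the closing quote, found by walking the string and skipping escape pairs."""
    i = 1  # position 0 is assumed to hold the opening quote
    while i < len(text):
        if text[i] == '\\':
            i += 2  # skip the backslash and the character it escapes
        elif text[i] == '"':
            return i
        else:
            i += 1
    return None


text = r'"foo\\" bar "baz"'
print(closing_quote_lookbehind(text))  # skips the real closing quote
print(closing_quote_scan(text))        # stops at the quote after the two backslashes
```

On a string with a single escaped quote, such as `"a\"b" c`, the two approaches agree; they only diverge once backslashes themselves are escaped.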

Upvotes: 5
