Reputation: 6441
In other words, I have a string like:
"anything, escaped double-quotes: \", yep" anything here NOT to be matched.
How do I match everything inside the quotes?
I'm thinking
^"((?<!\\)[^"]+)"
But my head spins, should that be a positive or a negative lookbehind? Or does it work at all?
How do I match any characters except a double-quote NOT preceded by a backslash?
Upvotes: 28
Views: 42195
Reputation: 189377
Here's a variation which permits backslash + anything pairs generally, and then disallows backslashes which are not part of this construct.
^"([^\"]+|\\.)*"
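In many regex dialects (POSIX bracket expressions, for example), a backslash inside a character class is a literal, so a single one suffices there. In dialects where a backslash still escapes inside a class (PCRE or Python, for example), write [^\\"] instead. What exactly works depends on the regex dialect and on the host language.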
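As a rough illustration, here is how that pattern might look in Python's re module. This is only a sketch under assumptions: the choice of Python, the doubled backslash inside the class (which Python requires), the non-capturing inner group, and the helper name quoted_prefix are all mine, not part of the answer above.

```python
import re

# Assumption: Python's re dialect, where a literal backslash inside a
# character class has to be written as \\ (hence [^\\"] rather than [^\"]).
# The outer group captures the contents; the inner (?:...) group repeats
# either a run of ordinary characters or a backslash + any-character pair.
PATTERN = re.compile(r'^"((?:[^\\"]+|\\.)*)"')


def quoted_prefix(text):
    """Return the contents of a leading double-quoted string, or None."""
    m = PATTERN.match(text)
    return m.group(1) if m else None


sample = r'"anything, escaped double-quotes: \", yep" anything here NOT to be matched.'
print(quoted_prefix(sample))
```

Because every backslash is consumed together with the character after it, a stray backslash can never be matched on its own, which is what rules out a quote being mistaken for the closing one.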
Upvotes: 0
Reputation: 545588
No lookbehind necessary:
"(\\"|[^"])*"
So: match quotes, and inside them: either an escaped quote (\\") or any character except a quote ([^"]), arbitrarily many times (*).
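To try this out, a minimal sketch in Python (my choice of host language, not the answerer's) could look like the following. The names QUOTED, sample and inner are made up for the example, and I have added a capturing group around the repeated part so the contents can be read back.

```python
import re

# Same idea as the pattern above: an escaped quote (\\") or any
# non-quote character ([^"]), repeated, between two quotes.
# The outer parentheses are an addition so the contents can be extracted.
QUOTED = re.compile(r'"((?:\\"|[^"])*)"')

sample = r'"anything, escaped double-quotes: \", yep" anything here NOT to be matched.'

match = QUOTED.search(sample)
inner = match.group(1) if match else None
print(inner)
```

The escaped-quote alternative is listed first so that a backtracking engine tries it before the generic "any character except a quote" branch.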
Upvotes: 57
Reputation: 124297
"Not preceded by" translates directly to "negative lookbehind", so you'd want (?<!\\)".
Though here's a question that may ruin your day: what about the string "foo\\"? That is, a double-quote preceded by two backslashes. In most escaping syntaxes the first backslash escapes the second, so the quote is not escaped at all and really does end the string, yet the lookbehind still sees a backslash right before it.
That sort of thing is kind of why regexes aren't a substitute for parsers.
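The double-backslash problem can be seen in a small Python experiment. This is a sketch under assumptions: Python's re module as the dialect, an escaping convention in which \\ stands for one literal backslash, and the hypothetical names LOOKBEHIND, PAIRS, naive and paired. The second pattern is one common workaround (consuming escapes as pairs), not something this answer proposes.

```python
import re

# Closing quote defined as "a quote not preceded by a backslash".
LOOKBEHIND = re.compile(r'"(.*?)(?<!\\)"')

# Workaround sketch: consume each backslash together with the character
# that follows it, so an even run of backslashes never hides a quote.
PAIRS = re.compile(r'"((?:[^"\\]|\\.)*)"')

# The first quoted string ends right after two backslashes.
text = r'"foo\\" bar "baz"'

naive = LOOKBEHIND.search(text).group(1)
paired = PAIRS.search(text).group(1)

print(naive)   # runs past the real closing quote
print(paired)  # stops at the real closing quote
```

The lookbehind only inspects one character to the left, so it cannot tell an escaping backslash from an escaped one; counting backslashes properly is the point where a small hand-written scanner starts to look more attractive than a regex.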
Upvotes: 5