j-hap

Reputation: 333

RegEx: Grabbing values between not escaped quotation marks

This question is related to RegEx: Grabbing values between quotation marks

The RegEx from the best answer

(["'])(?:(?=(\\?))\2.)*?\1

tested with the Debuggex Demo, also matches strings that start with an escaped double quote. I tried to extend the definition with a negative lookbehind:

(["'](?<!\\))(?:(?=(\\?))\2.)*?\1

Debuggex Demo

but this does not change what is matched. Any suggestions on how to exclude escaped single or double quotes as the start of a match?

I want to use this as a highlighting pattern in nedit, which supports regex-lookbehind.

Example of the desired matching:

<p>
  <span style="color: #ff0000">"str1"</span> notstr
  <span style="color: #ff0000">"str2"</span>
  \"notstr <span style="color: #ff0000">"str4"</span>
</p>

Upvotes: 0

Views: 339

Answers (1)

Sergey Kalinichenko

Reputation: 727057

Using a negative lookbehind for a backslash that is not itself preceded by another backslash, i.e.

(?<!(?<!\\)\\)["']

solves the problem:

((?<!(?<!\\)\\)["'])(?:(?=(\\?))\2.)*?(?<!(?<!\\)\\)\1

Demo.
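To check the behavior outside a regex tester, here is a minimal Python sketch that runs three patterns on a sample line modeled on the question's example: the pattern from the linked answer, the attempt from the question, and the pattern above. The names ORIGINAL, ATTEMPT, FIXED, SAMPLE and the helper matches are made up for illustration, and Python's re module only stands in for nedit's regex engine, which may differ in details.

```python
import re

# Pattern from the linked answer: any quote character may open a match.
ORIGINAL = re.compile(r"""(["'])(?:(?=(\\?))\2.)*?\1""")

# Attempt from the question: the lookbehind sits AFTER the quote, so it
# inspects the quote itself, which is never a backslash. It always succeeds.
ATTEMPT = re.compile(r"""(["'](?<!\\))(?:(?=(\\?))\2.)*?\1""")

# Pattern from this answer: the lookbehind sits BEFORE the quote and rejects
# a quote preceded by a backslash that is not itself preceded by a backslash.
FIXED = re.compile(
    r"""((?<!(?<!\\)\\)["'])(?:(?=(\\?))\2.)*?(?<!(?<!\\)\\)\1"""
)


def matches(pattern, text):
    """Return the full text of every match (hypothetical helper)."""
    return [m.group(0) for m in pattern.finditer(text)]


# Sample line assumed from the question's example.
SAMPLE = r'"str1" notstr "str2" \"notstr "str4"'

print(matches(ORIGINAL, SAMPLE))  # ['"str1"', '"str2"', '"notstr "']
print(matches(ATTEMPT, SAMPLE))   # ['"str1"', '"str2"', '"notstr "']
print(matches(FIXED, SAMPLE))     # ['"str1"', '"str2"', '"str4"']
```

One caveat of this sketch: the lookbehind inspects only the two characters before the quote. A quote after two backslashes is correctly treated as unescaped, but a quote after three backslashes is treated as unescaped too, although an odd number of backslashes would normally escape it.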

You should be very careful about this approach, because generally regex is not a good tool for parsing inputs in markup syntax. You would be better off using a full-scale parser, and then optionally applying regex to parts that you get back from it.
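As a rough illustration of the non-regex route, the following Python sketch scans the text character by character and skips every backslash together with the character it escapes. It is a small hand-written scanner, not a full-scale parser for any markup language, and the function name find_quoted_strings is hypothetical.

```python
def find_quoted_strings(text):
    """Return (start, end) spans of quoted strings, honoring backslash escapes."""
    spans = []
    i = 0
    n = len(text)
    while i < n:
        ch = text[i]
        if ch == "\\":
            # Skip the backslash and the character it escapes.
            i += 2
            continue
        if ch in "\"'":
            quote = ch
            j = i + 1
            # Look for the matching closing quote; escaped characters
            # inside the string are skipped as a pair.
            while j < n and text[j] != quote:
                j += 2 if text[j] == "\\" else 1
            if j < n:
                spans.append((i, j + 1))
                i = j + 1
                continue
            # Unterminated string: treat the quote as ordinary text.
        i += 1
    return spans


# Sample line assumed from the question's example.
sample = r'"str1" notstr "str2" \"notstr "str4"'
print([sample[s:e] for s, e in find_quoted_strings(sample)])
# ['"str1"', '"str2"', '"str4"']
```

Because the scanner consumes backslashes in pairs with the character that follows, it handles any number of consecutive backslashes, which a fixed-depth lookbehind does not.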

Upvotes: 1
