Bung

Reputation: 259

Regex: Require that quotes are escaped in a string

Thanks for looking.

I've had a terrible time finding the right search terms for this regex question. I need to ensure that every quote in a string is already escaped; otherwise the match should fail. (Most search results for this kind of question are just pages saying that you need to escape quotes, or explaining how to escape them.)

Valid:

This is valid
This \"is Valid
This is al\"so Valid\"

Invalid:

This i"s invalid
This i"s inv"alid

The only thing I've managed to find so far is

((?:\\"|[^"])*)

This seems to match the first part of the following, but nothing after the escaped quote:

This is a \"test

Again, this should fail:

This is a \"test of " the emergency broadcast system

Thanks for any help, I hope this is even possible.

Upvotes: 6

Views: 387

Answers (5)

anubhava

Reputation: 785068

The regex you're looking for is:

/^(?:[^"]*(?:(?<=\\)"|))*$/

Explanation: [^"]* matches input until the first " is found or the end of input is reached. If a " is found, the lookbehind in (?<=\\)" makes sure that it is preceded by a backslash. This is repeated until the end of input is reached. (Inside a PHP single-quoted string, as in the test code that follows, the lookbehind is written (?<=\\\) because of PHP's own string escaping.)

TESTING: Consider the following PHP code:

$arr=array('This is valid',
'This \"is Valid',
'This is al\"so Valid\"',
'This i"s invalid',
'This i"s inv"alid',
'This is a \"test',
'This is a \"test of " the emergency broadcast system - invalid');
foreach ($arr as $a) {
   echo "$a => ";
   if (preg_match('/^(?:[^"]*(?:(?<=\\\)"|))*$/', $a, $m))
      echo "matched [$m[0]]\n";
   else
      echo "didn't match\n";
}

OUTPUT:

This is valid => matched [This is valid]
This \"is Valid => matched [This \"is Valid]
This is al\"so Valid\" => matched [This is al\"so Valid\"]
This i"s invalid => didn't match
This i"s inv"alid => didn't match
This is a \"test => matched [This is a \"test]
This is a \"test of " the emergency broadcast system - invalid => didn't match

Upvotes: 1

Benoit

Reputation: 79175

You need to match either any character other than a backslash or a quote, or a backslash together with the character that follows it.

([^\\"]|\\.)*

This way (with the pattern anchored to the whole string), this will fail:

ab\\"c

This will succeed:

ab\\\"c

This will succeed:

ab\"c
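
The three cases above can be checked with a minimal Java sketch. The class name EscapePairCheck and the method isValid are made up for illustration, and the sketch assumes the pattern is meant to cover the whole string, which matches() enforces.

```java
import java.util.regex.Pattern;

// Hypothetical helper, not part of the answer above.
final class EscapePairCheck {
    // Regex: (?:[^\\"]|\\.)*
    // Each step consumes either one ordinary character (not a backslash, not a quote)
    // or a backslash together with the character that follows it.
    private static final Pattern VALID = Pattern.compile("(?:[^\\\\\"]|\\\\.)*");

    // matches() only succeeds if the pattern covers the entire input,
    // so no explicit ^ and $ anchors are needed here.
    static boolean isValid(String s) {
        return VALID.matcher(s).matches();
    }
}
```

Calling EscapePairCheck.isValid on ab\\"c returns false because the two backslashes are consumed as one escape pair, which leaves the quote unescaped.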

Upvotes: 1

viggity

Reputation: 15227

You want to use a negative lookbehind.

(?<!\\)"

This regex matches every quote that is not immediately preceded by a backslash.

If you run this regex against your sample string and it finds 1 or more matches, then the string is not valid.
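
A minimal Java sketch of that check follows. The names UnescapedQuoteFinder and isValid are hypothetical, chosen only for this illustration.

```java
import java.util.regex.Pattern;

// Hypothetical helper, not part of the answer above.
final class UnescapedQuoteFinder {
    // Regex: (?<!\\)"
    // A quote whose immediately preceding character is not a backslash.
    private static final Pattern UNESCAPED_QUOTE = Pattern.compile("(?<!\\\\)\"");

    // The string is treated as valid when no such quote can be found anywhere in it.
    static boolean isValid(String s) {
        return !UNESCAPED_QUOTE.matcher(s).find();
    }
}
```

One thing to be aware of: the lookbehind only inspects a single character, so ab\\"c is reported as valid even though the backslash before the quote is itself escaped. Whether that matters depends on whether backslashes can be escaped in the input, which the question does not say.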

Upvotes: 2

adelphus

Reputation: 10326

In C#, this appears to work as you want:

string pattern = "^([^\"\\\\]*(\\\\.)?)*$";

Stripping out the escaping leaves you with:

^([^"\\]*(\\.)?)*$

which roughly translates into: start-of-string, (multi-chars-excluding-quote-or-backslash, optional-backslash-anychar)-repeated, end-of-string

It's the start-of-string and end-of-string markers that force the match to cover the complete text.
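
For comparison, here is a rough Java sketch of the same pattern; the string-literal escaping happens to be the same as in the C# line. The class name AnchoredEscapeCheck and the UNROLLED variant are assumptions added for illustration and are not part of this answer. UNROLLED should accept the same strings, but it gives the engine only one way to split a run of ordinary characters, which may help on long inputs that fail.

```java
import java.util.regex.Pattern;

// Hypothetical helper, not part of the answer above.
final class AnchoredEscapeCheck {
    // The answer's pattern: ^([^"\\]*(\\.)?)*$
    static final Pattern ORIGINAL = Pattern.compile("^([^\"\\\\]*(\\\\.)?)*$");

    // Assumed alternative: ^[^"\\]*(?:\\.[^"\\]*)*$
    // An ordinary run, then (escape pair + ordinary run) repeated.
    static final Pattern UNROLLED = Pattern.compile("^[^\"\\\\]*(?:\\\\.[^\"\\\\]*)*$");

    // True when the given pattern covers the whole input.
    static boolean isValid(Pattern p, String s) {
        return p.matcher(s).matches();
    }
}
```

Both patterns reject a string that ends in a lone backslash, since a backslash must always be followed by another character.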

Upvotes: 6

user739881

Reputation:

I don't know which language you use, but I would do it this way:

Write a regex that matches a quote not preceded by a backslash. It will fail on

This is a \"test

and succeed on

This is a \"test of " the emergency broadcast system

for example:

.*(?<!\\)".*

Then negate the result. Hope this helps.

My test in Java looks like this:

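    // Matches any string that contains a quote not preceded by a backslash.
    String pat = ".*(?<!\\\\)\".*";
    String s = "This is a \\\"test";
    System.out.println(!s.matches(pat)); // prints true: every quote is escaped
    s = "This is a \\\"test of \" the emergency broadcast system";
    System.out.println(!s.matches(pat)); // prints false: one quote is unescaped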

Upvotes: 2
