Reputation: 259
Thanks for looking.
I've had a terrible time trying to get the right search terms for this regex question. I need to ensure that quotes are already escaped in a string, otherwise the match should fail. (Most search results for this kind of question are just pages saying you need to escape quotes or how to escape quotes.)
Valid:
This is valid
This \"is Valid
This is al\"so Valid\"
Invalid:
This i"s invalid
This i"s inv"alid
The only thing I've managed to find so far is
((?:\\"|[^"])*)
This seems to match the first part of the following, but nothing after the escaped quote
This is a \"test
Again, this should fail:
This is a \"test of " the emergency broadcast system
Thanks for any help, I hope this is even possible.
Upvotes: 6
Views: 387
Reputation: 785068
The regex you're looking for is:
/^(?:[^"]*(?:(?<=\\\)"|))*$/
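Explanation: [^"]* matches input until the first " is found or the end of input is reached. If a " is found, the lookbehind in (?<=\\\)" ensures that it is immediately preceded by a backslash (\). In a single-quoted PHP string the three backslashes collapse to two, so the regex engine sees (?<=\\), a lookbehind for one literal backslash. The outer (?: ... )* repeats this step until the end of input is reached.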
TESTING: Consider following PHP code to test:
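// Sample inputs from the question; single quotes keep \" as a literal backslash plus quote
$arr = array(
    'This is valid',
    'This \"is Valid',
    'This is al\"so Valid\"',
    'This i"s invalid',
    'This i"s inv"alid',
    'This is a \"test',
    'This is a \"test of " the emergency broadcast system - invalid'
);
foreach ($arr as $a) {
    echo "$a => ";
    // The match succeeds only if every " is immediately preceded by a backslash
    if (preg_match('/^(?:[^"]*(?:(?<=\\\)"|))*$/', $a, $m)) {
        echo "matched [$m[0]]\n";
    } else {
        echo "didn't match\n";
    }
}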
OUTPUT:
This is valid => matched [This is valid]
This \"is Valid => matched [This \"is Valid]
This is al\"so Valid\" => matched [This is al\"so Valid\"]
This i"s invalid => didn't match
This i"s inv"alid => didn't match
This is a \"test => matched [This is a \"test]
This is a \"test of " the emergency broadcast system - invalid => didn't match
Upvotes: 1
Reputation: 79175
You need to match either any character other than a backslash or a quote, or a backslash together with the character that follows it.
([^\\"]|\\.)*
This way, this will fail:
ab\\"c
This will succeed:
ab\\\"c
This will succeed:
ab\"c
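The three cases above can be checked with a minimal Python sketch. Python is an assumption here, since the question does not name a language, and the helper name quotes_escaped is hypothetical. The bare pattern has no anchors, so the sketch uses re.fullmatch to force it to cover the whole string.

```python
import re

# Each step consumes either one character that is neither a backslash nor a
# quote, or a backslash together with whatever character follows it.
ESCAPED_ONLY = re.compile(r'(?:[^\\"]|\\.)*')

def quotes_escaped(text: str) -> bool:
    """Hypothetical helper: True when every double quote in text is escaped."""
    # fullmatch anchors the pattern at both ends, so a single stray quote
    # anywhere in the input rejects the whole string.
    return ESCAPED_ONLY.fullmatch(text) is not None

# Raw strings keep the backslashes exactly as typed.
for sample in (r'ab\\"c', r'ab\\\"c', r'ab\"c'):
    print(sample, '->', 'succeeds' if quotes_escaped(sample) else 'fails')
```

Because a backslash always consumes the next character, an escaped backslash followed by a bare quote is rejected, which a single-character lookbehind cannot detect.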
Upvotes: 1
Reputation: 15227
You want to use a negative lookbehind.
(?<!\\)"
This regex matches every quote that is not immediately preceded by a backslash.
If you run this regex against your sample string and it finds 1 or more matches, then the string is not valid.
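A minimal Python sketch of this check follows. Python's re module is an assumption (the question does not name a language), and has_unescaped_quote is a hypothetical helper name. Note that the lookbehind inspects only the one character before the quote, so a quote that follows an escaped backslash is still treated as escaped.

```python
import re

# A double quote whose immediately preceding character is not a backslash.
UNESCAPED_QUOTE = re.compile(r'(?<!\\)"')

def has_unescaped_quote(text: str) -> bool:
    """Hypothetical helper: True when at least one bare quote is found."""
    return UNESCAPED_QUOTE.search(text) is not None

samples = [
    r'This is a \"test',
    r'This is a \"test of " the emergency broadcast system',
]
for sample in samples:
    # One or more matches means the string is not valid.
    verdict = 'invalid' if has_unescaped_quote(sample) else 'valid'
    print(sample, '->', verdict)
```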
Upvotes: 2
Reputation: 10326
In C#, this appears to work as you want:
string pattern = "^([^\"\\\\]*(\\\\.)?)*$";
Stripping out the escaping leaves you with:
^([^"\\]*(\\.)?)*$
which roughly translates into: start-of-string, (multi-chars-excluding-quote-or-backslash, optional-backslash-anychar)-repeated, end-of-string
It's the start-of-string and end-of-string markers that force the match to cover the complete text.
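For readers without a C# environment, the unescaped pattern can also be tried in Python. This is only a sketch under that assumption, not a port of the C# code, and the helper name is_fully_escaped is hypothetical. The sample strings are the Valid and Invalid examples from the question.

```python
import re

# Same pattern as the unescaped form above: runs of characters that are
# neither quote nor backslash, each optionally followed by a backslash plus
# one character, repeated from start-of-string to end-of-string.
ANCHORED = re.compile(r'^([^"\\]*(\\.)?)*$')

def is_fully_escaped(text: str) -> bool:
    """Hypothetical helper: True when the anchored pattern matches text."""
    return ANCHORED.match(text) is not None

samples = [
    'This is valid',
    r'This \"is Valid',
    r'This is al\"so Valid\"',
    'This i"s invalid',
    'This i"s inv"alid',
]
for sample in samples:
    print(sample, '->', 'match' if is_fully_escaped(sample) else 'no match')
```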
Upvotes: 6
Reputation:
I don't know which language you're using, but I would do it this way:
write a regexp that matches a quote not preceded by a backslash. It will fail on
This is a \"test
and succeed on
This is a \"test of " the emergency broadcast system
For example:
.*(?<!\\)".*
Then negate the result: the string is valid when the pattern does not match. I hope this helps.
My test in Java looks like this:
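public class QuoteCheck {
    public static void main(String[] args) {
        // Matches any string containing a quote that is not preceded by a backslash
        String pat = ".*(?<!\\\\)\".*";
        String s = "This is a \\\"test";
        System.out.println(!s.matches(pat)); // true: the only quote is escaped
        s = "This is a \\\"test of \" the emergency broadcast system";
        System.out.println(!s.matches(pat)); // false: contains a bare quote
    }
}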
Upvotes: 2