Reputation: 6144
I want to write a regular expression that matches string literals according to the following specification. I've spent the last 10 hours trying various regular expressions, none of which seem to work. I've finally boiled it down to this one:
([^"]|(\\[.\n]))*\"
Basically, the requirements are as follows:
Some sample strings that I need to match correctly are as follows:
Could someone please help me formulate such a regex? In my opinion the regex I've provided should do the job, but it fails for no apparent reason.
Upvotes: 1
Views: 2273
Reputation: 655259
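Your regular expression is almost right. You just need to be aware that inside a character class the period . is just a literal . and not any character except newline. The first alternative also has to exclude the backslash, so that a backslash can only be consumed together with the character it escapes. So: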
([^"\\]|\\(.|\n))*\"
Or:
([^"\\]|\\[\s\S])*\"
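To see the difference in practice, here is a minimal sketch in Python's re module (my choice of engine; the question does not say which one is used). The sample inputs are made up for illustration and, like the patterns above, assume the opening quote has already been consumed.

```python
import re

# The question's original pattern: inside [...] the dot is a literal ".",
# and [^"] also accepts a backslash, so an escape is never treated as a unit.
ORIGINAL = re.compile(r'([^"]|(\\[.\n]))*\"')

# The two corrected patterns from this answer.
FIXED = re.compile(r'([^"\\]|\\(.|\n))*\"')
FIXED_ALT = re.compile(r'([^"\\]|\\[\s\S])*\"')


def matched_text(pattern, text):
    """Return the text matched at the start of `text`, or None if no match."""
    m = pattern.match(text)
    return m.group(0) if m else None


# Hypothetical sample inputs (not from the question); opening quote already consumed.
escaped_quote = r'a \" b" rest'          # contains an escaped quote
escaped_newline = 'line\\\nnext" rest'   # backslash followed by a real newline
escaped_backslash = r'a\\" rest'         # escaped backslash, then the closing quote

for sample in (escaped_quote, escaped_newline, escaped_backslash):
    print(repr(sample))
    print('  original:', repr(matched_text(ORIGINAL, sample)))
    print('  fixed:   ', repr(matched_text(FIXED, sample)))
    print('  fixed alt', repr(matched_text(FIXED_ALT, sample)))
```

On the escaped-quote sample the original pattern should stop at the escaped quote, while both corrected patterns continue to the real closing quote.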
Upvotes: 2
Reputation: 14089
I assumed that your string also starts with a " (shouldn't your examples start with one?)
A lookaround construct seems the most natural thing to use here:
".*?"(?<!\\")
Given the input
"test" test2 "test \a test" "test \"test" "test\""
this will match:
"test"
"test \a test"
"test \"test"
"test\""
The regex reads:
Match the character “"” literally «"»
Match any single character that is not a line break character «.*?»
   Between zero and unlimited times, as few times as possible, expanding as needed (lazy) «*?»
Match the character “"” literally «"»
Assert that it is impossible to match the regex below with the match ending at this position (negative lookbehind) «(?<!\\")»
   Match the character “\” literally «\\»
   Match the character “"” literally «"»
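The behaviour described above can be checked with a short sketch in Python's re module (an assumption on my part, since the thread does not name an engine). It also probes one case the lookbehind only approximates: an escaped backslash directly before the closing quote. The CONSUMING pattern is my own adaptation of the other answer's regex with a leading quote added, not something given in the thread, and the second input is a hypothetical test string.

```python
import re

# Pattern from this answer: lazy match, then reject a closing quote preceded by a backslash.
LOOKBEHIND = re.compile(r'".*?"(?<!\\")')

# Assumed adaptation of the other answer's pattern, with an opening quote and a
# non-capturing group so that findall() returns whole matches.
CONSUMING = re.compile(r'"(?:[^"\\]|\\[\s\S])*"')

# The input from this answer.
text = r'"test" test2 "test \a test" "test \"test" "test\""'
lookbehind_matches = LOOKBEHIND.findall(text)
for m in lookbehind_matches:
    print(m)

# Hypothetical edge case: the first literal ends with an escaped backslash.
tricky = r'"a\\" "b"'
lookbehind_tricky = LOOKBEHIND.findall(tricky)
consuming_tricky = CONSUMING.findall(tricky)
print(lookbehind_tricky)
print(consuming_tricky)
```

The lookbehind only inspects the single character before the quote, so it cannot tell an escaping backslash from an escaped one. If literals ending in \\ can occur in your input, a pattern that consumes each escape as a pair is probably the safer choice.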
Upvotes: 0