Reputation: 155
I was using this pattern to match single-quoted strings in a parser:
"'.+?'"
But I need a regex that can find single-quoted strings with Postgres-style escaping of single quotes (doubling the single quote). I need to match something like this:
"'first', 'sec''ond', 't''hi''rd'"
I want to find the shortest matches for strings that start and end with a lone (undoubled) single quote, so the string above would yield 3 substrings:
'first'
'sec''ond'
't''hi''rd'
Upvotes: 2
Views: 3751
Reputation: 626748
Certainly,
'(?:[^']|'')*'
is a working regex for this: it matches a ' followed by zero or more occurrences of either a character other than ' or a doubled '', followed by a trailing '.
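To make the difference concrete, here is a minimal Python sketch (Python and the variable names are my own choice, not part of the question) that runs the original lazy pattern and the alternation pattern over the sample string:

```python
import re

text = "'first', 'sec''ond', 't''hi''rd'"

# Original lazy pattern: stops at the first quote it can, so every
# doubled quote splits a string in two.
naive = re.findall(r"'.+?'", text)

# Alternation pattern: inside the quotes, each step consumes either one
# non-quote character or one doubled quote.
escaped = re.findall(r"'(?:[^']|'')*'", text)

print(naive)
print(escaped)
```

The lazy pattern returns six fragments, while the alternation pattern returns the three expected literals.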
However, to make it more efficient, you can rewrite it using the unroll-the-loop technique:
'[^']*(?:''[^']*)*'
See the regex demo and pay attention to how many steps each regex takes to find all matches.
The regex can be read as:
' - match a '
[^']* - then zero or more characters other than '
(?:''[^']*)* - then zero or more sequences of '' followed by zero or more characters other than '
' - and then match the trailing '.
This regex has a linear pattern involving as little backtracking as possible.
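As a rough illustration of how the unrolled pattern might be used in a parser, here is a Python sketch. The helper name unquote and the constant names are hypothetical, and the unescaping step is my assumption about what a parser would do next; it is not part of the original answer.

```python
import re

# Unrolled form: non-quote run, then (doubled quote + non-quote run) repeated.
UNROLLED = re.compile(r"'[^']*(?:''[^']*)*'")
# Alternation form, kept only to check that both find the same matches.
ALTERNATION = re.compile(r"'(?:[^']|'')*'")


def unquote(literal: str) -> str:
    """Strip the outer quotes and collapse each doubled quote to one."""
    return literal[1:-1].replace("''", "'")


text = "'first', 'sec''ond', 't''hi''rd'"
literals = UNROLLED.findall(text)
values = [unquote(lit) for lit in literals]
same_as_alternation = ALTERNATION.findall(text) == literals

print(literals)
print(values)
```

On the sample input both patterns find the same three literals; the unrolled one simply gets there with fewer alternation attempts per character.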
Just a note: you can still make your regex work for the current scenario if you add a lookahead that checks for a , or the end of the string after the trailing ':
'.+?'(?=,|$)
     ^^^^^^^
See the regex demo. However, it is context dependent and less efficient than the unrolled regex.
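The context dependence can be seen in a short Python sketch. The second input, f('it''s'), is a made-up example of a literal that is followed by something other than a comma or the end of the string:

```python
import re

LOOKAHEAD = re.compile(r"'.+?'(?=,|$)")
UNROLLED = re.compile(r"'[^']*(?:''[^']*)*'")

# Comma-separated list: the lookahead version works here.
listed = "'first', 'sec''ond', 't''hi''rd'"
lookahead_on_list = LOOKAHEAD.findall(listed)

# Hypothetical input where the closing quote is followed by ")":
# the lookahead never succeeds, so nothing is matched.
call = "f('it''s')"
lookahead_on_call = LOOKAHEAD.findall(call)
unrolled_on_call = UNROLLED.findall(call)

print(lookahead_on_list)
print(lookahead_on_call)
print(unrolled_on_call)
```

The lookahead pattern only recognises a closing quote by what follows it, so it is safe only when every literal is followed by a comma or the end of the input.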
Upvotes: 5
Reputation: 225
For the pattern you supplied this should work:
'[\w']+'
That is, match a single quote, followed by one or more word characters (\w: letters, digits, or underscore) or single quotes, followed by a final single quote.
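A quick Python check of this pattern, as a sketch: it handles the sample from the question, but because the character class only allows word characters and quotes, a literal containing a space (my own test input, not from the question) is not matched.

```python
import re

WORD_CLASS = re.compile(r"'[\w']+'")

# The sample from the question contains only word characters inside quotes.
sample = WORD_CLASS.findall("'first', 'sec''ond', 't''hi''rd'")

# A space is neither \w nor a quote, so this literal is not matched at all.
with_space = WORD_CLASS.findall("'hello world'")

print(sample)
print(with_space)
```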
Upvotes: 1
Reputation: 16065
'(?:[^']|'{2})+'
a single quote, followed by greedy occurrences of:
- a character other than a single quote, or
- two consecutive single quotes,
followed by a single quote.
demo: https://regex101.com/r/zP2eK6/1
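One difference from the * variants is worth noting: the + quantifier requires at least one item between the quotes, so an empty literal is not matched. A small Python sketch, using an empty-string input of my own choosing:

```python
import re

PLUS = re.compile(r"'(?:[^']|'{2})+'")
STAR = re.compile(r"'(?:[^']|'{2})*'")

# Same result on the sample from the question.
plus_sample = PLUS.findall("'first', 'sec''ond', 't''hi''rd'")

# An empty literal is just two quotes; only the * variant accepts it.
plus_empty = PLUS.findall("''")
star_empty = STAR.findall("''")

print(plus_sample)
print(plus_empty)
print(star_empty)
```

If empty strings can occur in the input, the * quantifier is probably the safer choice.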
Upvotes: 2