jurkij

Reputation: 155

Python regex to match quoted string with escaped single quotes

I was using this pattern to match single-quoted strings in a parser:

"'.+?'"

But I need a regex that can find single-quoted strings with Postgres-like escaping of single quotes (doubling the single quote). I need to match something like this:

"'first', 'sec''ond', 't''hi''rd'"

I want to find the shortest matches for strings that start and end with a lone single quote, so the string above should yield 3 substrings:

'first'
'sec''ond'
't''hi''rd'
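
To make the problem concrete, here is a minimal Python sketch of what the original pattern does with the sample string. The variable names are illustrative and not part of the question.

```python
import re

sample = "'first', 'sec''ond', 't''hi''rd'"

# The lazy pattern stops at the first closing quote it can reach,
# so every doubled quote ends one match and starts the next.
naive_matches = re.findall(r"'.+?'", sample)
print(naive_matches)
# ["'first'", "'sec'", "'ond'", "'t'", "'hi'", "'rd'"]
```

That is six fragments instead of the three literals wanted.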

Upvotes: 2

Views: 3751

Answers (3)

Wiktor Stribiżew

Reputation: 626748

'(?:[^']|'')*' is a working regex for this: it matches a ', followed by zero or more occurrences of either a character other than ' or a doubled '', followed by a trailing '.

However, to make it more efficient, you can rewrite it using the unroll-the-loop technique:

'[^']*(?:''[^']*)*'

See the regex demo, and note how many steps each regex takes to find all matches.

The regex can be read as

  • ' - match a '
  • [^']* - then zero or more characters other than '
  • (?:''[^']*)* - then zero or more sequences of '' followed by zero or more characters other than '
  • ' - and then match the trailing '.

The unrolled regex moves through the input linearly, with no alternation inside the loop, so it involves as little backtracking as possible.
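
As a rough sketch, both patterns can be checked in Python with re.findall. The variable names are illustrative, and the final unescaping step is an assumption about what a parser would do next, not something the question asks for.

```python
import re

sample = "'first', 'sec''ond', 't''hi''rd'"

# Alternation form and its unrolled equivalent, as given above.
alternation = re.compile(r"'(?:[^']|'')*'")
unrolled = re.compile(r"'[^']*(?:''[^']*)*'")

literals = unrolled.findall(sample)
print(literals)
# ["'first'", "'sec''ond'", "'t''hi''rd'"]

# Both forms find the same literals in this sample.
assert alternation.findall(sample) == literals

# Assumed follow-up step: drop the outer quotes and un-double the inner ones.
values = [s[1:-1].replace("''", "'") for s in literals]
print(values)
# ['first', "sec'ond", "t'hi'rd"]
```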

Just a note: you can still make your regex work for the current scenario if you add a lookahead that checks for a , or the end of the string after the trailing ':

'.+?'(?=,|$)
     ^^^^^^^

See the regex demo. However, it is context dependent and less efficient than the unrolled regex.
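
A small sketch of that context dependence, using an input invented for illustration (a literal whose value contains a quote directly followed by a comma):

```python
import re

lookahead = re.compile(r"'.+?'(?=,|$)")
unrolled = re.compile(r"'[^']*(?:''[^']*)*'")

sample = "'first', 'sec''ond', 't''hi''rd'"
sample_matches = lookahead.findall(sample)
print(sample_matches)
# ["'first'", "'sec''ond'", "'t''hi''rd'"]

# Hypothetical input: a single literal whose value is  a',b
tricky = "'a'',b'"
lookahead_tricky = lookahead.findall(tricky)
unrolled_tricky = unrolled.findall(tricky)
print(lookahead_tricky)  # ["'a''"]  (cut short at the quote before the comma)
print(unrolled_tricky)   # ["'a'',b'"]
```

The lookahead version only works while a quote followed by a comma can never occur inside a value.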

Upvotes: 5

snoopen

Reputation: 225

For the pattern you supplied this should work:

'[\w']+'

That is, match a single quote, followed by one or more word characters (\w) or single quotes, followed by a final single quote.
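
A quick Python check of this pattern, as a sketch only. The second input is invented for illustration and shows the limit of \w, which does not match spaces or punctuation.

```python
import re

word_pattern = re.compile(r"'[\w']+'")

sample = "'first', 'sec''ond', 't''hi''rd'"
sample_matches = word_pattern.findall(sample)
print(sample_matches)
# ["'first'", "'sec''ond'", "'t''hi''rd'"]

# Hypothetical input with a space inside the quotes: no match at all.
spaced_matches = word_pattern.findall("'hello world'")
print(spaced_matches)
# []
```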

Upvotes: 1

Keith Hall

Reputation: 16065

'(?:[^']|'{2})+'

a single quote, followed by one or more greedy occurrences of:

  • either a character that is not a single quote
  • or two single quotes together

followed by a single quote.

demo: https://regex101.com/r/zP2eK6/1
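
A minimal Python sketch of this pattern on the question's sample. It also shows one difference from the * form given in the first answer: because of the +, an empty literal '' is not matched. The variable names are illustrative.

```python
import re

plus_form = re.compile(r"'(?:[^']|'{2})+'")
star_form = re.compile(r"'(?:[^']|'')*'")

sample = "'first', 'sec''ond', 't''hi''rd'"
plus_matches = plus_form.findall(sample)
print(plus_matches)
# ["'first'", "'sec''ond'", "'t''hi''rd'"]

# An empty quoted string needs zero repetitions of the group.
empty_with_plus = plus_form.fullmatch("''")
empty_with_star = star_form.fullmatch("''")
print(empty_with_plus)              # None
print(empty_with_star is not None)  # True
```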

Upvotes: 2
