Ryan Saxe

Reputation: 17869

Matching all Full Quotes with Regex

Matching quoted strings when you don't know whether the quotes will be single or double is fairly easy:

>>> import re
>>> s = """this is a "test" that I am "testing" today"""
>>> re.findall('[\'"].*?[\'"]', s)
['"test"', '"testing"']

That searches a string for either single or double quotes and returns what is in between. But here is the issue:

It closes a string at the first quote of either type, so a string that contains the other type of quote gets cut short! Here are two examples to illustrate what I mean:

>>> s = """this is a "test" and this "won't work right" at all"""
>>> re.findall('[\'"].*?[\'"]', s)
['"test"', '"won\'']
>>> s = """something is "test" and this is "an 'inner' string" too"""
>>> re.findall('[\'"].*?[\'"]', s)
['"test"', '"an \'', '\' string"']

The regular expression '[\'"].*?[\'"]' will pair an opening single quote with a closing double quote (and vice versa), which is clearly wrong.

So what regular expression will match both types of quotes, but only match a string when it ends with the same kind of quote it started with?

In case that is unclear, here are my desired outputs:

s = """this is a "test" and this "won't work right" at all"""
re.findall(expression, s)
# should return ['"test"', '"won\'t work right"']

s = """something is "test" and this is "an 'inner' string" too"""
re.findall(expression, s)
# should return ['"test"', '"an \'inner\' string"', "'inner'"]

Upvotes: 3

Views: 157

Answers (1)

Blender

Reputation: 298582

Wrap your first character class in a capturing group and then refer to it on the other side with the backreference \1, so the closing quote must be the same character as the opening one:

>>> re.findall(r'([\'"])(.*?)\1',s)
[('"', 'test'), ('"', "won't work right")]
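Because the pattern has two groups, re.findall returns (quote, contents) tuples rather than the full quoted strings the question asked for. A minimal sketch of one way to get the whole match instead, using re.finditer and group(0); the names QUOTED and quoted_strings are made up for illustration and are not part of the original answer:

```python
import re

# Group 1 captures the opening quote; \1 requires the same character to close.
# QUOTED and quoted_strings are hypothetical names used only for this sketch.
QUOTED = re.compile(r'''(['"])(.*?)\1''')

def quoted_strings(text):
    """Return each quoted substring, including its surrounding quotes."""
    # group(0) is the entire match, so the quote characters are kept.
    return [m.group(0) for m in QUOTED.finditer(text)]

s1 = """this is a "test" and this "won't work right" at all"""
s2 = """something is "test" and this is "an 'inner' string" too"""

print(quoted_strings(s1))
print(quoted_strings(s2))
```

Note that matches do not overlap, so the nested 'inner' in the second string is returned only as part of the outer double-quoted string, not as a separate item.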

Upvotes: 4
