regex to match between certain characters

Question

I have strings like this...

"1. yada yada yada (This is a string; "This is a thing")
 2. blah blah blah (This is also a string)"

I want to return...

['This is a string', 'This is also a string']

So it should match everything between '(' and ';', or between '(' and ')' when there is no ';'.

This is what I have so far in Python. It matches the sections I want, but I can't figure out how to trim the matches down to just the text I want from inside them:

pattern = re.compile(r'\([a-zA-Z ;"]+\)|\([a-zA-Z ]+\)')
re.findall(pattern, text)  # text is the string shown above

it returns this...

['(This is a string; "This is a thing")', '(This is also a string)']

EDIT ADDED FOR MORE INFO:

I realized there are more parentheses above the numbered text sections that I want to omit:

"some text and stuff (some more info)
 1. yada yada yada (This is a string; "This is a thing")
 2. blah blah blah (This is also a string)"

I don't want to match "(some more info)" but I am not sure how to only include the text after the numbers (ex. 1. lskdfjlsdjfds(string I want))

Wiktor Stribiżew · Accepted Answer

You can use

\(([^);]+)

The regex demo is available here.

Note the capturing group I set with the help of unescaped parentheses: the value captured with this subpattern is returned by the re.findall method, not the whole match.

It matches

\( - a literal (
([^);]+) - matches and captures 1 or more characters other than ) or ;
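To make the effect of the capturing group concrete, here is a small sketch comparing the same pattern with and without the inner parentheses. The variable names (sample, whole, inner) are only illustrative and the sample string is shortened from the question.

```python
import re

# Shortened sample based on the question's first line (illustrative only).
sample = '1. yada (This is a string; "This is a thing")'

# No capturing group: findall returns each whole match, including the "(".
whole = re.findall(r'\([^);]+', sample)

# One capturing group: findall returns only the captured text.
inner = re.findall(r'\(([^);]+)', sample)

print(whole)  # ['(This is a string']
print(inner)  # ['This is a string']
```

The match stops at the first ; or ) because both characters are excluded by the negated character class.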

Python demo:

import re
p = re.compile(r'\(([^);]+)')
test_str = '1. yada yada yada (This is a string; "This is a thing")\n2. blah blah blah (This is also a string)'
print(p.findall(test_str)) # => ['This is a string', 'This is also a string']
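
The pattern above would also pick up "(some more info)" from the question's edit. One possible way to skip it, offered as a sketch rather than as part of the answer above, is to require that the line start with a number and a dot before the parenthesis. It assumes each numbered item sits on its own line and that the wanted parenthesis is the first one on that line; the names numbered and text are illustrative.

```python
import re

# Input from the question's edit; the first line has parentheses to ignore.
text = (
    'some text and stuff (some more info)\n'
    ' 1. yada yada yada (This is a string; "This is a thing")\n'
    ' 2. blah blah blah (This is also a string)'
)

# ^[ \t]*   optional indentation at the start of a line (re.MULTILINE)
# \d+\.     the item number followed by a dot
# [^(\n]*   the rest of the line up to the first "("
# \(([^);]+)  capture everything up to the first ")" or ";"
numbered = re.compile(r'^[ \t]*\d+\.[^(\n]*\(([^);]+)', re.MULTILINE)

results = numbered.findall(text)
print(results)  # ['This is a string', 'This is also a string']
```

With re.MULTILINE, ^ matches at the start of every line, so the unnumbered first line is never a candidate.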
