Reputation: 309
I know this question may look like a duplicate, but I've searched and tried adapting the other answers to my example, and for some reason I can't get it working.
I have the text:
['root(ROOT-0, love-2) s1', 'amod(perve-5, good-4) s2',
'advmod(love-2, thanks-12) s3', 'amod(mags-16, glossy-15) s4']
I only want the text between amod( and the first - that follows it. For example, I want:
'perve' and 'mags'
I've tried:
words = re.findall('\((.*?)\-', v)
but it returns:
['ROOT', 'perve', 'love', 'mags']
Any suggestions would be greatly appreciated.
Upvotes: 0
Views: 72
Reputation: 434
When I want to find an arbitrary substring between two known substrings, I usually rely on a combination of a lookahead and lookbehind assertion.
import re
for string in List:
    match = re.search(r'(?<=amod\()[^-]+(?=-)', string)
    if match:  # re.search returns None for entries without amod(
        print(match.group())
Note that you have to use [^-] (everything except a minus sign) because of the lookahead assertion (?=-).
You can't use a greedy .+ and expect the regex to stop at the first place the lookahead matches: . also matches -, so the greedy .+ runs on to the last - in the string.
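To make the difference concrete, here is a small sketch that runs three variants of the pattern against one of the strings from the question. The variable names (s, greedy, negated, lazy) are only for illustration, and the lazy .+? variant is not part of the answer above; it is shown as another way to stop at the first hyphen.

```python
import re

s = 'amod(perve-5, good-4) s2'

# Greedy .+ consumes the rest of the string, then backtracks only
# as far as the LAST position that is followed by a hyphen.
greedy = re.search(r'(?<=amod\().+(?=-)', s).group()

# [^-]+ cannot cross a hyphen, so it stops before the first one.
negated = re.search(r'(?<=amod\()[^-]+(?=-)', s).group()

# A lazy .+? grows one character at a time and also stops at the first hyphen.
lazy = re.search(r'(?<=amod\().+?(?=-)', s).group()

print(greedy)   # perve-5, good
print(negated)  # perve
print(lazy)     # perve
```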
Hope this is what you wanted.
Upvotes: 0
Reputation: 785128
You may use:
>>> import re
>>> test_str = (" ['root(ROOT-0, love-2) s1', 'amod(perve-5, good-4) s2',\n"
... " 'advmod(love-2, thanks-12) s3', 'amod(mags-16, glossy-15) s4']")
>>>
>>> print ( re.findall(r"amod\(([^-]*)-", test_str) )
['perve', 'mags']
RegEx Details:
amod\(
: Match the literal text amod(
([^-]*)
: Match 0 or more characters that are not - and capture them in group #1
-
: Match a literal -
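If the data is an actual Python list rather than one string, the same pattern can be applied per element. This is a minimal sketch that assumes each entry begins with the relation name, as in the question; the names items, pattern and words are made up for illustration.

```python
import re

# The list from the question.
items = ['root(ROOT-0, love-2) s1', 'amod(perve-5, good-4) s2',
         'advmod(love-2, thanks-12) s3', 'amod(mags-16, glossy-15) s4']

# Compile once and reuse. pattern.match only matches at the start of
# each string, so entries such as 'advmod(...' or 'root(...' yield None.
pattern = re.compile(r'amod\(([^-]*)-')

# Keep group 1 (the text between "amod(" and the first hyphen)
# for every entry that matched.
words = [m.group(1) for m in map(pattern.match, items) if m]

print(words)  # ['perve', 'mags']
```

Using match instead of search is a design choice here: it avoids picking up amod( in the middle of a longer token, at the cost of requiring the relation name to come first.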
Upvotes: 2