Reputation: 6538
I want to match part of a string (a particular word) and print only that part, exactly what grep -o does.
My word is "yellow dog", for example, and it can be found in a string that spans multiple lines:
[34343] | ****. "Example": <one>, yellow dog
tstring0 123
tstring1 456
tstring2 789
Suppose I compile this regex:
mydog = re.compile(', .*\n')
and then test the string with
if mydog.search(string):
At that point I want to print only the matched words.
How do I get only "yellow dog" in the output?
Upvotes: 25
Views: 108450
Reputation: 85775
Using a capture group and findall:
>>> import re
>>> s = """[34343] | ****. "Example": <one>, yellow dog
... tstring0 123
... tstring1 456
... tstring2 789"""
>>> mydog = re.compile(', (.*)\n')
>>> mydog.findall(s)
['yellow dog']
If you only want the first match then:
>>> mydog.findall(s)[0]
'yellow dog'
Note: you'd want to handle the IndexError raised when s doesn't contain a match.
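As an alternative to catching the IndexError, here is a minimal sketch that uses search and checks for None. The helper name first_capture and its default parameter are made up for illustration and are not part of the re module.

```python
import re

def first_capture(pattern, text, default=None):
    """Return group 1 of the first match, or `default` when nothing matches.

    Hypothetical helper for illustration; assumes `pattern` is a compiled
    regex with at least one capture group.
    """
    match = pattern.search(text)  # search() returns None when there is no match
    return match.group(1) if match else default

mydog = re.compile(', (.*)\n')
s = '[34343] | ****. "Example": <one>, yellow dog\ntstring0 123\n'

print(first_capture(mydog, s))                 # yellow dog
print(first_capture(mydog, "no comma here"))   # None
```

Because search stops at the first match, this also avoids building the full list that findall returns.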
Upvotes: 25
Reputation: 387557
If you don’t specify a capture group, the text matched by the whole expression is available as matchResult.group(0). In your case, this would be ', yellow dog\n'. If you just want the yellow dog, you should add a capture group to the expression: , (.*?)\n. Note that I also changed the .* into a .*? so that it is non-greedy and stops at the first line break. (By default . does not match a newline, so .* would stop there as well; the non-greedy form matters when the pattern is compiled with re.DOTALL.)
>>> import re
>>> s = '''[34343] | ****. "Example": <one>, yellow dog
... tstring0 123
... tstring1 456
... tstring2 789'''
>>> mydog = re.compile(', (.*?)\n')
>>> matchResult = mydog.search(s)
>>> if matchResult:
...     print(matchResult.group(1))
...
yellow dog
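To make the difference between group(0) and group(1) concrete, and to show when greedy versus non-greedy matters, here is a small sketch. The sample string is a shortened copy of the one in the question, and the variable names are chosen for illustration only.

```python
import re

s = '[34343] | ****. "Example": <one>, yellow dog\ntstring0 123\ntstring1 456\n'

m = re.search(', (.*?)\n', s)
whole = m.group(0)      # everything the pattern matched, including ', ' and '\n'
captured = m.group(1)   # only the text inside the parentheses
print(repr(whole))      # ', yellow dog\n'
print(repr(captured))   # 'yellow dog'

# With re.DOTALL, '.' also matches newlines, so the greedy form runs on to
# the last line break while the non-greedy form still stops at the first.
greedy = re.search(', (.*)\n', s, re.DOTALL).group(1)
lazy = re.search(', (.*?)\n', s, re.DOTALL).group(1)
print(repr(greedy))     # 'yellow dog\ntstring0 123\ntstring1 456'
print(repr(lazy))       # 'yellow dog'
```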
Upvotes: 11