Reputation: 1585
This should be quite straightforward but I can't quite twig it. I want to get the name from this html string:
soup = </ul>
Brian
<p class="f">
I've tried:
namePattern = re.compile(r'(?<=</ul>)(.*?)(?<=<p)')
rev.reviewerName = re.findall(namePattern, str(soup))
and
namePattern = re.compile(r'</ul>(.*?)<p')
Can you tell me how to do it? Thanks.
Upvotes: 1
Views: 1310
Reputation: 500853
By default, the . metacharacter doesn't match newlines. You need to specify re.DOTALL as the second argument to re.compile().
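As a minimal sketch of the difference, assuming the input is the three-line string from the question (the variable names html, without_dotall, and with_dotall are made up for illustration):

```python
import re

# Assumed sample input, reconstructed from the snippet in the question.
html = '</ul>\nBrian\n<p class="f">'

# Without re.DOTALL, "." cannot cross the newline after </ul>, so nothing matches.
without_dotall = re.findall(r'</ul>(.*?)<p', html)
print(without_dotall)  # []

# With re.DOTALL, "." also matches "\n", so the pattern spans the line breaks.
name_pattern = re.compile(r'</ul>(.*?)<p', re.DOTALL)
with_dotall = name_pattern.findall(html)
print(with_dotall)  # ['\nBrian\n']
```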
Note that this will include the newlines as part of your capture group. If you don't want that, you can explicitly match them with \s*:
In [5]: re.findall(r'</ul>\s*(.*?)\s*<p', s)
Out[5]: ['Brian']
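The session above assumes s already holds the HTML string. A self-contained version might look like the following sketch; extract_name and NAME_PATTERN are hypothetical names, not part of the original code. It uses search() rather than findall() so the result is a single string instead of a list, which may suit an attribute like reviewerName better.

```python
import re

# \s* on either side consumes the surrounding whitespace (including newlines),
# so the capture group holds only the name and re.DOTALL is not needed here.
NAME_PATTERN = re.compile(r'</ul>\s*(.*?)\s*<p')


def extract_name(html):
    """Return the text between </ul> and <p, or None if there is no match.

    Hypothetical helper for illustration only.
    """
    match = NAME_PATTERN.search(html)
    return match.group(1) if match else None


# Assumed sample input, reconstructed from the snippet in the question.
sample = '</ul>\nBrian\n<p class="f">'
print(extract_name(sample))  # Brian
```

If the name itself could span several lines, re.DOTALL would still be needed on this pattern.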
Upvotes: 3