Python regular expression

Question

str1 = 'abdk3<h1>The content we need</h1>aaaaabbb<h2>The content we need2</h2>'

We need the contents inside the h1 tag and h2 tag.

What is the best way to do that?

Thanks for the help!

Chris Morgan · Accepted Answer

If this needs to scale at all, the best way is to use an HTML parser such as BeautifulSoup.

>>> from BeautifulSoup import BeautifulSoup
>>> soup = BeautifulSoup('abdk3<h1>The content we need</h1>aaaaabbb<h2>The content we need2</h2>')
>>> soup.h1
<h1>The content we need</h1>
>>> soup.h1.text
u'The content we need'
>>> soup.h2
<h2>The content we need2</h2>
>>> soup.h2.text
u'The content we need2'

It could also be done with a regular expression, but a parser is probably closer to what you want. A larger example of your input would help; without knowing exactly what you need to parse, it's hard to give a more specific answer.
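For comparison, here is a minimal sketch of the regular-expression route using only the standard `re` module. It assumes the input looks like the sample in the question, with plain `<h1>` and `<h2>` tags that have no attributes and are not nested. The names `HEADING_RE` and `extract_headings` are made up for this illustration.

```python
import re

# Sample input, assumed to match the string from the question.
str1 = 'abdk3<h1>The content we need</h1>aaaaabbb<h2>The content we need2</h2>'

# <(h[12])> captures the tag name, (.*?) lazily captures the content,
# and </\1> requires the closing tag to match the opening one.
# re.DOTALL lets the content span line breaks.
HEADING_RE = re.compile(r'<(h[12])>(.*?)</\1>', re.DOTALL)


def extract_headings(html):
    """Return (tag, text) pairs for each h1/h2 element found in html."""
    return [(m.group(1), m.group(2)) for m in HEADING_RE.finditer(html)]


print(extract_headings(str1))
# [('h1', 'The content we need'), ('h2', 'The content we need2')]
```

This is fine for small, predictable strings, but it will likely break on attributes, nested tags, or malformed markup, which is where a real parser is the safer choice.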
