SkyBlue

Reputation: 293

Python - Looping through HTML Tags and using IF

I am using Python to extract data from a webpage. The webpage has a recurring HTML div tag with class="result", which contains other data (such as location, organisation, etc.). I can loop through the HTML successfully using Beautiful Soup, but when I add a condition, such as checking whether a certain word (e.g. 'NHS') exists in the segment, it doesn't return anything, even though I know certain segments contain it. This is the code:

soup = BeautifulSoup(content)
details = soup.findAll('div', {'class': 'result'})

for detail in details:
    if 'NHS' in detail:
        print detail

Hope my question makes sense...

Upvotes: 1

Views: 6036

Answers (1)

Sheena

Reputation: 16212

findAll returns a list of tags, not strings. Perhaps convert them to strings?

from BeautifulSoup import BeautifulSoup  # BeautifulSoup 3, as the type below shows

s = "<p>golly</p><p>NHS</p><p>foo</p>"
soup = BeautifulSoup(s)
details = soup.findAll('p')
type(details[0])    # gives: <class 'BeautifulSoup.Tag'>

You are looking for a string amongst tags. Better to look for a string amongst strings...

for detail in details:
    if 'NHS' in str(detail):
        print detail
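
One caveat with str(detail) is that it searches the markup as well as the text, so a class name or attribute value containing 'NHS' would also match. Below is a rough sketch of a text-only check. It assumes the newer bs4 package on Python 3 rather than the BeautifulSoup 3 used above, and the sample markup is made up for illustration. As far as I know, `in` on a Tag compares against the tag's direct children only, which would explain why the original test found nothing.

```python
from bs4 import BeautifulSoup  # assumption: BeautifulSoup 4 (the bs4 package)

# Hypothetical sample markup standing in for the real page.
html = """
<div class="result"><span>London</span><span>NHS Trust</span></div>
<div class="result"><span>Leeds</span><span>Acme Ltd</span></div>
"""

soup = BeautifulSoup(html, "html.parser")
details = soup.find_all("div", {"class": "result"})

# `in` on a Tag is tested against its direct children (here, two <span> tags),
# so a plain string nested deeper is never found this way.
direct_matches = [d for d in details if "NHS" in d]

# get_text() joins all nested strings, so this searches the visible text only.
text_matches = [d for d in details if "NHS" in d.get_text()]

for detail in text_matches:
    print(detail)
```

If matching on attributes is actually wanted, str(detail) as shown above is still the simpler option.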

Upvotes: 3
