HTML parsing with Beautiful Soup

Question

This is what my HTML looks like:



    
    
    string1
    
    string2
    
    string3
    
    
    
    
    
    
    


    
    
    string4
    
    string5
    
    string6

I want to extract the strings (string1 to string6) with Beautiful Soup.

Can anyone tell me how to do this?

Note: there are many other <div>s in the rest of the HTML and I don't need them all. I only want the strings between <table class="list04"> and </table>.

David Robinson · Accepted Answer

If that HTML is stored in the string html, use:

from BeautifulSoup import BeautifulSoup
soup = BeautifulSoup(html)
print [t.text for t in soup.find("table", {"class": "list04"}).findAll("div")]

which will print out:

[u'string1', u'string2', u'string3', u'string4', u'string5', u'string6']
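
The snippet above is Python 2 with BeautifulSoup 3. On Python 3, a rough equivalent with the bs4 package might look like the sketch below. The sample markup is an assumption, since the question's original HTML is not visible here; only the table class list04 and the div elements come from the answer itself.

```python
from bs4 import BeautifulSoup

# Hypothetical markup: the question's real HTML is not shown, so this sample
# only mirrors what the answer's selectors imply (a table with class
# "list04" whose cells hold the strings inside <div> elements).
html = """
<div>unrelated</div>
<table class="list04">
  <tr><td><div>string1</div><div>string2</div><div>string3</div></td></tr>
  <tr><td><div>string4</div><div>string5</div><div>string6</div></td></tr>
</table>
<div>also unrelated</div>
"""

soup = BeautifulSoup(html, "html.parser")

# Restrict the search to the one table, so divs elsewhere are ignored.
table = soup.find("table", class_="list04")

# Guard against a missing table instead of failing on None.
strings = [div.get_text(strip=True) for div in table.find_all("div")] if table else []

print(strings)
```

Searching inside the table first is what keeps the other divs on the page out of the result.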
