Reputation: 721

Python BeautifulSoup: 'list_iterator' object is not subscriptable

I'm trying to extract the text inside the span from the following HTML structure:

<div class="account-age">
    <label></label>
    <div>
        <div>
             <span>Text to extract</span>
        </div>
    </div>
</div>

I have the following Beautiful Soup code to do it:

from bs4 import BeautifulSoup as bs

soup = bs(html, "lxml")
div = soup.find("div", {"class": "account-age"})
span = div.children[1].children[0].children[0]
text = span.get_text()

Unfortunately, this raises the error: TypeError: 'list_iterator' object is not subscriptable. How can I fix this to extract the text I need?

Upvotes: 3

Answers (4)

Ishara Dayarathna

Reputation: 3601

Try using find to locate the span directly instead of indexing into children:

from bs4 import BeautifulSoup as bs
html ='''<div class="account-age">
    <label></label>
    <div>
        <div>
             <span>Text to extract</span>
        </div>
    </div>
</div>'''
soup = bs(html, 'html.parser')
div = soup.find("div", {"class": "account-age"})
span = div.find('span')
text = span.get_text()
print(text)

Result:

Text to extract

Upvotes: 0

neurite

Reputation: 2824

The children property is an iterator, not a list. As the error says, it is not subscriptable. To get a list, use contents instead:

div.contents[1].contents[0].contents[0]

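See the Beautiful Soup documentation for .contents and .children. Note that contents also includes the whitespace text nodes between tags, so the indices may need adjusting if the HTML is indented as in the question.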
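To make the difference concrete, here is a minimal sketch that reproduces the error and then indexes only the element children. It assumes the indented HTML from the question and uses the built-in html.parser; child_tags is a hypothetical helper name introduced for illustration, not part of Beautiful Soup.

```python
from bs4 import BeautifulSoup
from bs4.element import Tag

html = """<div class="account-age">
    <label></label>
    <div>
        <div>
             <span>Text to extract</span>
        </div>
    </div>
</div>"""

soup = BeautifulSoup(html, "html.parser")
div = soup.find("div", class_="account-age")

# .children is an iterator, so indexing it raises TypeError.
try:
    div.children[1]
    children_error = None
except TypeError as exc:
    children_error = exc

# .contents is a list, but it also holds the whitespace strings between tags.
raw_types = [type(node).__name__ for node in div.contents]


def child_tags(tag):
    """Hypothetical helper: return only the element children, skipping text nodes."""
    return [node for node in tag.children if isinstance(node, Tag)]


# Among element children, <label> is index 0 and the nested <div> is index 1.
inner_div = child_tags(div)[1]
span = child_tags(child_tags(inner_div)[0])[0]
text = span.get_text()
print(text)
```

Printing raw_types shows which positions in contents are strings rather than tags, which is usually the quickest way to see why a positional index lands on the wrong node.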

Upvotes: 1

Martin Evans

Reputation: 46759

First locate the div, and then access the span text using an attribute as follows:

from bs4 import BeautifulSoup as bs

html = """<div class="account-age">
    <label></label>
    <div>
        <div>
             <span>Text to extract</span>
        </div>
    </div>
</div>"""

soup = bs(html, "lxml")
div = soup.find('div', class_='account-age')
print(div.span.text)

This would display:

Text to extract

Upvotes: 0

akuiper

Reputation: 214987

You might do this by directly chaining the tags from the root div:

div.div.div.span.get_text()
# u'Text to extract'
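One thing to keep in mind with chained tag access is that each step returns None when no matching tag exists, so a missing level fails with an AttributeError on the next step. The sketch below, which assumes the same HTML as the question and the html.parser backend, shows the chained form next to a select_one call with a CSS child selector as one possible alternative that can be checked for None in a single place.

```python
from bs4 import BeautifulSoup

html = """<div class="account-age">
    <label></label>
    <div>
        <div>
             <span>Text to extract</span>
        </div>
    </div>
</div>"""

soup = BeautifulSoup(html, "html.parser")
div = soup.find("div", class_="account-age")

# Dotted access returns the first matching descendant tag at each step.
chained = div.div.div.span.get_text()

# Asking for a tag that is not present returns None rather than raising.
missing = div.table

# A CSS selector spells out the same path; select_one returns None on a miss.
span = soup.select_one("div.account-age > div > div > span")
selected = span.get_text() if span is not None else None

print(chained)
print(selected)
```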

Upvotes: 2
