Reputation: 5488
I am trying out the examples given in the BeautifulSoup documentation, and one of them is not giving the intended result.
html_doc = """
<html><head><title>The Dormouse's story</title></head>
<p class="title"><b>The Dormouse's story</b></p>
<p class="story">Once upon a time there were three little sisters; and their names were
<a href="http://example.com/elsie" class="sister" id="link1">Elsie</a>,
<a href="http://example.com/lacie" class="sister" id="link2">Lacie</a> and
<a href="http://example.com/tillie" class="sister" id="link3">Tillie</a>;
<p class="story">...</p>
"""
from bs4 import BeautifulSoup
soup = BeautifulSoup(html_doc)
The example in the documentation says:
soup.find_all('b')
# [<b>The Dormouse's story</b>]
but when I try the same command, I get the error below:
>>> soup.find_all('b')
Traceback (most recent call last):
File "<stdin>", line 1, in <module>
TypeError: 'NoneType' object is not callable
However, the soup object is not None:
>>> soup
<html><head><title>The Dormouse's story</title></head>
<p class="title"><b>The Dormouse's story</b></p>
<p class="story">Once upon a time there were three little sisters; and their
<a href="http://example.com/elsie" class="sister" id="link1">Elsie</a>,
<a href="http://example.com/lacie" class="sister" id="link2">Lacie</a> and
<a href="http://example.com/tillie" class="sister" id="link3">Tillie</a>;
<p class="story">...</p>
</html>
I am not sure why the given example is not working.
Upvotes: 1
Views: 276
Reputation: 1121206
You are using BeautifulSoup version three, not version four.
In BeautifulSoup 3, the method is called findAll(), not find_all(). Because an attribute that is not recognized is translated to soup.find('unrecognized_attribute'), you asked BeautifulSoup to find the first <find_all> HTML element. That element doesn't exist, so None is returned, and calling None raises the TypeError.
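The fallback described above can be imitated in a few lines. This is only a minimal sketch: TagSketch and its children mapping are hypothetical stand-ins, not BeautifulSoup's actual code, and they assume nothing beyond Python's standard __getattr__ hook.

```python
class TagSketch:
    """Hypothetical stand-in for a BeautifulSoup 3 tag (not the library's code)."""

    def __init__(self, children):
        # Maps a tag name to its first matching child
        self.children = children

    def find(self, name):
        # Return the first child with this tag name, or None if there is none
        return self.children.get(name)

    def __getattr__(self, name):
        # Python calls this only when normal attribute lookup fails,
        # so every unknown attribute turns into a tag search
        return self.find(name)


soup = TagSketch({"b": "<b>The Dormouse's story</b>"})

print(soup.b)         # found through the fallback
print(soup.find_all)  # no child named "find_all", so this is None

try:
    soup.find_all("b")  # equivalent to calling None("b")
except TypeError as exc:
    error_message = str(exc)
    print(error_message)
```

The object itself is never None; only the attribute lookup result is, which is why printing soup works while calling the misspelled method fails.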
Use BeautifulSoup 4 instead:
from bs4 import BeautifulSoup
whereas you almost certainly used this import instead:
from BeautifulSoup import BeautifulSoup # version 3
You'll need to install the beautifulsoup4 project.
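If it is unclear which API a given object exposes, one possible check is to look the method up on the object's class rather than on the instance, since a class-level lookup does not go through an instance's __getattr__ fallback. The helper defines_method and the two stand-in classes below are hypothetical and exist only for illustration; with a real soup object you would pass that object instead.

```python
def defines_method(obj, name):
    """Return True only if the class of obj really defines a callable `name`."""
    # Looking on type(obj) bypasses any instance-level __getattr__ fallback
    return callable(getattr(type(obj), name, None))


class OldStyleSoup:
    """Hypothetical stand-in with a version 3 style API."""

    def findAll(self, name):
        return []

    def __getattr__(self, name):
        # Unknown attributes silently resolve to None
        return None


class NewStyleSoup:
    """Hypothetical stand-in with a version 4 style API."""

    def find_all(self, name):
        return []


old_has_find_all = defines_method(OldStyleSoup(), "find_all")
old_has_findAll = defines_method(OldStyleSoup(), "findAll")
new_has_find_all = defines_method(NewStyleSoup(), "find_all")
print(old_has_find_all, old_has_findAll, new_has_find_all)
```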
Demo:
>>> html_doc = """
... <html><head><title>The Dormouse's story</title></head>
...
... <p class="title"><b>The Dormouse's story</b></p>
...
... <p class="story">Once upon a time there were three little sisters; and their names were
... <a href="http://example.com/elsie" class="sister" id="link1">Elsie</a>,
... <a href="http://example.com/lacie" class="sister" id="link2">Lacie</a> and
... <a href="http://example.com/tillie" class="sister" id="link3">Tillie</a>;
... <p class="story">...</p>
... """
>>> from bs4 import BeautifulSoup
>>> soup = BeautifulSoup(html_doc)
>>> soup.find_all('b')
[<b>The Dormouse's story</b>]
>>> from BeautifulSoup import BeautifulSoup
>>> soup = BeautifulSoup(html_doc)
>>> soup.find_all('b')
Traceback (most recent call last):
File "<stdin>", line 1, in <module>
TypeError: 'NoneType' object is not callable
Upvotes: 3