TJ1

Reputation: 8498

Error in using BeautifulSoup in Python: AttributeError: 'NoneType' object has no attribute 'find'

I am trying to use BeautifulSoup to get a list of Amazon best sellers. Here is the code I am trying to use:

from urllib2 import urlopen
from bs4 import BeautifulSoup
from HTMLParser import HTMLParser

def main():
    html_parser = HTMLParser()

    soup = BeautifulSoup(urlopen("http://www.amazon.com/gp/bestsellers/").read())

    categories = []

    # Scrape list of category names and urls
    for category_li in soup.find(attrs={'id':'zg_browseRoot'}).find('ul').findAll('li'):
        category = {}
        category['name'] = html_parser.unescape(category_li.a.string)
        category['url'] = category_li.a['href']

        categories.append(category)

    del soup

    # Loop through categories and print out each product's name, rank, and url.
    for category in categories:
        print category['name']
        print '-'*50

        soup = BeautifulSoup(urlopen(category['url']))

        i = 1
        for title_div in soup.findAll(attrs={'class':'zg_title'}):
            if i == 1:
                print "%d. %s\n    %s" % (i, html_parser.unescape(title_div.a.string), title_div.a['href'].strip())
            i += 1

        print ''

if __name__ == '__main__':
    main()

When I run the code I get this error:

for category_li in soup.find(attrs={'id':'zg_browseRoot'}).find('ul').findAll('li'):
AttributeError: 'NoneType' object has no attribute 'find'

Why am I getting this error, and how can I resolve it? Any help is appreciated.

Upvotes: 0

Views: 1726

Answers (1)

steph

Reputation: 565

Try calling .read() on the second urlopen() result too, as you already do for the first soup:

for category in categories:
    print category['name']
    print '-'*50

    soup = BeautifulSoup(urlopen(category['url']).read())
...

It gave me some pretty nice output.
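
As for the error itself: find() returns None when nothing matches, so the traceback means the page that was downloaded had no element with id 'zg_browseRoot', and the chained .find('ul') was then called on None. Below is a minimal sketch of guarding that chain. The helper name find_category_items and the sample markup are made up for illustration (they are not part of BeautifulSoup or of the real page), and the sample is parsed from a string so that nothing depends on a live request.

```python
from bs4 import BeautifulSoup


def find_category_items(soup, root_id="zg_browseRoot"):
    """Return the <li> tags under the element with id=root_id, or [] if absent.

    Hypothetical helper for illustration; not part of BeautifulSoup.
    """
    # find() returns None when nothing matches, so check before chaining.
    root = soup.find(attrs={"id": root_id})
    if root is None:
        return []
    ul = root.find("ul")
    if ul is None:
        return []
    return ul.find_all("li")


# Made-up sample markup, standing in for a downloaded page.
sample = '<div id="zg_browseRoot"><ul><li><a href="/books">Books</a></li></ul></div>'
items = find_category_items(BeautifulSoup(sample, "html.parser"))
names = [li.a.string for li in items]

# A page without the expected id gives an empty list instead of an exception.
missing = find_category_items(BeautifulSoup("<p>No list here</p>", "html.parser"))
```

If the helper returns an empty list for the real page, it is worth saving the downloaded HTML to a file and checking whether the id you are searching for is actually in it.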

Upvotes: 1
