Ernesto561

Reputation: 577

Parsing a table using Beautiful Soup

I am trying to parse a specific table from a web page with Beautiful Soup, but I keep running into problems. My code is the following:

# -*- coding: cp1252 -*-
import urllib2

from bs4 import BeautifulSoup

page = urllib2.urlopen("http://www.snet.gob.sv/googlemaps/workstation/main.php").read()
soup = BeautifulSoup(page)


data = []
table = soup.find("table", { "class" : "mytable" })
table_body = table.find('tbody')

rows = table_body.find_all('tr')
for row in rows:
    cols = row.find_all('td')
    cols = [ele.text.strip() for ele in cols]
    data.append([ele for ele in cols if ele]) # Get rid of empty values

print data

It works with other web pages, but not with this one. I get the following error:

table_body = table.find('tbody')
AttributeError: 'NoneType' object has no attribute 'find'

It seems that it does not find the "tbody" tag, but I have checked and it is in the page source. Another problem is that when it works (on other web pages), a "u" appears before every item in the table. I have searched a lot and cannot find the cause. Thanks for your help.

Upvotes: 0

Views: 996

Answers (1)

Anand S Kumar

Reputation: 91007

No, the problem is not the tbody tag. The error

AttributeError: 'NoneType' object has no attribute 'find'

indicates that table is None, which means that the call

soup.find("table", { "class" : "mytable" })

returned None, so the page does not have any table whose class attribute has the value mytable.

You cannot assume that the HTML of different web pages is the same (otherwise all web pages would look exactly alike).

I checked the URL, and there really is no table with that class; in fact, no table on that particular page has a class attribute at all. You need to decide which table you want and pass find() conditions that actually match it.
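As a rough sketch of the point above, the lookup can be guarded against None before chaining further calls. The HTML string here is a made-up stand-in for the real page (which I am not reproducing), and falling back to the first table is only an assumption for illustration; you would pick whichever condition identifies the table you actually want.

```python
from bs4 import BeautifulSoup

# Hypothetical stand-in for the downloaded page: one table, no class attribute.
html = """
<table>
  <tbody>
    <tr><td>Station</td><td>Rain</td></tr>
    <tr><td>A-15</td><td></td></tr>
  </tbody>
</table>
"""

soup = BeautifulSoup(html, "html.parser")

# find() returns None when nothing matches, so check before calling methods on it.
table = soup.find("table", {"class": "mytable"})
if table is None:
    # Fall back to another condition; here (as an assumption) the first table.
    tables = soup.find_all("table")
    table = tables[0] if tables else None

data = []
if table is not None:
    # Search for rows from the table itself, so a missing <tbody> does not matter.
    for row in table.find_all("tr"):
        cols = [td.get_text(strip=True) for td in row.find_all("td")]
        data.append([c for c in cols if c])  # drop empty cells

print(data)
```

Searching for tr directly under the table also avoids a second None error on pages whose markup has no explicit tbody.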

Upvotes: 1
