Reputation: 285
I'm learning web scraping with Python.
Here is my first Python script:
# encoding=utf8
import urllib2
from bs4 import BeautifulSoup
soup = BeautifulSoup(urllib2.urlopen("http://www.bcsfootball.org/").read(),"lxml")
for row in soup("table", {'class': "mod-data"})[0].tbody("tr"):
    tds = row('td')
    print tds[0].string, tds[1].string
I'm getting this error:
/usr/bin/python2.7 /home/NewYork/PycharmProjects/untitled/News.py
Traceback (most recent call last):
  File "/home/NewYork/PycharmProjects/untitled/News.py", line 8, in <module>
    for row in soup("table", {'class': "mod-data"})[0].tbody("tr"):
IndexError: list index out of range
Can anyone help me understand what I'm doing wrong?
One more thing: please help me understand exactly what is happening in this line:
for row in soup("table", {'class': "mod-data"})[0].tbody("tr"):
Thanks !! :)
Upvotes: 1
Views: 787
Reputation: 3516
This would give you the expected result:
import urllib2
from bs4 import BeautifulSoup
soup = BeautifulSoup(urllib2.urlopen("http://www.bcsfootball.org").read(),"html")
welcome = soup("div", {'class': "col-full"})[1] # we know it's index 1
for item in welcome:
    print item.string
Upvotes: 0
Reputation: 21
The error message means soup("table", {'class': "mod-data"}) returned an empty list, yet the code tries to take the first element of that list with [0].
Check that the page actually contains a table element with the class "mod-data".
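To illustrate the point above, here is a minimal sketch of the same loop with a guard against the empty result. It is only an illustration under a few assumptions: the inline HTML string is made up and stands in for the downloaded page, the helper name first_two_cells is hypothetical, and "html.parser" is used instead of "lxml" so the sketch has no extra dependency. Calling soup("table", {...}) is shorthand for soup.find_all("table", {...}), which returns a list of matches; [0] takes the first match, .tbody steps into its <tbody> child, and ("tr") collects the rows.

```python
from bs4 import BeautifulSoup

# Hypothetical markup standing in for the downloaded page.
html = """
<table class="mod-data">
  <tbody>
    <tr><td>1</td><td>Alabama</td></tr>
    <tr><td>2</td><td>Oregon</td></tr>
  </tbody>
</table>
"""

def first_two_cells(markup, css_class="mod-data"):
    """Return (first, second) cell text for each row of the first matching table."""
    soup = BeautifulSoup(markup, "html.parser")
    # Same lookup as soup("table", {'class': css_class}); the result is a list.
    tables = soup.find_all("table", {"class": css_class})
    if not tables:
        # Indexing [0] on an empty list is what raises IndexError in the question.
        return []
    table = tables[0]
    # Not every table has an explicit <tbody>; fall back to the table itself.
    body = table.tbody or table
    rows = []
    for row in body.find_all("tr"):
        tds = row.find_all("td")
        if len(tds) >= 2:  # skip header rows or rows with too few cells
            rows.append((tds[0].string, tds[1].string))
    return rows

print(first_two_cells(html))
print(first_two_cells("<p>no table here</p>"))
```

The second call returns an empty list instead of raising IndexError, which makes it easy to tell that the page simply has no table with that class.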
Upvotes: 1