Reputation: 153
I'm trying to scrape the temperature values from a table on www.intellicast.com:
soup = BeautifulSoup(urllib2.urlopen('http://www.intellicast.com/Local/History.aspx?location=USTX0057').read())
for row in soup('table', {'id': 'dailyClimate'})[0].tbody('tr'):
    tds = row
    print tds
The result: TypeError: 'NoneType' object is not callable
When I look at the page source, I can see:
<table id = "dailyClimate" class="Container">
<tbody>
<tr class="TitlesAvgRecord">
<td..
<td>...</td>
So I know there is a tbody as well as a tr element.
If I change .tbody('tr') to .tbody('td') I still get an error, so I'm assuming the error is somewhere in the call to tbody.
Upvotes: 3
Views: 1805
Reputation: 1121834
Your browser inserts a <tbody> element, but the actual source doesn't have that element:
<table id="dailyClimate" class="Container">
<tr class="TitlesAvgRecord">
<td style="padding-left:5px;">Date</td>
<td>Average<br />Low</td>
<td>Average<br />High</td>
<td>Record<br />Low</td>
<td>Record<br />High</td>
<td>Average<br />Precipitation</td>
<td>Average<br />Snow</td>
</tr>
<!-- etc. -->
See Why do browsers insert tbody element into table elements?
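To make the failure concrete, here is a minimal sketch that reproduces the error offline. It assumes Python 3 with the bs4 package and the built-in html.parser; the inline HTML is a shortened stand-in for the page source, not the real page.

```python
from bs4 import BeautifulSoup

# Shortened stand-in for the real page source (assumption): no <tbody> tag.
html = """
<table id="dailyClimate" class="Container">
<tr class="TitlesAvgRecord"><td>Date</td><td>Average Low</td></tr>
<tr><td>1</td><td>35</td></tr>
</table>
"""

soup = BeautifulSoup(html, "html.parser")
table = soup.find("table", id="dailyClimate")

# Attribute access such as table.tbody is shorthand for table.find('tbody'),
# so a tag that is absent from the source comes back as None.
tbody = table.tbody
tbody_is_missing = tbody is None

# Calling that None value is what raises the TypeError from the question.
try:
    table.tbody("tr")
    error_message = None
except TypeError as exc:
    error_message = str(exc)

print(tbody_is_missing, error_message)
```

Since html.parser does not add the element the way a browser does, the lookup returns None and the call on it fails.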
You could use the html5lib parser instead (using BeautifulSoup(source, 'html5lib')), which would also insert the element. However, you don't need to search for it; just go straight to the <tr> rows:
for row in soup.find('table', id='dailyClimate').find_all('tr'):
or using a CSS selector:
for row in soup.select('table#dailyClimate tr'):
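As a rough illustration, both lookups can be run against a small inline table and the cell text collected per row. This sketch assumes Python 3 with bs4 and html.parser; the HTML and the variable names (rows_find, rows_css, data) are made up for the example.

```python
from bs4 import BeautifulSoup

# Inline sample modelled on the source shown above (values are invented).
html = """
<table id="dailyClimate" class="Container">
<tr class="TitlesAvgRecord"><td>Date</td><td>Average<br />Low</td></tr>
<tr><td>1</td><td>35</td></tr>
<tr><td>2</td><td>36</td></tr>
</table>
"""

soup = BeautifulSoup(html, "html.parser")

# Approach 1: find the table by id, then collect every <tr> inside it.
rows_find = soup.find("table", id="dailyClimate").find_all("tr")

# Approach 2: the equivalent CSS selector.
rows_css = soup.select("table#dailyClimate tr")

# Pull the text out of each cell; the separator keeps the words on either
# side of a <br /> from running together.
data = [
    [td.get_text(" ", strip=True) for td in row.find_all("td")]
    for row in rows_find
]

for cells in data:
    print(cells)
```

Neither approach depends on a <tbody> being present, so both work on the raw source.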
You'd normally only select the tbody element if there were more than one, or if there was a thead or tfoot element you wanted to exclude.
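For the case where the source really does contain these elements, a sketch of restricting the selection to body rows could look like this. The table with id "stats" is hypothetical and is not taken from the page in the question; Python 3 with bs4 and html.parser is assumed.

```python
from bs4 import BeautifulSoup

# Hypothetical table whose source explicitly includes thead, tbody and tfoot.
html = """
<table id="stats">
<thead><tr><th>Date</th><th>Low</th></tr></thead>
<tbody>
<tr><td>1</td><td>35</td></tr>
<tr><td>2</td><td>36</td></tr>
</tbody>
<tfoot><tr><td>Mean</td><td>35.5</td></tr></tfoot>
</table>
"""

soup = BeautifulSoup(html, "html.parser")

# Every row, including the header and footer rows.
all_rows = soup.select("table#stats tr")

# Only the rows inside <tbody>, which skips thead and tfoot.
body_rows = soup.select("table#stats tbody tr")

print(len(all_rows), len(body_rows))
```

This only helps when the tbody tag is actually present in the parsed document, either because the source contains it or because a parser such as html5lib added it.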
Upvotes: 3