Reputation: 11
<table style="width:300px" border="1">
<tr>
<td>John</td>
<td>Doe</td>
<td>80</td>
</tr>
<tr>
<td>ABC</td>
<td>abcd</td>
<td>80</td>
</tr>
<tr>
<td>EFC</td>
<td>efc</td>
<td>80</td>
</tr>
</table>
I need to grab all the td elements in column 2 using Python. I am new to Python.
import urllib2
from bs4 import BeautifulSoup
url = "http://ccdsiu.byethost33.com/magento/adamo-13.html"
text = urllib2.urlopen(url).read()
soup = BeautifulSoup(text)
data = soup.findAll('div', attrs={'class': 'madhu'})
for div in data:
    trdata = div.findAll('tr')
    tddata = div.findAll('td')
    for trr in trdata:
        print trr
I am trying to get the data with the code above, but it prints all the td elements in the table. I would like to achieve this with XPath.
Upvotes: 0
Views: 1986
Reputation: 8786
It is not really clear what you want, since your example HTML is not relevant and the description "just the second-column tds"
isn't really helpful. Anyway, I modified Elmo's answer to give you the Importance title and then the actual importance level of each item.
for div in data:
    tddata = div.findAll('td')
    # Step through the cells 6 at a time (this assumes 6 cells per row)
    # and print the second cell of each group.
    for i in range(0, len(tddata) - 1, 6):
        print tddata[i + 1]
Upvotes: 0
Reputation: 54551
I don't think you can use XPath like you mentioned with BeautifulSoup. However, the lxml
module, a third-party package that has to be installed separately (e.g. with pip), can do it.
from lxml import etree
table = '''
<table style="width:300px" border="1">
<tr>
<td>John</td>
<td>Doe</td>
<td>80</td>
</tr>
<tr>
<td>ABC</td>
<td>abcd</td>
<td>80</td>
</tr>
<tr>
<td>EFC</td>
<td>efc</td>
<td>80</td>
</tr>
</table>
'''
parser = etree.HTMLParser()
tree = etree.fromstring(table, parser)
results = tree.xpath('//tr/td[position()=2]')
print 'Column 2\n========'
for r in results:
    print r.text
When run, this prints:
Column 2
========
Doe
abcd
efc
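The same query can be written a little more compactly: td[2] is XPath shorthand for td[position()=2], and appending /text() returns the cell strings directly instead of elements. A minimal sketch, assuming lxml is installed; the function name second_column is made up for illustration and the table is a condensed copy of the one in the question.

```python
from lxml import etree


def second_column(html):
    """Return the text of the second <td> in every <tr> of `html`."""
    tree = etree.fromstring(html, etree.HTMLParser())
    # td[2] is shorthand for td[position()=2]; XPath positions start at 1.
    return tree.xpath('//tr/td[2]/text()')


table = '''
<table style="width:300px" border="1">
<tr><td>John</td><td>Doe</td><td>80</td></tr>
<tr><td>ABC</td><td>abcd</td><td>80</td></tr>
<tr><td>EFC</td><td>efc</td><td>80</td></tr>
</table>
'''

column = second_column(table)
for value in column:
    print(value)
```

Rows with fewer than two cells simply contribute nothing to the result, so no explicit length check is needed with this approach.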
Upvotes: 1
Reputation: 11290
You don't have to iterate over your td elements. Use this:
for div in data:
    trdata = div.findAll('tr')
    tddata = div.findAll('td')
    if len(tddata) >= 2:
        print tddata[1]
Lists are indexed starting from 0. I check the length of the list to make sure that the second td exists.
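If the goal is the second cell of every row rather than only the second td inside the div, the same length check can be applied per row. A minimal sketch, assuming bs4 is installed and using the sample table from the question in place of the real page; the variable names are illustrative.

```python
from bs4 import BeautifulSoup

html = '''
<table style="width:300px" border="1">
<tr><td>John</td><td>Doe</td><td>80</td></tr>
<tr><td>ABC</td><td>abcd</td><td>80</td></tr>
<tr><td>EFC</td><td>efc</td><td>80</td></tr>
</table>
'''

# Naming the parser explicitly avoids depending on whichever one is installed.
soup = BeautifulSoup(html, 'html.parser')

second_cells = []
for row in soup.find_all('tr'):
    cells = row.find_all('td')
    # Index 1 is the second cell; skip rows that are too short.
    if len(cells) >= 2:
        second_cells.append(cells[1].get_text())

print(second_cells)
```

For the sample table this collects Doe, abcd and efc, one entry per row.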
Upvotes: 0