Reputation: 652
I want to get the column "Name of Member" from the website http://164.100.47.132/LssNew/Members/Alphabaticallist.aspx, so the program should return a list like "Adhalrao Patil,Shri Shivaji, ...", but I get an empty list. The XPath is verified in FirePath, so I don't know what's wrong. Here is my code:
import StringIO  # needed for StringIO.StringIO below (Python 2)
import urllib
from lxml import etree
result = urllib.urlopen("http://164.100.47.132/LssNew/Members/Alphabaticallist.aspx")
html = result.read()
parser = etree.HTMLParser()
tree = etree.parse(StringIO.StringIO(html), parser)
print type(tree)
xpath = ".//* [@id='ctl00_ContPlaceHolderMain_Alphabaticallist1_dg1']/tbody/tr[position()>1]/td[position()=3]/a/text()"
filtered_html = tree.xpath(xpath)
print filtered_html
and it returns:
[]
However, when I use another xpath:
.//*[@id='ctl00_ContPlaceHolderMain_Alphabaticallist1_dg1_ctl02_Hyperlink2']
I can get the value of the first column:
[Adhalrao Patil,Shri Shivaji]
Both XPath expressions are verified in FirePath. Why doesn't the first one work?
Upvotes: 1
Views: 339
Reputation: 36262
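I guess that <tbody> is not in the HTML that lxml reads. Browsers (and therefore FirePath) insert <tbody> into the DOM even when the page source omits it, while lxml parses the source as served. So try without it, like: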
xpath = ".//* [@id='ctl00_ContPlaceHolderMain_Alphabaticallist1_dg1']/tr[position()>1]/td[position()=3]/a/text()"
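To check this without hitting the site, here is a minimal sketch in Python 3 syntax. The HTML string, the id `dg1`, and the second member name are made-up stand-ins for the real page, assuming its source has no <tbody>. It compares the path with `tbody`, the path without it, and a `//tr` variant that matches either way:

```python
from lxml import etree

# Hypothetical stand-in for the page source: the table has no <tbody>,
# which is the assumption being tested here.
html = """
<html><body>
<table id="dg1">
  <tr><td>S.No.</td><td>Photo</td><td>Name of Member</td></tr>
  <tr><td>1</td><td></td><td><a href="#">Adhalrao Patil,Shri Shivaji</a></td></tr>
  <tr><td>2</td><td></td><td><a href="#">Example Member</a></td></tr>
</table>
</body></html>
"""

# Parse with the same HTMLParser class used in the question.
tree = etree.fromstring(html, etree.HTMLParser())

# Path copied from a browser tool: expects a <tbody> that lxml never adds.
with_tbody = tree.xpath("//*[@id='dg1']/tbody/tr[position()>1]/td[3]/a/text()")

# Same path with the tbody step removed.
without_tbody = tree.xpath("//*[@id='dg1']/tr[position()>1]/td[3]/a/text()")

# Descendant step: matches whether or not a <tbody> is present.
any_depth = tree.xpath("//*[@id='dg1']//tr[position()>1]/td[3]/a/text()")

print(with_tbody)
print(without_tbody)
print(any_depth)
```

If the real page behaves like this stand-in, the first query comes back empty and the other two return the names. The `//tr` form is the safer choice when you are not sure whether the source contains <tbody>.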
Upvotes: 2