Benabra

Reputation: 309

Python & lxml / xpath: Parsing XML

I need to get the value of FLVPath from this link: http://www.testpage.com/v2/videoConfigXmlCode.php?pg=video_29746_no_0_extsite

import requests
import lxml.html

sub_r = requests.get("http://www.testpage.co/v2/videoConfigXmlCode.php?pg=video_%s_no_0_extsite" % list[6])
sub_root = lxml.html.fromstring(sub_r.content)

for sub_data in sub_root.xpath('//PLAYER_SETTINGS[@Name="FLVPath"]/@Value'):
    print sub_data.text

But no data is returned.

Upvotes: 3

Views: 4898

Answers (3)

mata

Reputation: 69042

You're using lxml.html to parse the document, which causes lxml to lowercase all element and attribute names (case doesn't matter in HTML). That means you'll have to use:

sub_root.xpath('//player_settings[@name="FLVPath"]/@value')

Or, since you're parsing an XML file anyway, you could use lxml.etree, which preserves case.
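
To illustrate the lowercasing, here is a minimal sketch (Python 3 syntax) that parses the same bytes with both parsers. The SAMPLE document, its CONFIG root, and the example.com URL are assumptions, since the question does not show the actual response; only the PLAYER_SETTINGS element with its Name and Value attributes comes from the XPath in the question.

```python
import lxml.etree
import lxml.html

# Hypothetical stand-in for the server response (the real document is not shown).
SAMPLE = (
    b'<CONFIG>'
    b'<PLAYER_SETTINGS Name="FLVPath" Value="http://example.com/video.flv"/>'
    b'</CONFIG>'
)

html_root = lxml.html.fromstring(SAMPLE)   # HTML parser: names get lowercased
xml_root = lxml.etree.fromstring(SAMPLE)   # XML parser: names keep their case

# Original-case XPath against the HTML-parsed tree: matches nothing.
html_upper = html_root.xpath('//PLAYER_SETTINGS[@Name="FLVPath"]/@Value')

# Lowercased names against the HTML-parsed tree. Attribute values such as
# "FLVPath" are not lowercased, only element and attribute names are.
html_lower = html_root.xpath('//player_settings[@name="FLVPath"]/@value')

# Original-case XPath against the XML-parsed tree.
xml_upper = xml_root.xpath('//PLAYER_SETTINGS[@Name="FLVPath"]/@Value')

print(html_upper)
print(html_lower)
print(xml_upper)
```

The empty first result would explain why the loop in the question never runs and prints nothing.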

Upvotes: 4

Jakub Roztocil

Reputation: 16232

import requests
import lxml.etree

url = "http://www.testpage.com/v2/videoConfigXmlCode.php?pg=video_29746_no_0_extsite"
response = requests.get(url)

# Use `lxml.etree` rather than `lxml.html`,
# and unicode `response.text` instead of `response.content`
doc = lxml.etree.fromstring(response.text)

for path in doc.xpath('//PLAYER_SETTINGS[@Name="FLVPath"]/@Value'):
    print path
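
One caveat on passing response.text: as far as I know, lxml.etree.fromstring raises a ValueError for a unicode string that begins with an XML declaration naming an encoding, so the bytes in response.content may be the safer input. The sketch below (Python 3 syntax) avoids the network and uses a hypothetical `raw` value standing in for response.content; whether the real response carries such a declaration is an assumption.

```python
import lxml.etree

# Hypothetical stand-in for response.content, including an encoding declaration.
raw = (
    b'<?xml version="1.0" encoding="UTF-8"?>'
    b'<CONFIG>'
    b'<PLAYER_SETTINGS Name="FLVPath" Value="http://example.com/video.flv"/>'
    b'</CONFIG>'
)

# Bytes input: the parser reads the encoding from the declaration itself.
doc = lxml.etree.fromstring(raw)
paths = doc.xpath('//PLAYER_SETTINGS[@Name="FLVPath"]/@Value')

# Unicode input (what response.text would provide): record whether this
# lxml version accepts it instead of assuming either outcome.
text = raw.decode("utf-8")
try:
    lxml.etree.fromstring(text)
    text_accepted = True
except ValueError:
    text_accepted = False

print(paths)
print("unicode input accepted:", text_accepted)
```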

Upvotes: 0

Tuim

Reputation: 2511

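You could try the following, which works when the XPath selects the element itself (that is, without the trailing /@Value):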

print sub_data.attrib['Value']
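
As a sketch of the difference (Python 3 syntax, parsed with lxml.etree): an XPath ending in /@Value returns string results, which have neither .text nor .attrib, whereas an XPath that stops at the element returns elements whose attributes can be read with .attrib or .get. The SAMPLE document is hypothetical, since the real response is not shown in the question.

```python
import lxml.etree

# Hypothetical stand-in for the server response.
SAMPLE = (
    b'<CONFIG>'
    b'<PLAYER_SETTINGS Name="FLVPath" Value="http://example.com/video.flv"/>'
    b'</CONFIG>'
)
root = lxml.etree.fromstring(SAMPLE)

# Attribute XPath: each result is already the attribute value as a string.
values = root.xpath('//PLAYER_SETTINGS[@Name="FLVPath"]/@Value')

# Element XPath: each result is an element, so read the attribute from it.
elements = root.xpath('//PLAYER_SETTINGS[@Name="FLVPath"]')
from_attrib = [el.attrib['Value'] for el in elements]
from_get = [el.get('Value') for el in elements]

for value in values:
    print(value)  # print the string directly, not value.text
```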

Upvotes: 2
