Finding attribute value using lxml without using a for loop

Question

This is the code I have at the moment:

>>> p = []
>>> r = root.findall('.//*div/[@class="countdown closed"]/')
>>> r
''
>>> for i in r:
...     s = i.attrib
...     p.append(s['data-utime'])
...
>>> p
['1383624000']

s yields:

{'class': 'timestamp', 'data-utime': '1383624000'}

I think the code above is verbose (creating a list and using a for loop for only one string).

I know lxml can do this more succinctly, but I have not been able to work out how. I would appreciate any assistance.

Charles Duffy · Accepted Answer

Use XPath through lxml's xpath() method rather than ElementTree's findall() (which accepts only a more limited path language, kept for compatibility with the ElementTree library that lxml extends), and address your path all the way down to the attribute:

root.xpath('//html:div[@class="countdown closed"]/@data-utime',
  namespaces={'html': 'http://www.w3.org/1999/xhtml'})
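As a minimal, self-contained sketch of the call above: the XHTML fragment below is hypothetical, modelled on the attribute names in the question, and assumes data-utime sits directly on the matched div. An attribute step returns a list of strings; wrapping the path in XPath's string() function is one way to get a single string back without a loop or indexing.

```python
from lxml import etree

# Hypothetical XHTML fragment; the real document in the question is not shown.
doc = b"""<html xmlns="http://www.w3.org/1999/xhtml">
  <body>
    <div class="countdown closed" data-utime="1383624000">closed</div>
  </body>
</html>"""

root = etree.fromstring(doc)
ns = {'html': 'http://www.w3.org/1999/xhtml'}

# Ending the path in an attribute step yields a list of attribute values.
values = root.xpath('//html:div[@class="countdown closed"]/@data-utime',
                    namespaces=ns)
print(values)

# string() converts the node-set to the string value of its first node
# (or '' when nothing matches), so no loop or [0] is needed.
first = root.xpath('string(//html:div[@class="countdown closed"]/@data-utime)',
                   namespaces=ns)
print(first)
```

If the attribute lives on a child element instead, the path would need an extra step before /@data-utime.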

(It is possible to use namespace wildcards in XPath, but it is not great practice: not only does it leave one open to namespace collisions, it can also be a performance impediment if your engine indexes against fully-qualified attribute names).
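To illustrate the collision risk, here is a small sketch that assumes a hypothetical document containing a second div in an unrelated namespace (urn:example:other is made up for the example). XPath 1.0 has no direct wildcard syntax for the namespace part of a name, so the wildcard is written here with local-name(); that is one common spelling, not the only one.

```python
from lxml import etree

# Hypothetical document with two "div" elements in different namespaces.
doc = b"""<html xmlns="http://www.w3.org/1999/xhtml"
      xmlns:other="urn:example:other">
  <body>
    <div class="countdown closed" data-utime="1383624000">closed</div>
    <other:div class="countdown closed" data-utime="0">unrelated</other:div>
  </body>
</html>"""

root = etree.fromstring(doc)

# Namespace-agnostic match: picks up every element whose local name is "div".
wildcard = root.xpath(
    '//*[local-name()="div"][@class="countdown closed"]/@data-utime')

# Namespace-qualified match: only the XHTML div is selected.
qualified = root.xpath(
    '//html:div[@class="countdown closed"]/@data-utime',
    namespaces={'html': 'http://www.w3.org/1999/xhtml'})

print(wildcard)
print(qualified)
```

The wildcard query returns values from both elements, while the qualified query returns only the XHTML one.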
