user1561108

Reputation: 2747

How to parse sub elements in lxml via xpath

page = urlopen(req)
doc = parse(page).getroot()
table = doc.xpath('/html/body/div/div/div/table')[0]
table
<Element table ...>
doc.xpath('/html/body/div/div/div/table/tr')
<Element tr ...>...
table.xpath('/tr')
[]

Why doesn't table.xpath('/tr') return the same list of elements that doc.xpath('/html/body/div/div/div/table/tr') does?

Upvotes: 0

Views: 2475

Answers (1)

stranac

Reputation: 28266

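That's because an XPath expression starting with / is absolute: it always starts matching at the document root, no matter which element you call xpath() on. The only element directly under the root is html, so /tr matches nothing.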

To avoid this, either drop the leading slash, or be explicit and use . to refer to the current element.
Either of these should work:

table.xpath('tr')
# or
table.xpath('./tr')
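
Here is a small self-contained sketch you can run to see the difference. The SAMPLE markup is a made-up stand-in for the page in the question (same nesting, two rows), so treat it as an assumption rather than the real document:

```python
from lxml import html

# Hypothetical stand-in for the fetched page: same nesting as in the question.
SAMPLE = """
<html><body><div><div><div>
<table>
<tr><td>a</td></tr>
<tr><td>b</td></tr>
</table>
</div></div></div></body></html>
"""

doc = html.fromstring(SAMPLE)

# xpath() returns a list, so take the first match to get the table element.
table = doc.xpath('/html/body/div/div/div/table')[0]

absolute = table.xpath('/tr')    # absolute path: evaluated from the document root
relative = table.xpath('tr')     # relative path: children of the table
explicit = table.xpath('./tr')   # same as above, with the context node spelled out
from_root = doc.xpath('/html/body/div/div/div/table/tr')

print(len(absolute), len(relative), len(explicit), len(from_root))
```

With this sample, the absolute query comes back empty while the other three find both rows.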

Upvotes: 6
