noctonura

Reputation: 13121

Why isn't lxml finding the XPath given by Chrome inspector?

Here is my code:

from lxml import html
import requests

page = requests.get('https://en.wikipedia.org/wiki/Nabucco')
tree = html.fromstring(page.content)
title = tree.xpath('//*[@id="mw-content-text"]/table[1]/tbody/tr[1]/th/i')
print(title)

Problem: print(title) prints "[]", an empty list. I expected it to find the element containing "Nabucco". The XPath expression comes from the Chrome inspector's "Copy XPath" function.

Why isn't this working? Is there a disagreement between lxml and Chrome's XPath engine? Or am I missing something? I am somewhat new to Python, lxml, and XPath.

Upvotes: 5

Views: 972

Answers (1)

alecxe

Reputation: 473873

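That's because of the tbody tag. You see it in the browser because the browser inserts it while building the DOM; it is not in the page source. requests is not a browser: it just downloads the page source as is, so the tree lxml builds from it has no tbody element and the expression matches nothing.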

Replace:

//*[@id="mw-content-text"]/table[1]/tbody/tr[1]/th/i

with:

//*[@id="mw-content-text"]/table[1]/tr[1]/th/i
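
To see the difference without hitting the network, here is a minimal sketch that parses a small inline HTML string. The markup is a hypothetical, simplified stand-in for the raw Wikipedia source (not the actual page), written without a tbody tag the way the server sends it:

```python
from lxml import html

# Hypothetical stand-in for the raw page source: the table has no <tbody>,
# just as in the HTML that requests downloads.
RAW_SOURCE = """
<div id="mw-content-text">
  <table>
    <tr><th><i>Nabucco</i></th></tr>
    <tr><td>Opera by Giuseppe Verdi</td></tr>
  </table>
</div>
""".strip()

tree = html.fromstring(RAW_SOURCE)

# XPath copied from the browser: expects a tbody that lxml never adds.
with_tbody = tree.xpath('//*[@id="mw-content-text"]/table[1]/tbody/tr[1]/th/i')

# Same path with the tbody step removed: matches the raw source.
without_tbody = tree.xpath('//*[@id="mw-content-text"]/table[1]/tr[1]/th/i')

# xpath() returns a list of elements; read the text from each one.
titles = [el.text_content() for el in without_tbody]

print(with_tbody)
print(titles)
```

Note that xpath() returns element objects rather than strings, so getting "Nabucco" itself takes text_content() on the match (or a /text() step at the end of the expression).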

Upvotes: 8
