Matnagra

Reputation: 59

lxml returns empty list

I have seen many solutions on the internet, but none of them seems to work.

I have this code to retrieve information from a user's page on IMDb:

from lxml import html
import requests

page = requests.get('http://www.imdb.com/user/ur6447592/comments-expanded?start=0&order=alpha')
tree = html.fromstring(page.content)

result = tree.xpath('//*[@id="outerbody"]/tbody/tr/td/b[2]/text()')

print(result)

Instead of an empty list, the output should be:

["Little flesh and all bones"]

Upvotes: 1

Views: 655

Answers (1)

azalea

Reputation: 12610

Change the XPath argument to:

'//*[@id="outerbody"]/tr/td/b[2]/text()'
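A minimal sketch of the difference, assuming a made-up table that only imitates the structure implied by the XPath (the `SAMPLE` markup is hypothetical, not the actual IMDb page). lxml parses the markup as it was sent and does not add a `<tbody>` element, so the path with `tbody` matches nothing:

```python
from lxml import html

# Hypothetical stand-in for the downloaded page: a table WITHOUT <tbody>,
# which is how the raw HTML appears to be served.
SAMPLE = (
    '<html><body><table id="outerbody"><tr><td>'
    '<b>First</b><b>Little flesh and all bones</b>'
    '</td></tr></table></body></html>'
)

tree = html.fromstring(SAMPLE)

# XPath as copied from the browser (contains a tbody step).
with_tbody = tree.xpath('//*[@id="outerbody"]/tbody/tr/td/b[2]/text()')

# Same XPath with the tbody step removed.
without_tbody = tree.xpath('//*[@id="outerbody"]/tr/td/b[2]/text()')

print(with_tbody)
print(without_tbody)
```

Only the second query returns the text of the second `<b>` element.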

Edit:

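Thanks to the comments, I realized why the OP ran into this problem: browsers insert a <tbody> element into tables when they build the DOM, so an XPath copied from the browser's developer tools (as the one in the question appears to be) contains a tbody step. The raw HTML that requests downloads has no <tbody>, and lxml does not add one, so that XPath matches nothing.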

You can print page.content to see the original HTML. (via @JacobIRR)

Or, in Firefox, go to Tools > Web Developer > Page Source.

In Google Chrome Developer Tools, as quoted from @corn3lius:

If you use the network tab and look at the document returned it will give you the original state before whoever messes with the DOM.
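If you would rather not depend on whether the markup contains a `<tbody>`, one option is an XPath that uses the descendant axis (`//tr`) between the table and its rows. This is only a sketch under assumptions: `second_bold_texts`, `RAW` and `DOM` are hypothetical names, and the sample markup imitates the structure implied by the question rather than the real page.

```python
from lxml import html


def second_bold_texts(markup):
    """Return the text of the second <b> in each matching cell.

    The '//tr' step matches rows whether or not a <tbody> sits
    between the table and its rows.
    """
    tree = html.fromstring(markup)
    return tree.xpath('//*[@id="outerbody"]//tr/td/b[2]/text()')


# Hypothetical raw markup, as a server might send it (no <tbody>).
RAW = (
    '<html><body><table id="outerbody"><tr><td>'
    '<b>First</b><b>Little flesh and all bones</b>'
    '</td></tr></table></body></html>'
)

# The same table as a browser would show it in the DOM (with <tbody>).
DOM = RAW.replace('<tr>', '<tbody><tr>').replace('</tr>', '</tr></tbody>')

raw_result = second_bold_texts(RAW)
dom_result = second_bold_texts(DOM)
print(raw_result)
print(dom_result)
```

Note that `//tr` also matches rows of any table nested inside `outerbody`, so on a page with nested tables it may return more than the single-step path would.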

Upvotes: 3
