Python: why does the following XPath return an empty list?

Question

I am trying to extract some text and links from instapaper.com. So I am using the following code to get the job done:

>>> import lxml.html as lh
>>> doc = lh.parse("http://www.instapaper.com/u/folder/1227370/programming")
>>> text = doc.xpath(".//*[@id='bookmark_list']/*/div[3]/a/text()")
>>> len(text)
0
>>> text
[]

As you can see, it returns an empty list, which means it cannot find any text matching the XPath above.

However, when I use the same XPath expression in Firebug/FirePath, it works fine.

[Image: FirePath results for the XPath expression]

The image above shows 40 matching nodes.

So my question is: why does the above XPath expression not work with Python/lxml?

As requested: Instapaper page source

user647772 · Accepted Answer

The HTML that lxml downloads contains no element with the ID bookmark_list. You probably have to be logged in to get the bookmark list.

Edit

When parsing the real HTML, it works:

import lxml.html as lh

# Parse a saved copy of the page source instead of the live URL
doc = lh.parse("http://pastebin.com/raw.php?i=1WpFAfCt")
text = doc.xpath("//*[@id='bookmark_list']/*/div[3]/a/text()")
len(text) # => 40
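
To make the difference concrete, here is a minimal sketch that runs the same XPath against two inline HTML strings. Both strings are made up for illustration (they are not Instapaper's actual markup): one stands in for what an anonymous client might receive, the other for what a logged-in browser shows. The helper name extract_titles is also hypothetical.

```python
import lxml.html as lh

# Hypothetical stand-in for the response an anonymous client might get:
# no element with id="bookmark_list" anywhere in the document.
LOGGED_OUT_HTML = """
<html><body>
  <div id="login_form"><a href="/user/login">Sign in</a></div>
</body></html>
"""

# Hypothetical stand-in for the page a logged-in browser renders.
LOGGED_IN_HTML = """
<html><body>
  <div id="bookmark_list">
    <div class="cell">
      <div></div><div></div>
      <div><a href="http://example.com/a">First article</a></div>
    </div>
    <div class="cell">
      <div></div><div></div>
      <div><a href="http://example.com/b">Second article</a></div>
    </div>
  </div>
</body></html>
"""

XPATH = "//*[@id='bookmark_list']/*/div[3]/a/text()"


def extract_titles(html):
    """Return (container_found, titles) for the given HTML string."""
    doc = lh.fromstring(html)
    # Check for the container first, so an empty result can be told apart
    # from "the element the XPath starts at is missing".
    container_found = bool(doc.xpath("//*[@id='bookmark_list']"))
    titles = [str(t) for t in doc.xpath(XPATH)]
    return container_found, titles


logged_out = extract_titles(LOGGED_OUT_HTML)
logged_in = extract_titles(LOGGED_IN_HTML)
print(logged_out)
print(logged_in)
```

For the first string the container is not found and the result list is empty. For the second, the same expression returns both link texts. If the container check fails on the document lxml fetched, the problem is the downloaded HTML, not the XPath.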
