ritz301

Reputation: 351

Python lxml xpath not working

Below is a simple program to print the names of the professors listed at this link: http://cse.iitkgp.ac.in/index.php?secret=d2RkOUgybWlNZzJwQXdLc28wNzh6UT09

The XPath query //font[1]/b/a/b/text() returns the names when tested separately. However, this program prints an empty list. Any ideas what I'm doing wrong here?

import sys
import requests
import lxml.html

def getdata():
    v = lxml.html.document_fromstring(requests.get("http://cse.iitkgp.ac.in/index.php?secret=d2RkOUgybWlNZzJwQXdLc28wNzh6UT09").content)
    profs = v.xpath('//font[1]/b/a/b/text()')
    for prof in profs:
        print prof

if __name__=="__main__":
    getdata()

Upvotes: 2

Views: 270

Answers (1)

Andrea Corbellini

Reputation: 17751

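That page is rendered with AJAX: the list of elements you need is loaded by JavaScript after the initial HTML arrives. requests only downloads that initial HTML and does not execute JavaScript, so the XPath finds nothing there, even though it matches in a browser where the script has already run.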

This is the URL where the data is actually served:

http://cse.iitkgp.ac.in/faculty4.php?_=1451158710268

I found it by using the developer tools in Chromium, looking for XHR requests.
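
As a rough sketch of how the program could target that URL instead, something like the following might work. It is written for Python 3 (the question uses Python 2 print syntax). Several things here are assumptions rather than verified facts: that the endpoint also responds without the `_` parameter (which looks like a cache-busting timestamp), that the XPath from the question matches the markup the endpoint returns, and the helper names `extract_names` and `fetch_names`, which are made up for illustration.

```python
import requests
import lxml.html

# URL reported above. Assumption: the "_" query parameter is only a
# cache-busting timestamp, so the endpoint also responds without it.
DATA_URL = "http://cse.iitkgp.ac.in/faculty4.php"

# XPath from the question. Assumption (not verified): it also matches the
# markup served by DATA_URL; adjust it after inspecting the real response.
NAME_XPATH = "//font[1]/b/a/b/text()"


def extract_names(html, xpath=NAME_XPATH):
    """Parse HTML (str or bytes) and return the non-empty, stripped
    text nodes selected by the XPath expression."""
    tree = lxml.html.fromstring(html)
    return [text.strip() for text in tree.xpath(xpath) if text.strip()]


def fetch_names(url=DATA_URL):
    """Download the data URL and extract the names from its body."""
    response = requests.get(url, timeout=10)
    response.raise_for_status()  # fail loudly on HTTP errors
    return extract_names(response.content)


# Usage (performs a network request):
#     for name in fetch_names():
#         print(name)
```

Keeping the parsing in `extract_names` separate from the download makes it easy to try different XPath expressions against a saved copy of the response before touching the network again.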

Upvotes: 2
