Reputation: 351
Below is a simple program to print the names of the professors in the link : http://cse.iitkgp.ac.in/index.php?secret=d2RkOUgybWlNZzJwQXdLc28wNzh6UT09
The xpath query //font[1]/b/a/b/text()
gives the output names when tested seperately. However this program gives an empty list as output. Any ideas what I'm doing wrong here?
import sys
import requests
import lxml.html
def getdata():
v = lxml.html.document_fromstring(requests.get("http://cse.iitkgp.ac.in/index.php?secret=d2RkOUgybWlNZzJwQXdLc28wNzh6UT09").content)
profs = v.xpath('//font[1]/b/a/b/text()')
for prof in profs:
print prof
if __name__=="__main__":
getdata()
Upvotes: 2
Views: 270
Reputation: 17751
That page uses AJAX to render, i.e.: the list of elements you need is loaded via JavaScript.
This is the URL where the data is actually served:
http://cse.iitkgp.ac.in/faculty4.php?_=1451158710268
I found it by using the developer tools in Chromium, looking for XHR requests.
Upvotes: 2