Reputation: 13
I am using Python and the requests library for web scraping. The page I want loads its content after the initial response, so I would like requests.get() to wait before returning the result.
I have seen people with the same problem solve it with Selenium, but I don't want to use another API. I am wondering whether this is possible using only urllib, urllib2 or requests.
I tried adding time.sleep() to the get call, but it didn't work. It seems I need to find where the website gets its data before displaying it, but I can't find it.
import requests

def search():
    url = 'https://academic.microsoft.com/search?q=machine%20learning'
    mySession = requests.Session()
    response = mySession.get(url)
    # response.text holds only the HTML the server sent back
    myResponse = response.text
    return myResponse
The response is the HTML of the loading page (you can see it by following the link in the code), containing only the loading placeholders, but I need the actual search results.
Upvotes: 1
Views: 8554
Reputation: 1263
requests does not load elements that are loaded dynamically via Ajax requests. See this definition of Ajax from w3schools.com:
"Read data from a web server - after a web page has loaded"
All requests does is download the HTML content; it does not interpret the JavaScript in the page that issues the Ajax requests. So it does not load elements that a web browser (or Selenium) would normally load via Ajax.
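To make the point above concrete, here is a minimal sketch using only the standard library. The sample HTML is made up for illustration (it is not the real page), and `TextAndScriptCounter` and `summarize` are hypothetical helper names. It shows what a plain HTTP client sees in a JavaScript-rendered page: a placeholder and a script tag that nothing ever runs.

```python
from html.parser import HTMLParser


class TextAndScriptCounter(HTMLParser):
    """Counts <script> tags and collects visible text outside of them."""

    def __init__(self):
        super().__init__()
        self.script_count = 0
        self.visible_text = []
        self._in_script = False

    def handle_starttag(self, tag, attrs):
        if tag == "script":
            self.script_count += 1
            self._in_script = True

    def handle_endtag(self, tag):
        if tag == "script":
            self._in_script = False

    def handle_data(self, data):
        # Keep only non-empty text that is not JavaScript source
        if not self._in_script and data.strip():
            self.visible_text.append(data.strip())


def summarize(html):
    """Return (number of script tags, list of visible text fragments)."""
    parser = TextAndScriptCounter()
    parser.feed(html)
    parser.close()
    return parser.script_count, parser.visible_text


# Hypothetical stand-in for what the server returns before any JavaScript runs
SAMPLE_HTML = """<html><body>
<div id="app">Loading...</div>
<script src="/bundle.js"></script>
</body></html>"""

script_count, visible_text = summarize(SAMPLE_HTML)
print(script_count, visible_text)
```

Waiting longer would not change this output: the script is never executed, so the placeholder is all the text there is.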
Upvotes: 4
Reputation: 1937
This site makes additional requests and uses JavaScript to render the results. You cannot execute JavaScript with requests. That's why some people use Selenium.
https://academic.microsoft.com/search?q=machine%20learning is not meant to be used without a browser.
If you want data specifically from academic.microsoft.com, use their API.
import requests

url = 'https://academic.microsoft.com/api/search'
data = {
    "query": "machine learning",
    "queryExpression": "",
    "filters": [],
    "orderBy": None,
    "skip": 0,
    "sortAscending": True,
    "take": 10,
}
# json=data sends the dict as a JSON request body
r = requests.post(url=url, json=data)
result = r.json()
You will get the data as parsed JSON, which is easy to work with.
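If you need more than the first batch of results, a sketch like the following could build the request body for later pages. It assumes that "skip" is an offset and "take" is a page size, which the field names suggest but which is not confirmed here; `build_search_payload` is a hypothetical helper, not part of any library.

```python
import json


def build_search_payload(query, page=0, page_size=10):
    """Build the request body for one page of results.

    Assumption: "skip" is the number of results to skip and "take" is the
    number of results to return.
    """
    return {
        "query": query,
        "queryExpression": "",
        "filters": [],
        "orderBy": None,
        "skip": page * page_size,
        "sortAscending": True,
        "take": page_size,
    }


# Body for the third page (pages are counted from 0)
payload = build_search_payload("machine learning", page=2)
print(json.dumps(payload))
```

The resulting dict can be passed as json=payload to requests.post in the same way as the data dict above.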
Upvotes: 0