Reputation: 2168
Often, when making a GET request with the request module in Node.js, an outdated version of the website's HTML is returned.
For example, a very old version of Google is returned when making a request to http://google.com. On the other hand, accessing Google in a browser returns a much more modern version of the website.
I suspect that this is related to the device/browser information that sites like Google read from the request. As far as I know, request doesn't send any device information.
Is there any way to trick sites into thinking that they are being accessed by an actual device/browser (and a modern one too)?
Upvotes: 0
Views: 85
Reputation: 1578
As the question mentions, the request package does not send any device information by default. Big sites like Google use this information (the User-Agent header) to tailor aspects of the page, such as the HTML version and the CSS/JS features used. A newer user agent means the page can use more and newer features. To emulate a specific device (to debug a mobile page, for instance), pick the appropriate user-agent string at useragentstring.com.
Some other headers, like accept and accept-encoding, can also affect this (Doc here).
Try this code (taken from the docs):
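var request = require('request');

var options = {
    url: 'https://google.com',
    headers: {
        // Present the request as a desktop Chrome browser
        'User-Agent': 'Mozilla/5.0 (Windows NT 10.0; WOW64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/52.0.2743.116 Safari/537.36'
    }
};

function callback(error, response, body) {
    // Report network or protocol errors instead of printing an undefined body
    if (error) {
        console.error(error);
        return;
    }
    console.log(body);
}

request(options, callback);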
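Since the accept and accept-encoding headers can matter too, here is a minimal sketch of a helper that sets them alongside the user agent. The name buildBrowserLikeOptions and the exact Accept value are my own assumptions for illustration, not something from the request docs. The gzip option is a documented request option that adds an Accept-Encoding header and decodes the compressed response.

```javascript
// Hypothetical helper (not part of the request package): returns an options
// object with browser-like headers for the given URL and user-agent string.
function buildBrowserLikeOptions(url, userAgent) {
    return {
        url: url,
        // request option: sends Accept-Encoding and decompresses the response body
        gzip: true,
        headers: {
            'User-Agent': userAgent,
            // Assumed value, modelled on what desktop browsers typically send
            'Accept': 'text/html,application/xhtml+xml,application/xml;q=0.9,*/*;q=0.8'
        }
    };
}

// Usage with the request package (commented out so the sketch runs without it):
// var request = require('request');
// request(buildBrowserLikeOptions('https://google.com', 'Mozilla/5.0 ...'), callback);
```

Whether a given site actually changes its output based on these extra headers varies, so treat this as something to experiment with rather than a guaranteed fix.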
Upvotes: 1