JVG

Reputation: 21170

Getting current URL from NodeJS's REQUEST module

I'm using Node.js and the request module. I'm trying to scrape data from a web page, but my data comes from an API that only gives me link-tracking URLs.

For instance, this link:

http://www.kqzyfj.com/click-7227532-11292048?url=http%3A%2F%2Fwww.urbanoutfitters.com%2Furban%2Fcatalog%2Fproductdetail.jsp%3Fid%3D27074590

Actually leads here:

http://www.urbanoutfitters.com/urban/catalog/productdetail.jsp?id=27074590&cm_mmc=CJ-_-Affiliates-_-Threadfinder-_-11292048

I'm aware that most of the link is embedded in the original URL, but this isn't always the case, so please ignore it / don't post answers which suggest regex'ing my way out of this!

Using Request, how can I grab the page's URL (that is, the second link that the first redirects to) and store it as a variable?

Upvotes: 0

Views: 2331

Answers (2)

shawnzhu

Reputation: 7585

Check out line #77 of request.js. The request object attached to the response keeps an internal array named redirects:

var request = require('request');
var url = "http://www.kqzyfj.com/click-7227532-11292048?url=http%3A%2F%2Fwww.urbanoutfitters.com%2Furban%2Fcatalog%2Fproductdetail.jsp%3Fid%3D27074590";

request(url, function (error, response, body) {
  if (!error && response.statusCode == 200) {
    console.log("%j", response['request']['redirects'])
  }
})

This prints a JSON representation of the redirect history array, where each entry holds a status code and the URL redirected to. (I found 3 redirects for the URL you provided.)
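To store the final URL in a variable, you can take the last entry of that array. Here is a minimal sketch, assuming each entry has the shape { statusCode, redirectUri }. The helper name finalUrl and the sample data are made up for illustration (the example.com URLs are placeholders, not the real redirect chain):

```javascript
// Hypothetical helper (not part of the request module): returns the last
// redirect target, or the original URL when no redirect happened.
function finalUrl(originalUrl, redirects) {
  if (!Array.isArray(redirects) || redirects.length === 0) {
    return originalUrl;
  }
  return redirects[redirects.length - 1].redirectUri;
}

// Illustrative data only, shaped like the entries in response.request.redirects.
var sampleRedirects = [
  { statusCode: 302, redirectUri: "http://example.com/step-1" },
  { statusCode: 301, redirectUri: "http://example.com/final" }
];

var destUrl = finalUrl("http://example.com/start", sampleRedirects);
console.log(destUrl);
```

Inside the request callback you would call it as finalUrl(url, response.request.redirects). Falling back to the original URL covers links that do not redirect at all.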

Upvotes: 0

levi

Reputation: 25161

This should do it:

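var request = require('request');
var url = "http://www.kqzyfj.com/click-7227532-11292048?url=http%3A%2F%2Fwww.urbanoutfitters.com%2Furban%2Fcatalog%2Fproductdetail.jsp%3Fid%3D27074590";

// Use a regular function here, not an arrow function:
// inside the callback, `this` refers to the request object.
request(url, function (err, res, body) {
    if (err) { return console.error(err); }
    // get final redirect url
    if (this.redirects.length) {
        var destUrl = this.redirects[this.redirects.length - 1].redirectUri;
        console.log(destUrl);
    }
});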

Upvotes: 1
