John Jackson

Reputation: 45

Scraping URLs from a web page with Node.js

I'm trying to scrape all the URLs from a website and put them into an array. I have a question about array indexing: if I pass an index such as 2 into array[2], the command line prints "undefined". If I remove the index and print the whole thing, it prints all the URLs line by line. I want each URL to sit at its own index in the array.

Can anyone point me in the right direction? Thank you.

var request = require('request');
var cheerio = require('cheerio');

var url = 'http://www.hobo-web.co.uk/';

request(url, function(err, resp, body){
  $ = cheerio.load(body);
  links = $('a'); // use your CSS selector here
  $(links).each(function(i, link){
    var array = $(link).attr('href');
    console.log(array[2]);
  });
});

Upvotes: 2

Views: 1771

Answers (1)

CaribouCode

Reputation: 14398

You need to create the array as a variable outside the .each loop so it is accessible inside it, then push each href value onto it.
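As for the "undefined" you were seeing: in your code, `array` is not an array at all. `$(link).attr('href')` returns a string, so `array[2]` indexes a single character of that string, and for a short or empty value index 2 simply doesn't exist. A quick plain-JavaScript illustration (using your own URL as the example value):

```javascript
// .attr('href') returns a string, so indexing it yields one character
var href = 'http://www.hobo-web.co.uk/';
console.log(href[2]); // 't'

// indexing past the end of a (short or empty) string is what
// produces the "undefined" you saw on the command line
console.log(''[2]); // undefined
```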

var request = require('request');
var cheerio = require('cheerio');

var url = 'http://www.hobo-web.co.uk/';

var array = [];

request(url, function(err, resp, body){
  var $ = cheerio.load(body);
  var links = $('a');
  $(links).each(function(i, link){
    var href = $(link).attr('href');
    array.push(href); // each URL lands at its own index
  });
});
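The key point is scope: because the array is declared outside the callback, every iteration of the loop pushes onto the same array, and each URL ends up at its own index. A minimal sketch of that pattern with no network call (the link list here is a hypothetical stand-in for what the .each loop would visit):

```javascript
// declared once, outside the loop
var array = [];

// hypothetical stand-ins for the hrefs cheerio would find
var hrefs = ['http://example.com/a', 'http://example.com/b'];

hrefs.forEach(function (href) {
  array.push(href); // same outer array on every iteration
});

console.log(array[0]); // 'http://example.com/a'
console.log(array[1]); // 'http://example.com/b'
```

One caveat: request() is asynchronous, so the array is only populated once the callback fires. If you console.log(array) on the line right after the request() call, it will still be empty; do your logging (or further processing) inside the callback.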

Upvotes: 3
