Ahsan Jamal

Reputation: 256

Unable to receive proper data from the promise function

I am trying to scrape a Wikipedia page to fetch a list of airlines: first I scrape the main page, then I visit each airline's individual page to get its website URL. I have split the code into two functions. The first scrapes the main page and builds a new URL for each airline; the second scrapes the page at that URL to get the website name. I am using the request-promise module to fetch the HTML and cheerio to parse it.

export async function getAirlinesWebsites(req,res) {

  let response = await request(options_mainpage);
  console.log(`Data`);

  let $ = cheerio.load(response);
  console.log('Response got');

  $('tr').each((i,e)=>{
    let children = '';
    console.log('inside function ', i);
    if($(e).children('td').children('a').attr('class') !== 'new') {
      children = $(e).children('td').children('a').attr('href');

      let wiki_url = 'https://en.wikipedia.org' + children;
      console.log(`wiki_url = ${wiki_url}`);

      let airline_url = getAirlineUrl(wiki_url);
      console.log(`airline_url = ${airline_url}`);
    }
  });
}

The getAirlineUrl() function then parses the airline's own page at the given URL.

async function getAirlineUrl(url){

  const wiki_child_options = {
    url : url,
    headers : headers
  }

  let child_response = await request(wiki_child_options);
  let $ = cheerio.load(child_response);

  let answer = $('.infobox.vcard').children('tbody').children('tr').children('td').children('span.url').text();

  return answer;
}

However, when I log the returned value (airline_url) in the parent function, I get [object Promise] instead of a string. How do I resolve this issue?

Upvotes: 0

Views: 175

Answers (2)

loganfsmyth

Reputation: 161457

Since your getAirlineUrl function returns a promise, you need to await that promise. You can't use await inside the .each callback because the callback is not an async function, and even if it were, it still wouldn't work, because .each does not wait for the callback's promise. The best fix is to avoid .each and just use a loop.

export async function getAirlinesWebsites(req,res) {

  let response = await request(options_mainpage);
  console.log(`Data`);

  let $ = cheerio.load(response);
  console.log('Response got');

  for (const [i, e] of Array.from($('tr')).entries()) {
    let children = '';
    console.log('inside function ', i);
    if($(e).children('td').children('a').attr('class') !== 'new') {
      children = $(e).children('td').children('a').attr('href');


      let wiki_url = 'https://en.wikipedia.org' + children;
      console.log(`wiki_url = ${wiki_url}`);

      let airline_url = await getAirlineUrl(wiki_url);
      console.log(`airline_url = ${airline_url}`);
    }
  }
}
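
To see the difference without any network calls, here is a minimal sketch. fakeGetAirlineUrl is a hypothetical stand-in for getAirlineUrl (it just resolves after a short timer), and a plain array stands in for the cheerio selection, so treat it as an illustration of the behaviour rather than the original code.

```javascript
// Hypothetical stand-in for getAirlineUrl: resolves to a string after a short delay.
function fakeGetAirlineUrl(wikiUrl) {
  return new Promise((resolve) => {
    setTimeout(() => resolve(`site-for-${wikiUrl}`), 10);
  });
}

// Same shape as the question: the callback never awaits, so it only sees the Promise.
function withoutAwait(urls) {
  const seen = [];
  urls.forEach((url) => {
    const result = fakeGetAirlineUrl(url); // a pending Promise, not a string
    seen.push(String(result));             // stringifies to "[object Promise]"
  });
  return seen;
}

// Same shape as the fix: a for...of loop inside an async function,
// where each await pauses the loop until the value is available.
async function withLoop(urls) {
  const seen = [];
  for (const url of urls) {
    seen.push(await fakeGetAirlineUrl(url));
  }
  return seen;
}
```

Calling withoutAwait(['a', 'b']) gives two "[object Promise]" strings, while await withLoop(['a', 'b']) gives the resolved strings in order. Note that the loop runs the requests one after another, which is simple but slower than running them concurrently.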

Upvotes: 0

error404

Reputation: 331

An async function returns a promise. To get the resolved value, you need to either call then on it or use await. This should work if the rest of your code is OK.

export async function getAirlinesWebsites(req, res) {
  let response = await request(options_mainpage);
  console.log(`Data`);

  let $ = cheerio.load(response);
  console.log("Response got");

  $("tr").each(async (i, e) => {
   let children = "";
   console.log("inside function ", i);
   if ($(e).children("td").children("a").attr("class") !== "new") {
     children = $(e).children("td").children("a").attr("href");

     let wiki_url = "https://en.wikipedia.org" + children;
     console.log(`wiki_url = ${wiki_url}`);

     let airline_url = await getAirlineUrl(wiki_url);
     console.log(`airline_url = ${airline_url}`);
   }
 });
}
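
One caveat with an async callback: as far as I know, cheerio's .each (like Array.prototype.forEach) does not wait for the promises the callbacks return, so the outer function can finish before any airline URL has resolved. If the parent function needs the values, one option is to map the rows to promises and await Promise.all. The sketch below uses a plain array and a hypothetical fakeGetAirlineUrl in place of the real request, so it shows the pattern rather than the exact scraper.

```javascript
// Hypothetical stand-in for getAirlineUrl: resolves after a short delay.
function fakeGetAirlineUrl(wikiUrl) {
  return new Promise((resolve) => {
    setTimeout(() => resolve(wikiUrl.toUpperCase()), 10);
  });
}

// Mirrors the .each(async ...) shape: forEach ignores the returned promises,
// so nothing has been collected by the time this function returns.
async function countCollectedWithForEach(urls) {
  const results = [];
  urls.forEach(async (url) => {
    results.push(await fakeGetAirlineUrl(url));
  });
  return results.length; // still 0 at this point
}

// Starts every request, then waits for all of them.
// Promise.all keeps the results in the same order as the input.
async function collectWithPromiseAll(urls) {
  return Promise.all(urls.map((url) => fakeGetAirlineUrl(url)));
}
```

Here the requests run concurrently; for a long list of pages you may want to limit how many are in flight at once.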

Upvotes: 1
